Language Learning as Language Use: Statistically-based Chunking in Development
While usage-based approaches to language development enjoy considerable support from computational studies, there have been few attempts to answer a key computational challenge posed by usage-based theory: the successful modeling of language learning as language use. I present a usage-based computational model of language acquisition which learns in a purely incremental fashion, through on-line processing based on chunking, and which offers broad, cross-linguistic coverage while uniting comprehension and production processes within a single framework. The model's design reflects memory constraints imposed by the real-time nature of language processing, and is inspired by psycholinguistic evidence for children's sensitivity to the distributional properties of multi-word sequences and for shallow language comprehension based on local information. It learns from corpora of child-directed speech, chunking incoming words together to incrementally build an item-based "shallow parse." When the model encounters an utterance made by the target child, it attempts to generate an identical utterance using the same chunks and statistics involved during comprehension. In Chapter 2, I show that the model achieves high performance across more than 200 single-child corpora representing 29 languages from the CHILDES database. It also succeeds in capturing findings from children's production of complex sentence types. In Chapter 3, I show that the model captures key developmental psycholinguistic findings on children's language learning and use. Chapter 4 investigates the use of the model for understanding the different outcomes of child first-language learning versus second-language learning in adults, providing evidence that adult learners may rely on more fine-grained linguistic units.
Together, the modeling results presented in this dissertation suggest that much of children's early linguistic behavior may be accounted for by item-based learning through on-line processing of simple distributional cues, consistent with the notion that acquisition can be understood as learning to process language.
Psychology; chunking; computational modelling; corpora; language learning; psycholinguistics; statistical learning; Language; Cognitive psychology
Christiansen, Morten H.
Goldstein, Michael H.; Finlay, Barbara L.; Edelman, Shimon J.
Ph.D., Psychology
Doctor of Philosophy
Attribution 4.0 International
dissertation or thesis