
Corpus of Mandarin Child Language: a preliminary study on the acquisition of semantic content categories in Mandarin-speaking preschoolers

Tang, T. P-Y., Lau, D. K-Y., & Leung, M. T. (2023). Corpus of Mandarin Child Language: a preliminary study on the acquisition of semantic content categories in Mandarin-speaking preschoolers. Frontiers in Psychology, 14, Article 1234525. https://doi.org/10.3389/fpsyg.2023.1234525

 

Abstract
In the study of child language acquisition, a sizable body of research has focused on form and lexical semantics. This study aims to establish a child language database annotated both syntactically, with part of speech, and semantically, with semantic content category, to support the study of child language acquisition in the semantic domain beyond the lexical level. The Corpus of Mandarin Child Language (CMCL), which documents the production of different semantic content categories by Mandarin-speaking children, was established. Naturalistic language samples were obtained from 82 native Mandarin-speaking children aged 25–60 months, divided into three age groups. In addition to the part-of-speech annotations, the semantic content categories coded in each utterance were tagged according to previous studies. Mean length of utterance (MLU) and lexical diversity were examined, and the usage and acquisition of different semantic content categories were analyzed. The results for syntactic complexity and lexical diversity replicated the typical language acquisition pattern reported in previous studies, which supports the validity of the data in the CMCL. To investigate the trajectory of acquisition of the semantic content categories by age, a 90% acquisition criterion was used. The observed acquisition order of semantic content categories was broadly in line with previous studies, with some minor differences. This order is largely explained by the cognitive and syntactic complexity associated with each semantic content category, with additional influence from language-specific properties and culture-specific factors of Mandarin. In addition, with tags for both part of speech and semantic content category, the CMCL potentially provides a platform for examining the form-content interface in early child language acquisition, which has significant theoretical and clinical implications.
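The measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure, and it rests on several assumptions that the abstract does not confirm: utterances are taken to be already segmented into words, lexical diversity is approximated by a simple type-token ratio, and the 90% criterion is read as "a category counts as acquired in an age group when at least 90% of the children in that group produce it". The function names, the input layout, and the category labels in the usage note are all hypothetical.

```python
from collections import defaultdict


def mean_length_of_utterance(utterances):
    """Average number of tokens per utterance.

    Assumption: each utterance is a list of pre-segmented word tokens.
    """
    if not utterances:
        return 0.0
    return sum(len(u) for u in utterances) / len(utterances)


def type_token_ratio(utterances):
    """Distinct word types divided by total tokens (a simple stand-in
    for lexical diversity; the study may use a different measure)."""
    tokens = [t for u in utterances for t in u]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def acquired_categories(children, threshold=0.9):
    """Return, per age group, the categories produced by at least
    `threshold` of the children in that group.

    Assumption: `children` is a list of dicts with a 'group' label and a
    'categories' set holding the categories that child produced.
    """
    group_sizes = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for child in children:
        group_sizes[child["group"]] += 1
        for category in child["categories"]:
            counts[child["group"]][category] += 1
    return {
        group: {c for c, n in counts[group].items() if n / size >= threshold}
        for group, size in group_sizes.items()
    }
```

As a usage example with made-up data: for ten children in one group, where all ten produce a placeholder category "Existence", nine produce "Action", and five produce "Causality", `acquired_categories` returns "Existence" and "Action" for that group and leaves out "Causality".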

 


 

 
