|Alternative Title||The competition mechanism in word segmentation and recognition during Chinese reading|
|Place of Conferral||Beijing|
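|Keyword||Chinese reading; word segmentation and recognition; overlapping ambiguous strings; lexical competition; Chinese character position encoding|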
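This dissertation comprises two studies. Study 1 consists of three experiments that systematically examine the segmentation and recognition of overlapping ambiguous strings. An overlapping ambiguous string here is a three-character string such as "学生活", in which the middle character can form a word with either the character on its left or the character on its right (we call these the left-hand word and the right-hand word). Experiment 1 used a partial-report procedure to test whether there is a left-side processing advantage. Participants named the middle character of an overlapping ambiguous string; that character was a polyphone, as in "卫校订". We manipulated the frequencies of the left-hand and right-hand words so that one was a high-frequency word and the other a low-frequency word. Naming of the middle character was biased toward its pronunciation in the high-frequency word, regardless of whether that word was on the left or the right. Experiment 1 therefore rejected the left-side advantage hypothesis, because the right-hand word could also win the competition. In Experiment 2, paired three-character ambiguous strings were embedded in different sentence frames, and readers' eye movements were recorded with an eye tracker during sentence reading. The paired strings shared the same left-hand word, while the right-hand word was of high frequency in one string and of low frequency in the other. We found that the frequency of the right-hand word affected processing of the left-hand word. These data indicate that the two words in an ambiguous string are not processed independently but influence each other. Experiment 3 again used a sentence-reading task. It manipulated the frequencies of the left-hand and right-hand words and also controlled the sentence context, so that the same overlapping ambiguous string was segmented in one of two ways depending on the context: AB-C or A-BC. In the early stage, lexical competition relied on local word-frequency cues, and higher-frequency words were more likely to win. When the frequency-based segmentation was inconsistent with the sentence context, second-pass reading time on the ambiguous string and the number of regressions were both significantly higher than in the consistent condition. The experiments of Study 1 support the competition hypothesis: all words that can be formed from the characters within the perceptual span are activated and compete with one another, and the word with the highest activation wins and is recognized and segmented.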
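Study 2 generalized the competition mechanism of ambiguous-string segmentation and asked whether cross-character words are activated. Words formed from noncontiguous characters are common in Chinese reading. For example, in the company name "北大方正", the first and third characters can form the word "北方". Study 2 examined whether such words can be recognized during reading. It comprised two experiments. Experiment 4 used a character-identification task to test whether cross-character words can be recognized. In a four-character string ABCD, AB and CD were both two-character words. In one condition A and C could form a word, as in "素食质点"; in the other they could not, as in "素食助教". Readers were significantly more likely to report characters A and C in the AC-word condition than in the AC-nonword condition. This experiment showed that word recognition can cross word boundaries: cross-character words can be activated and compete with the left-hand word. In Experiment 5, we examined the processing of cross-character words in sentence reading. The four-character strings from the cross-character word condition and the control condition were embedded in the same sentence frames. Fixation times in the corresponding region were significantly longer in the cross-character word condition. Study 2 extends the findings of Study 1, showing that cross-character words are activated and take part in the competition that underlies word segmentation and recognition.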
|Other Abstract||Unlike most alphabetic writing systems, written Chinese has no spaces between words. However, previous studies have shown that words have psychological reality in Chinese reading. It is therefore important to investigate how Chinese readers group contiguous characters into separate words. In this dissertation, we explored the mechanism of word segmentation using creative paradigms. |
This thesis includes two main studies. In the first study, we explored how Chinese readers segment a 3-character overlapping ambiguous string in which the middle character can form a word with either the first or the third character. The first study contained 3 experiments. In Experiment 1, subjects named the middle character, which was a polyphone. They tended to pronounce it as if it belonged to the higher-frequency word, regardless of that word's position (left or right). These results were inconsistent with the left-priority hypothesis, which supposes that only the left-hand word wins the competition. In Experiment 2, we embedded two sets of overlapping ambiguous strings with identical left-hand words (AB) but different right-hand words (BC or BD) in the same sentence frames. Fixation times on AB were longer when the right-hand word was of higher frequency. These results were not consistent with an independent-processing hypothesis, which proposes that the two words do not influence each other. In Experiment 3, each 3-character string was embedded in two sentences that differed only after the critical string, and the sentence context constrained the string to be segmented as either AB-C or A-BC. The frequencies of the two words in the string were also manipulated, so that frequency favored AB-C when AB was more frequent than BC and A-BC when BC was more frequent than AB. Second-pass reading time was shorter and regression-in probability was lower in the ambiguous region when the frequency-based segmentation fit the sentence context than when it did not. All these results support a competition mechanism in which the characters in the perceptual span activate all of the words they can constitute, and any word (left-hand or right-hand) can win the competition if its activation is high enough.
In the second study, we extended the competition mechanism to another situation in Chinese reading. The specific question was whether readers could recognize a word composed of noncontiguous characters (a cross-character word). In Experiment 4, participants were asked to report as many characters as possible after briefly viewing four Chinese characters ABCD, where both AB and CD were 2-character words. In the cross-character word condition, AC was a word; in the control condition, it was not. Readers were more likely to report the combination of characters A and C in the cross-character word condition than in the control condition. In Experiment 5, we embedded the two kinds of 4-character strings in the same sentence frames to explore whether cross-character words could be recognized in sentence reading. Fixation times in the corresponding region were longer in the cross-character word condition than in the control condition. These results suggest that the cross-character word is activated and participates in the word competition process during Chinese reading.
Finally, we discussed the relationship between word segmentation and word recognition. We also raised some critical questions about previous models of word segmentation and recognition in Chinese reading and, based on the present study, suggested potential ways to improve them.
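|马国杰 (Ma Guojie). 中文阅读中词切分与识别的竞争机制 (The competition mechanism in word segmentation and recognition during Chinese reading) [D]. Beijing: 中国科学院研究生院 (Graduate University of Chinese Academy of Sciences), 2015.|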
|Files in This Item:|
|马国杰-博士学位论文.pdf (2226KB)||Dissertation||Restricted access||CC BY-NC-SA||View Application Full Text|