手持通信设备的多通道文本输入研究
Alternative Title: Research on multimodal text entry for handheld mobile devices
薛立成
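Subtype: Master's
Thesis Advisor: 孙向红
2009-05-27
Degree Grantor: Institute of Psychology, Chinese Academy of Sciences
Place of Conferral: Institute of Psychology
Keywords: multimodal text entry; subjective accuracy; WOz method; error recovery; modality shift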
Abstract: Handheld communication devices are developing rapidly, with a growing user base and an expanding range of applications, and they have a promising future. This study investigated the acceptance of multimodal text entry on such devices and the behavioral characteristics of its use. Building on the general information processing model of a bimodal system and on human factors research on multimodal map systems, it focused on the widely used combination of speech and pen, that is, hand-speech bimodal text entry.
For acceptance, the study examined the subjective perception of speech recognition accuracy (subjective accuracy) using a Wizard of Oz (WOz) experiment combined with a post-experiment questionnaire. Results showed a linear relationship between speech recognition accuracy and subjective accuracy. As familiarity with the system increased, the difference between the acceptable accuracy and the subjective accuracy gradually decreased. In addition, the similarity in meaning between the recognition output and the correct sentence was an important criterion for judging recognition accuracy.
The second study examined three aspects of bimodal text entry: input, error recovery, and modality shifts. The first experiment examined how users behave when recovering from errors. Participants preferred to correct errors by handwriting, and this preference was largely independent of the modality used for input. The second experiment examined how users enter text across a comprehensive set of text types. Participants preferred speech input for both words and sentences, a preference that was highly consistent across individuals, while no significant difference was found between handwriting and speech input for single characters. For mixed text, especially Chinese-English mixed text, participants used the direct strategy more often than the jumping strategy, again with high consistency across individuals. The third experiment recorded the time required for different types of modality shift in order to probe differences in cognitive load. The shift types differed significantly, and the shift from speech input to hand input required relatively little time.
Based on the main findings, the following implications were proposed. First, when evaluating a speech recognition system, it should be noted that speech recognition accuracy does not correspond exactly to subjective accuracy, and that there is an interval in which users are insensitive to changes in accuracy. Second, to make a speech input system more acceptable, users can be trained on speech input and given feedback on recognition accuracy, which increases their familiarity with the system and the correctness of their accuracy perception. Third, error recovery functions should take both universal and individual behavioral patterns into account and allow personalization. Fourth, the operations of speech input should be reduced and simplified to make it easier to learn and use. Fifth, a faster input method that supports non-Chinese text should be provided. Sixth, the shifting time between hand input and speech input provides an important reference for setting the timing of automatically activated speech recognition.
Pages: 48
Language: Chinese
Document Type: Thesis
Identifier: http://ir.psych.ac.cn/handle/311026/4758
Collection: Retrospective Database of the Institute of Psychology, Chinese Academy of Sciences (1956-2010)
Recommended Citation
GB/T 7714
薛立成. 手持通信设备的多通道文本输入研究[D]. 心理研究所. 中国科学院心理研究所,2009.
Files in This Item:
File Name/Size: 10001_20062801250301 (1619KB)
Access: Restricted
File name: 10001_200628012503016薛立成_paper.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.