Research on Multimodal Text Entry for Handheld Communication Devices (手持通信设备的多通道文本输入研究)

	Research on Multimodal Text Entry for Handheld Communication Devices (手持通信设备的多通道文本输入研究)
Alternative title	Research on multimodal text entry for handheld mobile devices
	Xue Licheng (薛立成)
	2009-05-27
Abstract	Handheld communication devices are developing rapidly, with a growing user base and an expanding range of applications, and their prospects are promising. This study examined the acceptance of multimodal text entry on such devices and the behavioral characteristics of users when using it. Drawing on the general information-processing model of a bimodal system and on human factors research on multimodal map interaction systems, it focused on the widely used combination of speech and pen (handwriting) input. In the acceptance study, a Wizard of Oz (WOz) experiment combined with a post-experiment questionnaire was used to examine the subjective perception of speech recognition accuracy (subjective accuracy). The results showed a linear relationship between speech recognition accuracy and subjective accuracy. As familiarity with the system increased, the difference between the acceptable recognition accuracy and the subjective accuracy gradually decreased. In addition, the semantic similarity between the recognition output and the correct sentence was an important criterion for judging recognition accuracy.
The second study examined three aspects of bimodal text entry: input, error recovery, and modality shifts. The first experiment examined user behavior during error recovery. Participants preferred to correct errors by handwriting, and the original input modality had little influence on the choice of correction modality. The second experiment examined text entry behavior across a comprehensive set of text types. Participants preferred speech input for both words and sentences, with high consistency across individuals, whereas no significant difference between handwriting and speech input was found for single characters. For mixed text, especially Chinese-English mixed text, participants used the direct strategy more often than the jumping strategy, again with high consistency across individuals. The third experiment recorded the time required for different types of modality shift to probe differences in cognitive demand. The shift types differed significantly, and the shift from speech input to handwriting required relatively little time, indicating lower cognitive demand.
Based on these findings, six recommendations are made. First, when evaluating a speech recognition system, note that speech recognition accuracy does not correspond exactly to subjective accuracy; there is an interval of insensitivity. Second, training users in speech input and giving them feedback on recognition accuracy increases their familiarity with the system and makes their perception of recognition accuracy more accurate, which improves acceptance. Third, error recovery functions should be personalized, taking both general patterns and individual differences into account. Fourth, the operations required for speech input should be reduced and simplified so that speech input is easier to learn and use. Fifth, fast input methods that support non-Chinese text should be provided. Sixth, the shift time between handwriting and speech input provides an important reference for setting the timing of automatic activation of a speech recognition system.
Keywords	multimodal text entry; subjective accuracy; WOz method; error recovery; modality shift
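Degree type	Master's
Language	Chinese
Degree-granting institution	Institute of Psychology, Chinese Academy of Sciences
Place of degree conferral	Institute of Psychology
Document type	Thesis
Item identifier	https://ir.psych.ac.cn/handle/311026/4758
Collection	Institute of Psychology, Chinese Academy of Sciences Retrospective Database (1956-2010)
Recommended citation (GB/T 7714)	Xue Licheng. Research on Multimodal Text Entry for Handheld Communication Devices (手持通信设备的多通道文本输入研究) [D]. Institute of Psychology. Institute of Psychology, Chinese Academy of Sciences, 2009.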

Files in this item
File name/size	Document type	Version type	Access type	License
10001_20062801250301 (1619 KB)			Open access	--