汉语语篇文本-语音对接语料库的构建及相关研究
Alternative Title: Constructure and Study of Corpus on Text-Speech in Chinese Discourse
陈玉东
2009-05
Publication Place: Beijing
Subtype: Final report
Contribution Rank: 1
Abstract

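The work reported here is an important foundation and a major outcome of the National Natural Science Foundation of China project "Sentence Focus and Focus-Accent Projection in Chinese Discourse". The report describes how the corpus construction was planned and carried out, and presents preliminary research based on the corpus.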

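Corpus construction began with drawing up an implementation plan, followed by the groundwork of collecting, recording, and organizing the material. The annotation scheme comprises three main parts: text annotation, speech annotation, and discourse rhetorical structure. Systematic training and trial annotation, repeated discussion, further unification of standards, and revision of the annotation rules ensured the quality of corpus processing. Expert guidance and continual feedback from annotators during annotation both played an important role in the internal consistency of the processing and in keeping standards uniform across annotators. The problem of linking annotation information across different levels was solved, and a corpus platform was set up that is convenient for import, retrieval, and statistics and can be extended without limit. The result is a text-speech aligned corpus of Chinese discourse, of initial scale, in which the three components (speech, text, and discourse structure) support one another.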

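The preliminary corpus-based research has two parts: a case study of discourse rhetorical structure and prosodic expression, and a study of accent features in read speech across different genres. The report also summarizes the problems in the work so far, analyzes the favorable conditions and practical difficulties for further work, and looks ahead to the applications of the corpus.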

Other Abstract

This work is an important foundation and outcome of the research project Sentence Focus and Focus-Accent Projection in Chinese Discourse, funded by the National Natural Science Foundation of China. The report describes the planning and implementation of the corpus construction and the preliminary studies based on the corpus.

An implementation plan was drawn up first, and the collection, recording, and organization of the data proceeded in accordance with it. The annotation system of the corpus consists of three parts: text, speech, and the rhetorical structure of discourse. Systematic training, trial annotation, discussion, and consolidated criteria ensured the quality of data processing. Expert guidance and annotators' feedback played important roles in internal consistency and in the uniformity of criteria. The different layers of annotation information were linked and integrated into the corpus, which supports data import, querying, statistics, and unlimited expansion. The result is a corpus whose three components (text, speech, and discourse structure) support one another.

Two preliminary studies based on the corpus are reported. One is a case study of rhetorical structure and prosodic expression in Chinese discourse; the other investigates the characteristics of accent in read Chinese discourse of different styles. The favorable conditions and practical difficulties for further work are analyzed, and the application prospects of the corpus are discussed.

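Keyword: discourse structure, focus, accent, speech, text
Pages: 99
Language: Chinese
Document Type: Scientific and technical report
Identifier: http://ir.psych.ac.cn/handle/311026/29123
Collection: Cognitive and Developmental Psychology Research Laboratory (认知与发展心理学研究室)
Affiliation: Institute of Psychology, Chinese Academy of Sciences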
Recommended Citation
GB/T 7714
陈玉东. 汉语语篇文本-语音对接语料库的构建及相关研究[R]. 北京,2009.
Files in This Item:
File Name/Size: 陈玉东-博士后研究报告.pdf (7762KB)
DocType: Scientific and technical report
Access: Restricted
License: CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.