The Objectiveness of Writing Assessments: A Comparison between the Multistage Rating Augmentation Method and the Holistic Rating Method
Original title: 作文评分的客观性:分步增值评分模式和整体双评评分模式的比较研究
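Liu Sijia (刘斯佳)
Degree type: Master's degree (equivalent academic qualification)
2015-07
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Beijing
Degree discipline: Psychology
Keywords: writing test; holistic rating; multistage rating augmentation; multifaceted Rasch model; rating efficiency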
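Abstract: Essay scoring is favored because it is thought to reflect real ability better, yet its scoring quality and objectivity remain in question. Scoring quality has two aspects: first, the degree to which scores correspond to ability, as quantified by statistical indices; second, the practical significance of scores being validly interpreted and used. This study is the first to adopt the multistage rating augmentation method introduced by Wang et al. (2012) alongside the traditional holistic double-rating method. Using traditional measurement models and the multifaceted Rasch model, it examines how 500 scripts from a national-level writing examination were scored under each method. A further subset of scripts was scored by experts to examine the rating error and rating efficiency of the two methods. Study 1 found that inconsistent ratings are very common in essay scoring under both methods; compared with the traditional double-rating method, the multistage rating augmentation method yields a more reasonable score distribution and better rating consistency. Study 2 found that the multistage rating augmentation method has lower rating error than the traditional double-rating method; moreover, the error of the first rating alone under the multistage method is smaller than that of the traditional double-rating result. Study 3 found that the multistage rating augmentation method yields a more reasonable distribution of probability curves than the traditional double-rating method. Study 4, using the multifaceted Rasch model, found that individual raters performed better in terms of rating bias and score discrimination under the multistage rating augmentation method than under the traditional double-rating method. Study 5 found that the multistage rating augmentation method achieves better rating efficiency than the traditional double-rating method while keeping rating error under control. Finally, Study 6 found that raters of different genders and education levels performed better in terms of rating bias and score discrimination under the multistage rating augmentation method than under the traditional double-rating method. The results indicate that the multistage rating augmentation method does improve a range of statistical indices of the scores, and that it is also superior in the practical sense of scores being validly interpreted and used. The multistage rating augmentation method is therefore constructive for both aspects of scoring quality. Future research could examine the scoring quality shown by different types of raters, and quantify the rating dimensions in order to simplify and clarify the scoring criteria for subjective test items.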
Alternative abstract: Writing assessment is regarded as better representing candidates' real-life abilities and is therefore widely favored, but its objectiveness remains questionable. Assessment quality has two aspects: how well test scores represent an individual's targeted ability, and how the scores are interpreted and used in practice. This research is the first to apply the multistage rating augmentation method of writing assessment introduced by Wang et al. (2012) alongside the traditional holistic rating method, and assesses the soundness of each in scoring 500 papers from a national writing assessment, using both traditional psychometric methods and the multifaceted Rasch model. A proportion of the papers was also rated by a panel of experts to assess rating error and rating efficiency. Study 1 suggests that, under both the multistage rating augmentation method and the holistic rating method, two raters often did not assign the same score level; however, the multistage rating augmentation method produced a more reasonable score distribution and better consistency. Study 2 suggests that the multistage rating augmentation method shows smaller rating errors than the holistic rating method, even when only one set of the multistage rating scores is included. Study 3 suggests that the multistage rating augmentation method demonstrates a better category probability curve distribution than the traditional holistic rating method. Study 4 found better rater performance in terms of misfit and discrimination under the multistage rating augmentation method than under the holistic rating method. Study 5 suggests that the multistage rating augmentation method achieves better rating efficiency than the traditional holistic rating method, maintaining smaller rating errors with a single rater. Lastly, Study 6 showed that raters of different genders and education levels performed better in terms of misfit and discrimination under the multistage rating augmentation method than under the holistic rating method. The results show that the multistage rating augmentation method improves how well test scores represent targeted abilities in a statistical sense, and that it is also superior in terms of the interpretation and use of scores in a practical sense. Thus, the multistage rating augmentation method makes a constructive contribution to both aspects of the quality of subjective assessment. Future studies could investigate the rating quality of different categories of raters, and quantify the rating dimensions of writing assessment to simplify and clarify the scoring criteria for subjective test items.
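For readers unfamiliar with the multifaceted Rasch model named in the abstract, its commonly used rating-scale form for an examinee-by-rater design can be written as below. This is a general textbook sketch, not the thesis's own specification, which the abstract does not give; the symbols $B_n$, $C_j$, and $F_k$ are notation assumed here for illustration.

```latex
% Many-facet Rasch model, rating-scale form (assumed notation):
%   P_{njk}     probability that examinee n receives category k from rater j
%   P_{nj(k-1)} probability that examinee n receives category k-1 from rater j
%   B_n         ability of examinee n
%   C_j         severity of rater j
%   F_k         difficulty of the step from category k-1 to category k
\[
  \ln\!\left( \frac{P_{njk}}{P_{nj(k-1)}} \right) = B_n - C_j - F_k
\]
```

Under this form, a more severe rater (larger $C_j$) lowers the log-odds of awarding the higher of two adjacent categories to any examinee. Additional facets, such as the writing task, would enter as further subtracted terms.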
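Subject area: Applied Psychology
Language: Chinese
Document type: Thesis
Identifier: http://ir.psych.ac.cn/handle/311026/20553
Collection: Research Laboratory of Health and Genetic Psychology (健康与遗传心理学研究室)
Author affiliation: Institute of Psychology, Chinese Academy of Sciences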
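Recommended citation
GB/T 7714
Liu Sijia. The Objectiveness of Writing Assessments: A Comparison between the Multistage Rating Augmentation Method and the Holistic Rating Method [D]. Beijing: Graduate University of Chinese Academy of Sciences, 2015.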
Files in this item
File name/size: 作文评分的客观性:分步增值评分模式和整体 (915KB)
Document type: Thesis
Access type: Restricted access
License: CC BY-NC-SA
File name: 作文评分的客观性:分步增值评分模式和整体双评评分模式的比较研究.pdf
Format: Adobe PDF
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.