作文评分的客观性:分步增值评分模式和整体双评评分模式的比较研究
Alternative Title: The Objectiveness of Writing Assessments: A Comparison between the Multistage Rating Augmentation Method and the Holistic Rating Method
刘斯佳
Subtype: Master's degree (equivalent academic qualification)
2015-07
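Degree Grantor: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Beijing
Degree Discipline: Psychology
Keyword: writing examination; holistic rating; multistage rating augmentation; multifaceted Rasch model; rating efficiency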
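Abstract: Essay-based assessment is favored because it is thought to reflect actual ability better, yet its rating quality and objectivity remain in question. Rating quality has two aspects: first, the degree to which scores correspond to ability, as quantified by statistical indices; second, the practical significance of scores being validly interpreted and used. This study is the first to adopt the multistage rating augmentation method introduced by Wang et al. (2012) alongside the traditional holistic double-rating method. Using traditional measurement models and the multifaceted Rasch model, it examined how 500 scripts from a national-level writing examination were rated under each method. A subset of the scripts was also rated by experts to examine the rating error and rating efficiency of the two methods. Study 1 found that inconsistent ratings were very common under both methods; compared with the traditional double-rating method, the multistage rating augmentation method produced a more reasonable score distribution and better rating consistency. Study 2 found that the multistage rating augmentation method had lower rating error than the traditional double-rating method; indeed, the error of the first rating alone under the multistage method was smaller than that of the double-rated result under the traditional method. Study 3 found that the multistage rating augmentation method produced a more reasonable distribution of probability curves than the traditional double-rating method. Study 4, using the multifaceted Rasch model, found that individual raters performed better under the multistage rating augmentation method in terms of rating bias and the discrimination of rating scores. Study 5 found that the multistage rating augmentation method achieved better rating efficiency than the traditional double-rating method while keeping rating error under control. Finally, Study 6 found that raters of different genders and education levels performed better under the multistage rating augmentation method in terms of rating bias and the discrimination of rating scores. The results indicate that the multistage rating augmentation method does improve a range of statistical indices of rating scores, and that it is also superior at the practical level of valid interpretation and use. The method therefore makes a constructive contribution to both aspects of rating quality. Future research could examine the rating quality shown by different types of raters, and could analyze rating dimensions quantitatively to simplify and clarify the scoring criteria for subjective test items.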
Other Abstract: Writing assessment is regarded as better representing candidates' real-life abilities and is therefore widely favored, but the objectiveness of such assessment remains questionable. The quality of an assessment has two aspects: first, how well test scores represent the targeted ability of an individual; second, how the test results are interpreted and used in practice. This research is the first to apply the multistage rating augmentation method of writing assessment introduced by Wang et al. (2012) alongside the traditional holistic rating method, assessing the soundness of each in scoring 500 papers from a national-level writing assessment. The analysis was based on both traditional psychometric methods and the multifaceted Rasch model. The research also selected a portion of the papers to be rated by a panel of experts in order to assess rating errors and efficiency. Study 1 suggests that, under both the multistage rating augmentation method and the holistic rating method, the two raters often did not assign the same score level; however, the multistage rating augmentation method showed a better score distribution and better consistency. Study 2 suggests that the multistage rating augmentation method showed smaller rating errors than the holistic rating method, and that this held even when only one set of the multistage rating scores was included. Study 3 further suggests that the multistage rating augmentation method demonstrated a better category probability curve distribution than the traditional holistic rating method. Study 4 found better rater performance in terms of misfit and discrimination under the multistage rating augmentation method than under the holistic rating method. Study 5 suggests that the multistage rating augmentation method achieved better rating efficiency than the traditional holistic rating method, maintaining smaller rating errors with a single rater. Lastly, Study 6 showed that raters of different genders and education levels performed better in terms of misfit and discrimination under the multistage rating augmentation method than under the holistic rating method. The results show that the multistage rating augmentation method improved how well test scores represent the targeted abilities in a statistical sense, and that it was also superior in terms of the interpretation and use of the tests in a practical sense. Thus, the multistage rating augmentation method makes a constructive contribution to the quality of subjective assessment. Future studies could further investigate the rating quality of different categories of raters, and could clarify the dimensions of writing assessment in order to simplify and make explicit the objective evidence underlying subjective assessment.
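Subject Area: Applied Psychology
Language: Chinese
Document Type: Thesis
Identifier: http://ir.psych.ac.cn/handle/311026/20553
Collection: Health and Genetic Psychology Research Laboratory (健康与遗传心理学研究室)
Affiliation: Institute of Psychology, Chinese Academy of Sciences (中国科学院心理研究所)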
Recommended Citation
GB/T 7714
刘斯佳. 作文评分的客观性:分步增值评分模式和整体双评评分模式的比较研究[D]. 北京. 中国科学院研究生院,2015.
Files in This Item:
File Name/Size | DocType | Version | Access | License
作文评分的客观性:分步增值评分模式和整体 (915KB) | Thesis | | Restricted access | CC BY-NC-SA
File name: 作文评分的客观性:分步增值评分模式和整体双评评分模式的比较研究.pdf
Format: Adobe PDF
 

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.