Introduction to Many-Facet Rasch Measurement

Analyzing and Evaluating Rater-Mediated Assessments. 2nd Revised and Updated Edition


Thomas Eckes

Since the early days of performance assessment, human ratings have been subject to various forms of error and bias. Expert raters often assign different ratings to the very same performance, and assessment outcomes seem to depend largely on which raters happen to assign the rating. This book provides an introduction to many-facet Rasch measurement (MFRM), a psychometric approach that establishes a coherent framework for drawing reliable, valid, and fair inferences from rater-mediated assessments, thus addressing the problem of fallible human ratings. Revised and updated throughout, the Second Edition includes a stronger focus on the Facets computer program, emphasizing the pivotal role that MFRM plays in validating the interpretations and uses of assessment outcomes.
3. Rater-Mediated Assessment: Meeting the Challenge


Human raters are fallible: Each time a rater provides a score that is meant to express an evaluation of the quality of an examinee’s response to a particular task, that score is likely to misrepresent the proficiency of the examinee to some extent. However, raters are not the only source of measurement error. A host of other facets also come into play that may have a similarly adverse impact on the assessment outcomes. This chapter first discusses the notorious problem of rater variability and presents the standard approach to dealing with that problem. Then the essay rating data are used to illustrate the computation and interpretation of different indices of interrater reliability and to highlight the limitations of the standard approach. The final section outlines the conceptual–psychometric framework that underlies the Rasch measurement approach to meeting the challenge of rater-mediated assessment.

3.1  Rater variability

Performance assessments typically employ constructed-response items, which require examinees to create a response rather than choose the correct answer from the alternatives given. To arrive at scores that capture the intended proficiency, raters have to closely attend to, interpret, and evaluate the responses that examinees provide. In keeping with the rater cognition perspective on performance assessment (e.g., Bejar, 2012; Lumley, 2005; see also Section 10.4), the process of assessing examinee performance can be described as a complex and indirect one: Examinees respond to assessment items or tasks designed to represent the underlying construct, and raters perceive,...
