Introduction to Many-Facet Rasch Measurement

Analyzing and Evaluating Rater-Mediated Assessments. 2nd Revised and Updated Edition


Thomas Eckes

Since the early days of performance assessment, human ratings have been subject to various forms of error and bias. Expert raters often come up with different ratings for the very same performance, and assessment outcomes appear to depend largely on which raters happen to assign the rating. This book provides an introduction to many-facet Rasch measurement (MFRM), a psychometric approach that establishes a coherent framework for drawing reliable, valid, and fair inferences from rater-mediated assessments, thus addressing the problem of fallible human ratings. Revised and updated throughout, the Second Edition includes a stronger focus on the Facets computer program, emphasizing the pivotal role that MFRM plays in validating the interpretations and uses of assessment outcomes.

7. Criteria and Scale Categories: Use and Functioning


Most often, raters provide judgments using rating scales, in which ordered categories are supposed to represent successively higher levels of performance. In analytic scoring, raters consider a given set of features of the performance of interest and provide a separate score for each feature or criterion. In another approach, holistic scoring, raters assign a single score to the whole performance. With either scoring approach, raters need to distinguish between the different scale categories, awarding the score that best fits the performance at hand. In the case of analytic scoring, they additionally need to distinguish between the different scoring criteria or subscales. The first section of this chapter deals with measurement results that are relevant to evaluating the functioning of a given set of scoring criteria. The second section points out differences between manifest and latent rating scale structures. The final section presents statistical indicators of the usefulness and effectiveness of rating scale categories.

7.1 Criterion measurement results

In the sample data, global impression, task fulfillment, and linguistic realization were the elements of the criterion facet. Each criterion represented a distinct set of attributes that raters took into account when scoring an essay. One of the main goals of analyzing the criterion facet was to provide insight into the relative difficulty of the criteria, the precision of the difficulty estimates, and the degree to which the criteria worked together to define a single...
