
Validating Analytic Rating Scales

A Multi-Method Approach to Scaling Descriptors for Assessing Academic Speaking


Armin Berger

This book presents a unique inter-university scale development project, with a focus on the validation of two new rating scales for the assessment of academic presentations and interactions. The use of rating scales for performance assessment has increased considerably in educational contexts, but empirical research investigating the effectiveness of such scales remains scarce. The author reports on a multi-method study designed to scale the level descriptors on the basis of expert judgments and performance data. The salient characteristics of the scale levels offer a specification of academic speaking, adding concrete details to the reference levels of the Common European Framework. The findings suggest that validation procedures should be mapped onto theoretical models of performance assessment.

10 Conclusion

The primary purpose of this study was to investigate the progression of speaking proficiency operationalised by the Austrian English Language Teaching and Testing (ELTT) initiative – a group of applied linguists and language teaching experts from four Austrian university English departments – in two analytic rating scales. The main focus was the extent to which the level descriptors of the ELTT rating scales represent an empirically sound pattern of increasing speaking proficiency. Although the ELTT group had used a methodologically triangulated approach to scale construction, it would have been wrong to assume that the scales formed implicational scales of speaking proficiency a priori. Instead, empirical research needed to show that the scale descriptors represent a meaningful continuum. To this end, a multi-method approach was designed to validate the ELTT scales. Building on the results of a preliminary investigation within classical test theory (Berger 2012), the present study developed a probability-based research design including multi-faceted Rasch analysis in order to relate the scale descriptors to actual speaking performances and to obtain an empirical scale of performance descriptions. The project can thus be considered a construct validation study that is methodologically related to the scaling approach most notably associated with the work of North (1995, 2000, 2002) and North and Schneider (1998) in the context of developing a common European framework for reporting language competency, which resulted in the illustrative scales of the Common European Framework of Reference for Languages (CEFR). The present project was conceptually different, however, in the...
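The study's multi-faceted Rasch analysis is not specified in this excerpt; as background, the many-facet Rasch model standardly used for rater-mediated performance assessment (associated with Linacre's many-facet Rasch measurement) can be sketched as follows, where the facet labels are illustrative rather than taken from the book:

```latex
% Illustrative many-facet Rasch model for rater-mediated speaking assessment.
% The facets shown (examinee, criterion, rater) are a common configuration;
% the ELTT study's exact model specification may differ.
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
% P_{nijk}   : probability that examinee n is rated in category k
%              on criterion i by rater j
% P_{nij(k-1)}: probability of the adjacent lower category k-1
% B_n : ability of examinee n
% D_i : difficulty of criterion i
% C_j : severity of rater j
% F_k : threshold of rating category k relative to category k-1
```

Under such a model, descriptor difficulties can be placed on a common logit scale with examinee abilities and rater severities, which is what allows level descriptors to be ordered empirically and compared against their intended scale levels.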
