Validating Analytic Rating Scales
A Multi-Method Approach to Scaling Descriptors for Assessing Academic Speaking
Excerpt
Table Of Contents
- Cover
- Title
- Copyright
- About the Author
- About the Book
- This eBook can be cited
- Table of Contents
- Acknowledgements
- List of figures
- List of tables
- List of abbreviations
- 1 Introduction
- 1.1 Background to the study
- 1.2 Statement of the problem
- 1.3 Purpose of the study
- 1.4 Research questions
- 1.5 Structure of the book
- 2 Performance assessment of second language speaking
- 2.1 Introduction to performance assessment
- 2.2 The speaking construct in performance assessment
- 2.2.1 Pre-communicative approaches
- 2.2.2 Models of communicative competence
- 2.2.3 Approaches to speaking
- 2.3 Models of performance assessment
- 2.3.1 McNamara (1996)
- 2.3.2 Skehan (1998, 2001)
- 2.3.3 Bachman (2002)
- 2.3.4 Fulcher (2003)
- 2.4 Rating scales in performance assessment
- 3 Rating scales
- 3.1 General characteristics
- 3.2 Types of rating scales
- 3.3 Theoretical and methodological concepts in rating scale development
- 3.3.1 Intuitive approaches
- 3.3.2 Theory-based approaches
- 3.3.3 Empirical approaches
- 3.3.4 Triangulation of approaches
- 3.4 Controversy over rating scales
- 4 Rating scale validation
- 4.1 Validity and validity evidence
- 4.2 Rasch-based rating scale validation
- 4.3 Dimensionality
- 4.4 Conclusion
- 5 The ELTT rating scales
- 5.1 The development process
- 5.1.1 Intuitive phase
- 5.1.2 Qualitative phase
- 5.2 The ELTT construct
- 5.2.1 Lexico-grammatical resources and fluency
- 5.2.2 Pronunciation and vocal impact
- 5.2.3 Structure and content
- 5.2.4 Genre-specific presentation skills: formal presentations
- 5.2.5 Content and relevance (interaction)
- 5.2.6 Interaction
- 5.3 Descriptor formulation
- 5.4 ELTT speaking ability
- 5.5 Conclusion
- 6 Descriptor sorting
- 6.1 Validating the ELTT scales
- 6.2 Rationale
- 6.3 Methodology
- 6.3.1 Participants
- 6.3.2 Instruments and procedures
- 6.4 Analysis
- 6.5 Results and discussion
- 6.5.1 Inter-rater reliability
- 6.5.2 Match between intended and empirical scale
- 6.5.3 Descriptor analysis
- 6.6 Preliminary conclusions
- 6.6.1 Level allocation
- 6.6.2 Specificity of proficiency levels
- 6.6.3 Descriptor wording
- 6.6.4 Recommendations for scale revision
- 6.7 Conclusion
- 7 Descriptor calibration
- 7.1 Rationale
- 7.2 Analysis
- 7.2.1 Rasch measurement
- 7.2.2 Specification of a measurement model and FACETS output
- 7.2.3 Measurement quality control
- 7.2.4 Descriptor analysis
- 7.3 Results and discussion
- 7.3.1 Measurement quality control
- 7.3.2 Dimensionality of descriptors
- 7.3.3 The proficiency continuum
- 7.3.4 Cut-off points and content integrity
- 7.4 Conclusion
- 8 Descriptor-performance matching
- 8.1 Rationale
- 8.2 Methodology
- 8.2.1 Participants
- 8.2.2 Instruments and procedures
- 8.2.3 Data collection
- 8.3 Analysis
- 8.3.1 Specification of a measurement model
- 8.3.2 Measurement quality control
- 8.4 Results and discussion
- 8.4.1 Measurement quality control
- 8.4.2 Dimensionality of descriptors
- 8.4.3 The proficiency continuum
- 8.4.4 Cut-off points and content integrity
- 8.5 Conclusion
- 8.6 Comparison of methods
- 9 Revision of the ELTT scales
- 9.1 Establishing a quality hierarchy of descriptor units
- 9.2 The quality of descriptor units
- 9.3 Constructing the revised scales
- 9.4 Common points of reference
- 9.5 The modified versions of the ELTT scales
- 10 Conclusion
- 10.1 Summary
- 10.2 Theoretical implications
- 10.3 Practical recommendations
- 10.4 Limitations of the study
- 10.5 Suggestions for further research
- 10.6 Concluding statement
- 11 References
- 12 Appendix
- 12.1 Appendix 1: Original ELTT rating scales
- 12.2 Appendix 2: Sorting task questionnaire
- 12.3 Appendix 3: Consensual scales based on descriptor sorting
- 12.4 Appendix 4: Descriptor unit measurement report (descriptor calibration)
- 12.5 Appendix 5: All facet vertical ruler (sorting task)
- 12.6 Appendix 6: Speaking tasks
- 12.7 Appendix 7: Rating sheets
- 12.8 Appendix 8: Rater guidelines
- 12.9 Appendix 9: Student measurement report (descriptor-performance matching)
- 12.10 Appendix 10: All facets vertical ruler (descriptor-performance matching)
- 12.11 Appendix 11: Descriptor unit measurement report (descriptor-performance matching)
I would like to express my sincere gratitude to all those – far too numerous to mention here – who supported me during my academic journey. In particular, I wish to thank Christiane Dalton-Puffer, Günther Sigott, Tim McNamara, Charles Alderson, Ari Huhta, Rita Green, and Hermann Cesnik for the opportunity to discuss my work with them. Their insightful, instructive, and wholly useful feedback helped me shape this research. The responsibility for any errors or inadequacies that may occur in this work, of course, is entirely my own.
Thank you for sharing your great expertise!
Furthermore, I would like to express my gratitude to the members of the ELTT group who developed the two analytic rating scales I was fortunate enough to investigate: Martina Elicker, Helen Heaney, Martin Kaltenbacher, Gunther Kaltenböck, Thomas Martinek, and Benjamin Wright. Working with them has been an enjoyable and educational experience.
Thank you for your commitment to professionalism!
I am deeply indebted to my colleagues who participated as raters in the project: Nancy Campbell, Lucy Cripps, Dianne Davies, Grit Frommann, Meta Gartner-Schwarz, Anthony Hall, Helen Heaney, Claire Jones, Katharina Jurovsky, Gunther Kaltenböck, Christina Laurer, Sandra Pelzmann, Michael Phillips, Horst Prillinger, Karin Richter, Angelika Rieder-Bünemann, Jennifer Schumm Fauster, Gillian Schwarz-Peaker, Nicholas Scott, Susanne Sweeney-Novak, Andreas Weissenbäck, and Sarah Zehentner. I greatly appreciate their willingness to share their expertise and devote time – often enormous amounts – to the project for nothing but sincere gratitude in return.
Thank you for your academic idealism!
I would also like to thank all our students who generously consented to take part in the study. The spectacle of a mock exam and the doubtful privilege of being able to consider themselves participants in a study were a poor reward for real motivation and great service.
Thank you for your academic curiosity!
On a personal note, I am extremely fortunate to have had the wholehearted love and support of my family and friends. It was their patience and understanding that helped me manage to juggle a full-time teaching job, a research project, and many other professional activities. Words cannot describe the gratitude I feel towards my wife, Angela, who is the greatest source of inspiration in my life, bar none.
Sorry for not always having my priorities right!
Figure 1: Components of language competence (Bachman 1990: 87)
Figure 2: Components of language competence (Bachman & Palmer 1996: 63)
Figure 3: Levelt’s blueprint for the speaker (Levelt 1989: 9)
Figure 4: A summary of oral skills (Bygate 1987: 50)
Figure 5: Variables influencing performance in a speaking test (McNamara 1996: 86)
Figure 6: Skehan’s (1998: 172) model of oral test performance
Figure 7: Bachman’s (2002: 467) expanded model of oral test performance
Figure 8: Fulcher’s (2003: 115) expanded model of speaking test performance
Figure 9: A framework for describing approaches to rating scale development
Figure 10: Messick’s (1989: 20) facets of validity
Figure 11: Facets of rating scale validity (Knoch 2009: 65)
Figure 12: The ELTT scale development process
Figure 13: The ELTT model of speaking ability
Figure 14: Scale category probability curves (descriptor sorting)
Figure 15: Task specifications
Figure 16: Scale category probability curves (descriptor-performance matching)
Figure 17: Classification instrument for assessing descriptor unit quality
Figure 18: Common reference points and descriptor keywords
Figure 19: An expanded model of performance assessment, based on Fulcher (2003) and Knoch (2009)
Figure 20: An expanded model for rating scale development
Details
- Pages
- 395
- Publication Year
- 2016
- ISBN (PDF)
- 9783653061833
- ISBN (ePUB)
- 9783653960426
- ISBN (MOBI)
- 9783653960419
- ISBN (Hardcover)
- 9783631666913
- DOI
- 10.3726/978-3-653-06183-3
- Language
- English
- Publication date
- 2015 (December)
- Keywords
- Language testing, Language assessment, Assessing speaking, Performance assessment
- Published
- Frankfurt am Main, Berlin, Bern, Bruxelles, New York, Oxford, Wien, 2015. 395 pp., 39 tables