This concept pertains to whether an assessment actually measures the construct it aims to measure
validity
Psychometric measurement tools (like the TOEFL) typically have high reliability and validity but rate low on this concept
authenticity
This concept describes whether a measurement tool yields the same result on repeated administrations, or whether two versions of a test yield the same results
reliability
This type of assessment often has negative washback if students study for the test to the exclusion of real-world language
Traditional Assessment (or Assessment of Learning)
This reflects language that is as natural as possible, is set in meaningful contexts, and mirrors real-world communication
authenticity
While they may score lower on reliability and practicality, these assessment methods tend to have a positive washback effect
Alternative Assessment
This addresses the feasibility of developing an assessment, administering it, and interpreting its results, such as whether it can be used for large-scale evaluation
practicality
Alternative assessments often score lower in this category because they are frequently time-consuming
practicality
This refers to the impact (positive or negative) that an assessment has on pedagogy and learning
washback effect
Using carefully designed rubrics in alternative assessment is a strategy for improving this concept
reliability