Basic Principles
Item Development
Scoring
Advanced Principles
Technology-Based Assessment
100

The relationship between the resources that are available and those that are needed in the development and use of the assessment.

What is PRACTICALITY / FEASIBILITY?

100

A detailed plan that specifies the content and format of an assessment, and the procedures and instructions for administering an assessment.

What is an assessment BLUEPRINT?

100

An assessment record that is reported as a number or a letter, for example, as the number of correct responses, a rating, a percentage, or a grade or a mark (e.g. A, B, C or 1, 2, 3).

What is a SCORE?

100

The degree to which the format and content of the assessment tasks and all aspects of the administration of the assessment are free from bias that may favor or disfavor some students.

What is IMPARTIALITY?

100

The personalized delivery of assessments to test takers with optimized precision in estimating ability.

What is ADAPTIVE TESTING / COMPUTERISED-ADAPTIVE TESTING?
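
One way to picture "optimized precision" is a rule that always gives the test taker the most informative unused item. The sketch below is an illustration only, and it makes assumptions that the card does not: it uses the Rasch (one-parameter) model, where an item is most informative when its difficulty is closest to the current ability estimate, and the function names and the small difficulty list are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch (1PL) model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_information(ability, difficulty):
    """Fisher information of a Rasch item at the given ability: p * (1 - p)."""
    p = rasch_probability(ability, difficulty)
    return p * (1.0 - p)


def select_next_item(ability, difficulties, administered):
    """Return the index of the unused item with the highest information."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    return max(candidates, key=lambda i: item_information(ability, difficulties[i]))


# Hypothetical item difficulties on the logit scale.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]

# With an ability estimate of 0.8 and item 3 already used,
# the closest remaining difficulty is 0.0 (index 2).
next_item = select_next_item(0.8, difficulties, administered={3})
```

An operational adaptive test would also re-estimate ability after each response and apply content and exposure constraints, which this sketch leaves out.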

200

The degree to which students’ performances on different assessments (e.g., different administrations, tasks, and scorers/raters) in the same area of language ability yield essentially the same assessment records.

What is RELIABILITY / CONSISTENCY?

200

A response to an assessment task in which students select one or more responses from among several possible responses that are given.

What is a SELECTED RESPONSE?

200

A way of expressing students’ levels of achievement in terms of a number, e.g. 3, 2, 1, with “3” being the highest.

What is a MARK?

200

The effect or impact of using an assessment on instruction and learning.

What is WASHBACK?

200

A fixed test form administered to test takers, in which the items do not change or update once the candidate starts taking the test.

What is LINEAR TEST DESIGN?

300

The extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment.

What is VALIDITY?

300

Activities and procedures, task characteristics, and a scoring method in an assessment task template.

What is an ASSESSMENT TASK TYPE?

300

The specification for how the teacher will arrive at a score on the basis of students’ performances. It consists of the criteria for evaluating the correctness or quality of the students’ responses, the score to be reported, and the procedures to be followed in arriving at a score.

What is a SCORING METHOD?

300

Being able to convince stakeholders that the intended uses of an assessment are justified.

What is ACCOUNTABILITY?

300

A technological method that uses response matching or text/natural language processing to review and evaluate text responses in a reproducible way that matches defined scoring rubrics and agrees with human raters. It may use non-AI software to perform a manual task or an AI-based scoring system.

What is AUTOMATED SCORING?

400

The degree of correspondence of the characteristics of a given language test task to the features of the target language task.

What is AUTHENTICITY?

400

Aspects or features of language use tasks that provide a way to describe the task with more precision than simply giving it a label.

What are TASK CHARACTERISTICS?

400

Also known as a scoring rubric; this specifies different levels of the ability to be assessed and provides descriptors for each of these levels.

What is a RATING SCALE?

400

The degree to which different test takers who are at equivalent levels in their ability to be assessed have equivalent chances of being classified into the same group.

What is EQUITABILITY?

400

An electronic database of test item content, associated attributes (e.g., scoring key, content classification, cognitive level, enemy items) and metadata (e.g., item statistics, historical use), from which test forms may be drawn manually or automatically (in the case of linear-on-the-fly testing, LOFT) or items may be selected individually for test delivery (in the case of CAT).

What is an ITEM BANK / ITEM POOL?
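
As a rough illustration of drawing a form automatically from a bank, the sketch below stores a content classification and a set of enemy items for each item, then fills a simple blueprint while never placing two enemy items on the same form. The `Item` class, the `assemble_form` function, the item IDs, and the greedy selection rule are all hypothetical and are not taken from any particular item-banking system.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    item_id: str
    content_area: str
    # IDs of items that must not appear on the same form as this one.
    enemies: set = field(default_factory=set)


def assemble_form(bank, blueprint):
    """Greedily pick items per content area, skipping enemies of chosen items."""
    form = []
    chosen_ids = set()
    blocked_ids = set()
    needed = dict(blueprint)  # content area -> number of items still required
    for item in bank:
        if needed.get(item.content_area, 0) <= 0:
            continue  # this content area is already filled or not requested
        if item.item_id in blocked_ids or item.enemies & chosen_ids:
            continue  # would clash with an item already on the form
        form.append(item)
        chosen_ids.add(item.item_id)
        blocked_ids |= item.enemies
        needed[item.content_area] -= 1
    return form


# Hypothetical bank: R1 and R2 are enemy items.
bank = [
    Item("R1", "reading", {"R2"}),
    Item("R2", "reading", {"R1"}),
    Item("R3", "reading"),
    Item("L1", "listening"),
]

form_ids = [item.item_id for item in assemble_form(bank, {"reading": 2, "listening": 1})]
```

Here `form_ids` comes out as `["R1", "R3", "L1"]`: R2 is skipped because it is an enemy of R1. A real LOFT engine would typically also randomize selection and balance item statistics across forms.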

500

A quality of assessment use that depends on how well each link from students’ performance on assessment tasks to assessment records, to decisions, and to consequences can be supported or justified to stakeholders.

What is FAIRNESS?

500

A response to an assessment task in which students’ responses can vary from a single word or phrase (spoken or written), to a single sentence or utterance.

What is a LIMITED PRODUCTION RESPONSE?

500

A grade or mark that reports how well a student has achieved the learning objectives of a course of instruction.

What is a CRITERION-REFERENCED GRADE / MARK?

500

The degree to which the assessment-based interpretations apply or extend to students’ target language use domains.

What is GENERALIZABILITY?

500

A type of assessment or evaluation that uses elements or mechanics from an artificial conflict, defined by rules, to measure a person’s skills, knowledge, and abilities, including elements such as scoring systems, feedback mechanisms, time limits, and challenges that increase in difficulty as the assessment progresses.

What is GAME-BASED ASSESSMENT?