In Scenario 1, this type of quiz begins class and is not recorded.
What is a pop reading quiz (formative, 10 short-answer items)?
Bachman & Palmer’s umbrella concept asking “Does the test do what it’s meant to do?”
What is test usefulness?
“Students will learn tag questions” is not assessable until you state this clearly.
What is the performance/construct (what learners can DO, in which mode/context)?
“Skills assessed, item types, tasks, scoring, and reporting” collectively form this plan.
What are test specifications (specs)?
The correct MCQ choice is the key; all others are called these.
What are distractors?
Scenario 2’s course focuses on this language domain and ends with MCQ, cloze, and editing.
What is a grammar unit test on verb tenses?
This impact focuses on how tests positively influence teaching/learning before/after.
What is beneficial washback?
For Scenario 2, content validity suggests sampling these two broad performance types.
What are comprehension and production?
Classroom specs should align with this validity principle by mirroring course tasks.
What is content validity (representativeness)?
MCQs are tempting for practicality/reliability but mainly test this kind of knowledge.
What is recognition/selective response?
Scenario 3 values this over quantity during a 90-minute in-class task with later peer conferences.
What is quality of writing in a midterm essay?
One checklist question asks if results should be used to judge these predetermined targets.
What are curricular standards?
Give one example of a precise performance statement for the tense unit.
What is “recognize written present perfect forms in familiar contexts” (or “produce oral simple past in familiar contexts”)?
Scenario 3 specs include stating these four rubric dimensions on the prompt.
What are content, organization, rhetorical discourse, grammar/mechanics?
This index is simply the % who got an item right (ideal band ≈ .15–.85).
What is item facility (IF)?
Scenario 4 combines a 20-minute listening section with this 3-minute task per student.
What is a one-on-one oral interview?
Deciding not to test because it’s just “Week 3 Friday” violates this planning step.
What is determining purpose (don’t test by habit; test for a reason)?
Listing which verbs/regular vs. irregular forms will be covered belongs here, not as vague goals.
What are detailed construct specifications (refining constructs)?
In multi-skill exams (Scenario 4), specs often include these child-friendly stimuli.
What are pictures/picture-cued tasks?
This index shows how well an item separates high/low performers (closer to 1.0 = better).
What is item discrimination (ID)?
Name two reasons Scenario 1’s quiz offers beneficial washback despite no grade.
What are (a) self-assessment of comprehension and (b) a springboard for teacher-led discussion?
Name three distinct stakeholders/targets that the purpose/usefulness checklist explicitly considers.
What are students, the teacher’s next teaching steps, and the course’s significance (relative weighting/impact)?
Why might you exclude some objectives from one test, even if taught?
What are practical constraints (time/weighting), prior informal evidence, and focus on priority constructs?
Give two practical decisions specs force you to make for Scenario 2’s grammar test.
What are (a) which subskills to include/weight and (b) how to elicit/score (e.g., record oral vs. skip due to practicality)?
Name two red flags for weak distractor efficiency.
What are (a) a distractor no one chooses and (b) distractors attracting stronger students more than weaker ones?