A first-grade teacher uses her student's high score on a phonics check to decide they are ready for a more difficult reading group. She understands that this is about whether her reasoned conclusion or interpretation is accurate for that specific classroom decision.
What is validity?
This occurs when a third-grade teacher gives a quiz on Wednesday and assumes that if her student, “Dazzling Dora,” took it again on Thursday, her score would be close or equivalent to the previous day’s score.
What is test-retest reliability?
A veteran teacher explains that proving a test's quality is “much like the closing argument an attorney presents to a jury.” The jury (staff/team/committee) weaves the evidence together to reach a verdict on the test’s suitability.
What is a validity argument?
A second-grade teacher describes this trait as validity’s best friend and a synonym for consistency. This is an attribute she cherishes as much as “motherhood, apple pie, or cuddly puppies” when it comes to creating grade-level tests.
What is reliability?
When a student misses the initial benchmark test due to having a virus, the school counselor provides another (supposedly equivalent) version of the test to ensure it is an equally difficult challenge.
What is alternate-form reliability?
A third-grade teacher reminds his team that while a test's score might be accurate, the team must also gather this type of evidence to see if the score appropriately contributes to accomplishing the test’s primary purpose.
What is fitness-for-purpose evidence?
An elementary principal reminds her staff that, because they can’t see inside a child's head with a “microscope or a ritzy magnetic resonance imaging (MRI) machine,” they must use a student's overt behavior to estimate their hidden knowledge.
What is a score-based inference (or interpretation)?
A second-grade team is told that if their new reading test measures three separate skills but has a very high coefficient for this, it might be a bad sign because the items should only measure a unitary thing.
What is internal consistency reliability?
A fourth-grade teacher realizes that his student's reading performance varies naturally because of reliability wrinkles like the time of day or the student's perception regarding the importance of the task.
What are inconsistent variables?
A fourth-grade teacher explains to a parent that since test results aren’t always 100% accurate, she uses this plus-or-minus error band to show the range where a student's true score likely falls.
What is the standard error of measurement (SEM)?
During a Tier II team meeting, the academic coach noticed that a student's performance changed from a 96 to a 76. Even though the score decreased, this did not cause her concern because the student passed both times, and the final placement decision remained unchanged.
What is a decision-consistency estimate?
A school district hires this person to provide plain-language explanations of technical assessment data so that teachers and parents can use some “good old common sense” to judge a particular test.
What is a measurement specialist (or assessment consultant)?
A fifth-grade team wants to ensure their new math benchmark has this specific trait, which proves the test can actually help teachers tell the difference between well taught and poorly taught students.
What is instructional sensitivity?
A grade-level team lead cautions her PLC not to be impressed or fooled by a developer who “proudly trots out” evidence of item homogeneity as proof of a test’s quality at a district-wide professional learning conference. She points out that this kind of evidence only matters when a test is supposed to measure one skill, not several distinct cognitive abilities.
What is the misapplication of reliability evidence?
This numerical indicator, which can range from plus 1.00 to a minus 1.00, is used by teachers to represent the strength and direction of the relationship between two sets of scores.
What is a correlation coefficient?