When a test appears, to respondents or other stakeholders, to be measuring what it purports to measure.
What is face validity?
100
The extent to which the measurement procedure is consistent and precise. Can be expressed as the standard error of measurement (SEM), Cronbach’s alpha, etc.
What is reliability?
100
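As a rough illustration of the two indices named in the reliability clue, the sketch below computes Cronbach's alpha from a small examinee-by-item score table and converts a reliability coefficient into an SEM. The function names and the data are hypothetical and are not taken from the cards.

```python
from math import sqrt
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one row per examinee, one column per item.
    """
    k = len(scores[0])  # number of items
    # Variance of each item across examinees.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the total (summed) score across examinees.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


# Hypothetical data: 4 examinees, 3 items scored 0-5.
example_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
]
print(cronbach_alpha(example_scores))
print(standard_error_of_measurement(sd=10, reliability=0.91))
```

A higher alpha and a smaller SEM both point to more consistent scores.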
A teacher used test results to learn how well students were mastering what she taught during the course of instruction.
What is “Formative evaluation”?
100
What type of item measures knowledge and is well suited for measuring associations between facts?
What are matching questions?
100
An aspect of validity where scores are related to other similar or different measures.
What is the external aspect of construct validity?
200
An explicit plan that guides test construction.
What is a test blueprint?
200
Errors could come from content sampling, time sampling, and inter-rater differences.
What are the main factors that make test scores unreliable?
200
At the end of high school, students were given a comprehensive test on all disciplines.
What is “Summative evaluation”?
200
A round green vegetable that is often left on plates at the dinner table, or the ratio of the number of examinees who got the item correct to the total number of examinees.
What is p or p-value?
200
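The ratio in the p-value clue can be sketched in a few lines, assuming each response to an item is scored 1 (correct) or 0 (incorrect). The function name and the responses are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of examinees who answered the item correctly.

    responses: list of 0/1 scores on a single item, one per examinee.
    """
    return sum(responses) / len(responses)


# Hypothetical item: 7 of 10 examinees answered correctly.
print(item_difficulty([1, 1, 0, 1, 0, 1, 1, 0, 1, 1]))  # 0.7
```

A p near 1 indicates an easy item; a p near 0 indicates a hard one.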
Content, substantive, structural, generalizability, external, and consequential.
What is the unified concept of validity?
300
Content-, criterion-, and construct-related
What is the "holy trinity" of validity?
300
Group variability, test length, raters and judges, level of trait, and inconsistency of examiners.
What are factors affecting reliability?
300
SAT/GRE, Stanines, NCE scores
What are “Examples of normalized standard scores”?
300
When designing these types of questions, the researcher should make all of the response options grammatically consistent with the stem and parallel in form.
What is a multiple choice item?
300
This aspect of construct validity provides evidence for evaluating both the intended and unintended consequences of score interpretation and use in the short and long term.
What is the consequential aspect of construct validity?
400
One example is when colleges use student SAT scores as a factor in admitting students.
What is criterion-related predictive validity?
400
95% of the time, Yun’s true score falls between 20.23 and 25.67.
What is a confidence interval?
400
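One common way to build an interval like the one in the clue is observed score plus or minus z times the SEM. The sketch below assumes that approach; the observed score of 50 and SEM of 3 are made-up values, not Yun's, since the card gives neither.

```python
from statistics import NormalDist


def score_confidence_interval(observed, sem, level=0.95):
    """Interval around an observed score: observed +/- z * SEM."""
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sem
    return observed - margin, observed + margin


# Hypothetical examinee: observed score 50, SEM 3.
low, high = score_confidence_interval(observed=50.0, sem=3.0)
print(f"{low:.2f} to {high:.2f}")  # 44.12 to 55.88
```

A smaller SEM, meaning a more reliable test, gives a narrower interval.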
50% of scores are at or below this score.
What is “The median”?
400
What was the cause of WWI?
What is a poorly worded test question?
400
This cross table has four components: test interpretation, test use, evidential basis, and consequential basis.
What is the progressive matrix?
500
Correlations between measures of the same trait obtained using different methods in the multitrait-multimethod matrix.
What is the validity diagonal, or what are the monotrait-heteromethod correlations?
500
No tests are perfectly reliable, but this equation helps us estimate what the correlation between two tests would be if both were perfectly reliable...
What is the correction for attenuation?
500
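The correction named in the answer above is usually written as the observed correlation divided by the square root of the product of the two reliabilities. Below is a minimal sketch of that formula; the function name and the input values are hypothetical.

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between true scores on tests X and Y.

    r_xy: observed correlation between the two tests
    r_xx, r_yy: reliability coefficients of test X and test Y
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must be in the interval (0, 1]")
    return r_xy / sqrt(r_xx * r_yy)


# Hypothetical values: observed r = .60, reliabilities .81 and .64.
print(round(correct_for_attenuation(0.60, 0.81, 0.64), 3))  # 0.833
```

The corrected value is never smaller in magnitude than the observed correlation, and it can exceed 1 when the reliability estimates are too low.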
For this score type, the amount of growth is assumed to be equal from year to year.
What is “developmental standard score”?
500
Which item writing rule does the following question violate: Why should researchers not remain objective?
What is avoiding negative statements?
500
Messick defines this in a broader sense that subsumes both qualitative and quantitative summaries of observed consistencies or performance regularities.