Generalizability
Item Response Theory
Item Difficulty/Discrimination
Standard Setting
Norming
100
The populations of measurements that would be obtained by measuring under all conditions
What is the universe of generalization?
100
Plots the probability of responding correctly to an item as a function of the trait underlying performance.
What is an item characteristic curve?
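As a rough illustration of the card above, an item characteristic curve can be sketched with a logistic function. The two-parameter logistic form used here, the function name `icc_2pl`, and the parameter names `a` (discrimination) and `b` (difficulty) are assumptions chosen for the example; the card itself does not commit to a particular model.

```python
import math

def icc_2pl(theta, a=1.0, b=0.0):
    """Probability of a correct response at trait level theta.

    Assumes a two-parameter logistic model (an illustrative choice):
    a controls how steep the curve is, b is where P(correct) = 0.5.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Evaluate the curve at a few trait levels to see its S shape.
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"theta={theta:+.1f}  P(correct)={icc_2pl(theta):.3f}")
```

Under this assumed model, a larger `a` gives a steeper curve and a larger `b` shifts the curve toward higher trait levels.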
100
The proportion of examinees who answered a question correctly.
What is item difficulty?
100
Three ways that standard setting can be done.
What are judgments based on holistic impressions, on individual test items, and on the performance of examinees?
100
Why do we have different scoring methods (z-scores, grade equivalents, etc.)?
They are easier for the general population to understand and accept.
200
This measures reliability.
What is a generalizability study?
200
This accounts for the statistical dependence among questions.
What is a latent trait?
200
This formula is used to calculate item discrimination.
What is D = P_U - P_L?
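A minimal sketch of the discrimination index on this card, taking the first term as the proportion of the upper-scoring group that answered the item correctly and the second as the same proportion in the lower-scoring group. The function names, the sample data, and the choice of the top and bottom 27% of examinees as the two groups are assumptions for illustration only.

```python
def item_difficulty(examinees):
    """Proportion of all examinees who answered the item correctly."""
    return sum(correct for _, correct in examinees) / len(examinees)

def discrimination_index(examinees, fraction=0.27):
    """Upper-group proportion correct minus lower-group proportion correct.

    examinees: list of (total_test_score, item_correct) pairs,
               where item_correct is 1 (right) or 0 (wrong).
    fraction:  share of examinees in each extreme group (assumed 27%).
    """
    ranked = sorted(examinees, key=lambda pair: pair[0])
    n = max(1, round(len(ranked) * fraction))
    lower = [correct for _, correct in ranked[:n]]   # lowest total scores
    upper = [correct for _, correct in ranked[-n:]]  # highest total scores
    return sum(upper) / n - sum(lower) / n

# Hypothetical data: (total score, whether this item was answered correctly).
data = [(95, 1), (90, 1), (88, 1), (70, 1), (65, 0),
        (60, 1), (55, 0), (40, 0), (35, 1), (30, 0)]
p = item_difficulty(data)
d = discrimination_index(data)
print(f"difficulty p = {p:.2f}, discrimination D = {d:.2f}")
```

A positive value means high scorers answered the item correctly more often than low scorers did.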
200
He was responsible for the contrasting groups method and this other method.
Who is Nedelsky, and what is the borderline group method?
200
Should test revision be based on item analysis?
This should be done only for future versions of the test, not for the current version that has already been administered.
300
This compares how well one condition performs relative to another condition.
What is a decision study?
300
Theta prime denotes this.
What is the minimum latent trait score below which examinees cannot answer the question correctly?
300
This adjusts item difficulty based on the idea that examinees will eliminate some distractors even if they don't know the answer to the question.
What is Lord's adjustment?
300
This is what was learned from Mills's study of standard setting.
We should use at least three methods when setting standards, because at least two of the methods will converge.
300
Which type of sampling do you use when the variance within a particular subgroup is less than the total group variance?
Stratified random sampling.
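A small sketch of stratified random sampling with proportional allocation, to make the card above concrete. The function name `stratified_sample`, the `stratum_of` callback, the fixed seed, and the grade-level example data are all hypothetical; rounding each stratum's share can make the final sample size differ slightly from `total_n` in other data sets.

```python
import random

def stratified_sample(population, stratum_of, total_n, seed=0):
    """Draw a simple random sample within each stratum.

    Each stratum contributes in proportion to its share of the population
    (proportional allocation, an assumption made for this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = round(total_n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical norming population: 60 third graders and 40 fourth graders.
population = [("grade3", i) for i in range(60)] + [("grade4", i) for i in range(40)]
sample = stratified_sample(population, stratum_of=lambda unit: unit[0], total_n=10)
counts = {g: sum(1 for s in sample if s[0] == g) for g in ("grade3", "grade4")}
print(counts)
```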
400
The investigator considers the conditions in the D-study to be a sample from a larger number of conditions and intends to generalize to all of these conditions.
What is a random facet?
400
The slope of the ICC tells you this.
What is item discrimination?
400
This statistic should be used when describing how an examinee did on several tests of different lengths.
What is item difficulty?
400
When a judge assigns a high probability that minimally capable examinees will answer a hard question correctly and a lower probability that they will answer an easier question correctly.
What is intrajudge inconsistency?
400
What are the issues with using percentile rank?
Scores are less reliable in the central part of the distribution than in the extremes. Gains or losses of percentile ranks for individual examinees cannot be meaningfully compared for examinees at different points in the distribution. Arithmetic and statistical computations of percentile rank scores cannot be meaningfully interpreted in some situations.
500
Each examinee is rated by a different rater; there is only one rater for each examinee.
What is Design 3, or a nested design?
500
This is why unidimensionality does not equal local independence.
What is that unidimensionality concerns the number of latent traits underlying a test, and you can never say for sure that a single latent trait exists such that all items are locally independent?
500
This can determine the degree of stability in responses to the same dichotomous item from different examinees on different occasions.
What is phi?
500
The strategies for setting cutoff scores.
What are minimizing the probabilities of misclassification, the minimax procedure, and minimizing the expected cost of misclassification?
500
What is the standard error of measurement for the normal distribution of several simple random samples?
The standard deviation.