Foundation of Assessment Final

Generalizability

Item Response Theory

Item Difficulty/Discrimination

Standard Setting

Norming

100

The population of measurements that would be obtained by measuring under all conditions.

What is the universe of generalization?

100

Plots the probability of responding correctly to an item as a function of the trait underlying performance.

What is an item characteristic curve?
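
As a rough illustration of the curve described in this card, the sketch below plots nothing but tabulates the probability of a correct response against the trait level. It assumes a two-parameter logistic form, which the card itself does not specify; the function name icc_2pl and the parameters a (slope) and b (location) are illustrative choices, not part of the original material.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response at trait level theta.

    Assumes a two-parameter logistic model (an assumption for this sketch):
    a is the slope (discrimination) and b is the location (difficulty).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Tabulate the curve for a hypothetical item with a = 1.2 and b = 0.0.
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"theta={theta:+.1f}  P(correct)={icc_2pl(theta, a=1.2, b=0.0):.3f}")
```

Under this assumed model the probability is exactly 0.5 when theta equals b, and a larger a makes the curve rise more steeply around that point.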

100

The proportion of examinees who answered a question correctly.

What is item difficulty?

100

Three ways that standard setting can be done.

What are holistic impressions, individual test items, and the performance of examinees?

100

Why do we have different scoring measurements/methods (z-scores, grade equivalents, etc.)?

They are easier for the general population to understand and accept.

200

Measures reliability

What is a generalizability study?

200

This accounts for the statistical dependence among questions.

What is a latent trait?

200

This formula is used to calculate item discrimination.

What is D = Pu - PL?
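
A minimal sketch of the discrimination formula, assuming Pu and PL are the proportions of examinees answering the item correctly in an upper-scoring and a lower-scoring group. How those groups are formed (halves, top and bottom 27%, and so on) is not stated in the card, so the response lists below are made-up data and the function names are hypothetical.

```python
def proportion_correct(responses):
    """Proportion of 1s in a list of 0/1 item responses (item difficulty, p)."""
    return sum(responses) / len(responses)


def discrimination_index(upper, lower):
    """D = Pu - PL: upper-group proportion correct minus lower-group proportion."""
    return proportion_correct(upper) - proportion_correct(lower)


# Hypothetical responses to one item (1 = correct, 0 = incorrect).
upper_group = [1, 1, 1, 0, 1]   # Pu = 4/5
lower_group = [0, 1, 0, 0, 1]   # PL = 2/5

d = discrimination_index(upper_group, lower_group)
print(f"Pu={proportion_correct(upper_group):.2f}  "
      f"PL={proportion_correct(lower_group):.2f}  D={d:.2f}")
```

A positive D means the upper group did better on the item than the lower group; a D near zero or below suggests the item does not separate the two groups.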

200

He was responsible for the contrasting groups method and this other method.

Who is Nedelsky? What is the borderline group method?

200

Should test revision be based on item analysis?

Yes, but only for future versions of the test, not for the current version that has already been administered.

300

Compares how well one condition performs compared to another condition.

What is a decision study?

300

Theta prime denotes this.

What is the minimum latent trait score, below which examinees cannot answer the question correctly?

300

This adjusts item difficulty based on the idea that examinees will eliminate some distractors even if they don't know the answer to the question.

What is Lord's adjustment?

300

This is learned from Mills's study about standard setting.

We should use at least three methods when setting standards because at least two of the methods will converge.

300

Which type of sampling do you use whenever the variance within a particular subgroup is less than the total group variance?

Stratified random sampling.
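
To make the idea concrete, here is a small sketch of stratified random sampling: the population is split into subgroups (strata) and a simple random sample is drawn within each one. The proportional allocation, the strata names, and the function stratified_sample are all assumptions for illustration, not details from the card.

```python
import random


def stratified_sample(strata, fraction, seed=0):
    """Draw a simple random sample of the given fraction from each stratum.

    strata maps a stratum name to a list of its members. Proportional
    allocation (same fraction per stratum) is an assumption of this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, k)       # sampling without replacement
    return sample


# Hypothetical population: examinee IDs grouped by grade level.
population = {
    "grade_3": list(range(100, 120)),  # 20 examinees
    "grade_4": list(range(200, 208)),  # 8 examinees
}

chosen = stratified_sample(population, fraction=0.25)
for name, ids in chosen.items():
    print(name, sorted(ids))
```

Because every stratum contributes its own sample, a subgroup that is more homogeneous than the total group is represented in a controlled way rather than left to chance.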

400

The investigator considers the conditions in the D-study to be a sample from a larger number of conditions and intends to generalize to all of these conditions.

What is a random facet?

400

The slope of the ICC tells you this.

What is item discrimination?

400

This statistic should be used when describing how an examinee did on several tests of different lengths.

What is item difficulty?

400

When a judge assigns a high probability of minimally capable examinees answering a hard question correctly and a lower probability of those examinees answering an easier question correctly.

What is intrajudge inconsistency?

400

What are the issues with using percentile rank?

Percentile ranks are less reliable for scores in the central part of the distribution than for those at the extremes. Gains or losses in percentile rank cannot be meaningfully compared for examinees at different points in the distribution. Arithmetic and statistical computations on percentile ranks cannot be meaningfully interpreted in some situations.

500

Each examinee is rated by a different rater; there is only one rater for each examinee.

What is Design 3, or a nested design?

500

This is why unidimensionality does not equal local independence.

What is that unidimensionality is related to the number of latent traits on a test, and you can never say for sure that a single latent trait exists such that all items are locally independent?

500

This can determine the degree of stability in responses to the same dichotomous item by examinees on different occasions.

What is Phi?
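
The sketch below computes phi from the 2x2 table of two dichotomous (0/1) variables, here one item answered on two occasions. It uses the usual product-moment form for a 2x2 table; the response data and the function name phi_coefficient are hypothetical.

```python
import math


def phi_coefficient(first, second):
    """Phi for two paired lists of 0/1 scores, computed from the 2x2 table."""
    pairs = list(zip(first, second))
    a = sum(1 for x, y in pairs if x == 1 and y == 1)  # correct both times
    b = sum(1 for x, y in pairs if x == 1 and y == 0)  # correct, then incorrect
    c = sum(1 for x, y in pairs if x == 0 and y == 1)  # incorrect, then correct
    d = sum(1 for x, y in pairs if x == 0 and y == 0)  # incorrect both times
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical responses of six examinees to the same item on two occasions.
occasion_1 = [1, 1, 0, 0, 1, 0]
occasion_2 = [1, 1, 0, 0, 0, 0]

phi = phi_coefficient(occasion_1, occasion_2)
print(f"phi = {phi:.3f}")
```

Values near 1 indicate that examinees tended to give the same response on both occasions; values near 0 indicate little stability.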

500

The strategies for setting cutoff scores.

What are minimizing the probabilities of misclassification, the minimax procedure, and minimizing the expected cost of misclassification?

500

What is the standard error for the normal distribution of several simple random samples?

The standard deviation of that distribution.