Reliability
Validity
Surprise
Test Development
History and Related Stuff
100

What are the different types of reliability?

test-retest, alternate/parallel forms, split-half or internal reliability, inter-rater reliability.

100

What are the different types of validity? (Give at least 3 - can be a true or not true type of validity)

face, content, criterion, and construct validity

100

Which type of reliability is not always expected to be high?

test-retest reliability

Certain constructs might change.  A personality trait is probably more stable than an emotional state.

100

What is the most common type of item format in psychology tests?

Possibly not explicitly mentioned in class, so make a good guess if you didn't encounter this in your reading.

Likert scales

100

What people, 4000 years ago or so, are credited with conducting some of the first psychological tests?

The Chinese tested people for suitability for holding some civil and leadership positions.

200

What is the Classical Test-Score Theory?

Hint: ___ Score = ___ Score - ____

True Score = Observed Score - Error

200

What is face validity? Is it useful?

"Common sense" validity: the test looks like it measures what it claims to. Generally not useful, because a test that looks valid isn't necessarily valid.

In some special circumstances it might be useful (e.g. effort testing)

200

What is the minimum reliability coefficient for a scale to be considered internally consistent?

alpha > .70

You might look for values above .90 in some high-stakes scenarios. (e.g. guardianship determination)
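The alpha cutoff above refers to Cronbach's alpha. A minimal sketch of how it is computed, using a small made-up 4-item, 5-respondent data set (not from class):

```python
# Cronbach's alpha, a common index of internal consistency:
# alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)
# The response data below is hypothetical, for illustration only.

def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, each a list of item scores."""
    k = len(item_scores[0])  # number of items

    def variance(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(responses), 2))  # well above the .70 cutoff
```

Since the items here rise and fall together across respondents, alpha comes out high; items that disagree with each other would drag it down.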

200

How difficult should items be?

Assuming this is a test with correct answers

Not too easy.  Not too hard.  Items of moderate difficulty avoid restricted ranges where everyone gets the item right or everyone gets it wrong.  That variability helps you tell your B students from your D students.

200

What is a trait and how is it different from a state?

A trait is a "relatively enduring disposition" of an individual (e.g. pessimist vs optimist)

A state tends to vary more than a trait (e.g. Mark was in a depressive state when his goldfish passed away... He was better later on that day after being able to adopt 3 cichlid fish)

300

What are examples of sources of error? Give at least 3 different examples!

test itself, test-taker, environment, how the test was scored

300

How do you establish content validity?

Conduct a literature review on the topic to make sure you are covering all of its facets.  Also, make sure you aren't putting irrelevant garbage in your test!

300

What is the standard error of measurement AND how does it relate to confidence intervals?

It estimates how close observed scores are likely to be to true scores.  It can be multiplied by a z-score value to calculate appropriate confidence intervals around an observed score.
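A minimal sketch of the calculation, using hypothetical numbers (SD = 15, reliability = .90, observed score = 110, chosen to resemble an IQ-style scale; none of these values came from class):

```python
import math

# SEM = SD * sqrt(1 - reliability)
# 95% CI = observed score +/- 1.96 * SEM
# All numbers below are hypothetical, for illustration only.
sd = 15
reliability = 0.90
observed = 110

sem = sd * math.sqrt(1 - reliability)
lower = observed - 1.96 * sem
upper = observed + 1.96 * sem
print(round(sem, 2), round(lower, 1), round(upper, 1))
```

Note how a more reliable test shrinks the SEM, which tightens the confidence interval around the observed score.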

300

What is Item Difficulty?

% of people who get the item correct
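A one-liner version of that definition, using a made-up response pattern (1 = correct, 0 = incorrect):

```python
# Item difficulty p = proportion of test-takers who answered correctly.
# The response pattern below is hypothetical, for illustration only.
answers = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p = sum(answers) / len(answers)
print(p)
```

Despite the name, a higher p means an easier item, since more people got it right.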

300

What is mental age AND which test introduced this concept?

This was present on the Binet-Simon Scale of 1908.  This metric compared a participant's performance to the typical age of people with this performance (e.g. This child is 12, but he solved as many block puzzles as a typical 15 year-old, so his mental age is 15).

400

What does r² = .60 mean in terms of variation in test scores?

Hint: _____ percent of variation in test scores is due to _____

60% of the variation in test scores is due to variation in the other variable (e.g. the criterion measure); the other 40% of the variation is error or explained by something else.

400

What is the difference between a norm-referenced test and a criterion-referenced test?

Norm-referenced = Hire the top 5% test scores. Criterion-referenced = Hire those who can do "specific skill."

400

What are the characteristics of a "good" criterion?

relevant (related to the construct) and valid (evidence must already exist that the criterion itself is valid)

400

What are three elements of a good multiple-choice question?

1. Has grammatically parallel distractors. 

2. no "cute" distractors. 

3. has one correct target. 

4. not too many distractors. 

5. no double-negative wording.

400

What is the difference between objective (structured) and projective tests?  Also, give an example of each.

Objective - forced responses lead to almost no need for interpretation during scoring. One example: Stanford-Binet IQ test

Projective - often open-ended responses require lots of examiner interpretation. One example: Rorschach inkblot test

500

What is the z-score to use for building a 95% confidence interval?

Hint: This is a tough 500pt question, but it was a number we multiplied our standard error of the measure by on Monday.

Z = 1.96

500

What is convergent and divergent evidence/validity?

Convergent = test scores correlate with other measures of the same construct; data from multiple sources all tend to point to the same conclusion.

Divergent = test scores do not correlate with measures they theoretically should not relate to.

500

In factor analysis, what is KMO and what is Bartlett's test?

These were presented together, so you don't have to remember what specific definition matched each term.  However, say the 2 things these tests do together.

Kaiser-Meyer-Olkin (KMO) Test is a measure of how suited your data is for factor analysis - is your sample adequate?  A value above 0.5 suggests it is.

Bartlett's test is a measure of how suitable your data is for factor analysis - will factors be found?  You want a significant result (p < .05) to move forward.


500

Who is credited with the term "mental test"?

US psychologist James McKeen Cattell