Final Exam Review

Key Concepts Exam 1

Key Concepts Exam 2

Key Concepts Exam 3

Miscellaneous

100

What are the behavior, psychological construct, and inference involved in the final exam?

Behavior - respond to multiple-choice questions

Psychological construct - knowledge of psych tests and measurement

Inference - how well (or poorly) an individual understands psych tests and measurement

100

What are three measures of central tendency and three measures of variability?

Central tendency (the middle of the distribution) - mean, median, mode

Variability (how much scores differ) - range, standard deviation, variance

100

When conducting item analysis, testing professionals may examine Cronbach's alpha with item removed. What is that? When should it be examined? What might suggest you should remove an item?

What is that? Cronbach's alpha with item removed tells you the internal consistency reliability of a scale if an item is removed.

When should it be examined? When a scale is homogeneous (i.e., the items measure one underlying construct).

What might suggest you should remove an item? You may want to remove an item if Cronbach's alpha with item removed is high (well above .70). If Cronbach's alpha with an item removed is high, the scale would still have adequate internal consistency reliability without the item. Sometimes removing an item would increase Cronbach's alpha of the scale, which is a red flag for considering removal.
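
The idea can be sketched in code. This is a minimal illustration, assuming the standard Cronbach's alpha formula (k / (k - 1)) × (1 - sum of item variances / variance of total scores). The function names and the response data are hypothetical and were made up for this example.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item; position j in every list
    belongs to the same respondent.
    """
    k = len(items)
    # Total score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(pvariance(scores) for scores in items)
    return (k / (k - 1)) * (1 - sum_item_variances / pvariance(totals))

def alpha_if_item_removed(items):
    """Alpha of the remaining items after dropping each item in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Hypothetical responses: 4 items, 5 respondents. Item 4 is deliberately noisy.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
    [5, 1, 4, 2, 3],
]

full_alpha = cronbach_alpha(items)
dropped = alpha_if_item_removed(items)
for i, a in enumerate(dropped, start=1):
    flag = "  <- alpha rises without this item" if a > full_alpha else ""
    print(f"item {i}: alpha if removed = {a:.2f}{flag}")
```

In this made-up data set, alpha increases when item 4 is dropped, which is the red flag described above.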

100

Reliability is about...

the consistency or precision of test scores

100

Validity is about...

the quality of inferences and decisions based on evidence

200

What are the 3 main characteristics/criteria of a good test?

1. the test representatively samples relevant behaviors

2. standardized testing conditions (e.g., same amount of time, same amount of resources available)

3. rules for scoring

200

What does a correlation tell you?

the direction and magnitude of variable relationships (e.g., If the correlation between conscientiousness and job performance is .3, there is a small, positive relationship between conscientiousness and job performance.)
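
As a rough sketch of how direction and magnitude come out of the numbers, the Pearson correlation can be computed by hand. The function name and the example scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scores for five employees.
conscientiousness = [2, 4, 5, 3, 1]
job_performance = [3, 3, 5, 4, 2]

r = pearson_r(conscientiousness, job_performance)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.2f} (direction: {direction}, magnitude: {abs(r):.2f})")
```

The sign of r gives the direction of the relationship, and its absolute value (between 0 and 1) gives the magnitude.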

200

What 3 questions are relevant to content validity?

Is the test content representative? Does it leave out anything important? Does it measure anything irrelevant?

200

Suppose PAR is hiring assessment psychologists based on their conscientiousness. PAR hires a psychologist who is high in conscientiousness, but he turns out to be a bad employee. He is a...

true positive, false positive, true negative, false negative

false positive
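
The four outcomes can be laid out as a small decision function. This is only a sketch; the function and parameter names are hypothetical.

```python
def classify_outcome(predicted_success: bool, actual_success: bool) -> str:
    """Label a selection decision by comparing the prediction to the outcome.

    "Positive"/"negative" is what the test predicted;
    "true"/"false" is whether that prediction turned out to be right.
    """
    correctness = "true" if predicted_success == actual_success else "false"
    prediction = "positive" if predicted_success else "negative"
    return f"{correctness} {prediction}"

# The hire scored high on conscientiousness (predicted success)
# but turned out to be a bad employee (actual failure).
print(classify_outcome(predicted_success=True, actual_success=False))
```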

200

Someone may score poorly on this exam. Based on their score, they may think they are not very intelligent. That would be a(n)...

individual inference, individual decision, institutional inference, institutional decision, rational inference, irrational inference, rational decision, irrational decision

individual inference and an irrational inference

300

Name at least 4 criteria of a good test/survey item.

1) The item is purposeful and straightforward.

2) The item is unambiguous and uses correct syntax (e.g., avoid jargon, complete sentences, comfortable reading level, no typos).

3) The item is appropriate for the rating scale.

4) The item does not require additional categorical alternatives.

5) The item asks one and only one question (not double-barreled).

6) The item does not require reverse-coding (debated!)

300

Suppose I took two driving tests: a written test and a driving test. Which test did I do better on? (Assume this is population data.)

Written test: Score = 20 Mean = 10 SD = 5

Driving test: Score = 70 Mean = 64 SD = 2

Written test: z-score = 2

Driving test: z-score = 3

I did better on the driving test, because that score is further above its mean in standard deviation units.
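
The comparison above can be reproduced with the z-score formula, z = (score - mean) / SD. The function name below is hypothetical.

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / sd

written = z_score(score=20, mean=10, sd=5)
driving = z_score(score=70, mean=64, sd=2)

print(f"written z = {written}, driving z = {driving}")
print("better test:", "driving" if driving > written else "written")
```

Converting both scores to z-scores puts them on the same scale, so they can be compared even though the raw means and SDs differ.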

300

What is a criterion?

An outcome we expect to be associated with test scores.

For example, your performance on this jeopardy game may predict your final exam score. Your final exam score would be the criterion.

300

Name at least four types of reliability.

test-retest reliability, inter-rater reliability, intra-rater reliability, parallel forms reliability, internal consistency reliability, alternate forms reliability, split-half reliability

300

The quantitative (math) section of the SAT is a(n)...

Test of maximal performance, Behavior observation test, Self-report test, Standardized test, Nonstandardized test, Objective test, Projective test, Achievement test, Aptitude test, Intelligence test, Interest test, Personality test

Test of maximal performance, standardized test, objective test (except the writing portion)

Maybe: achievement, aptitude, or intelligence (debatable!)

At one point, the SAT stood for the Scholastic Aptitude Test. Now SAT does not stand for anything because it is not necessarily a perfect measure of aptitude. Because prep courses demonstrate success, it seems to be at least partially an achievement test.

400

What are the levels of measurement?

nominal, ordinal, interval, ratio

400

What is the reliability formula specified by Classical Test Theory? Explain each component of the formula.

Observed Score = True Score + Error

X = T + E

Observed score = score a person makes on a test

True score = the score an individual would receive if they took a test an infinite number of times and computed their average score (assuming no studying in between, testing fatigue, practice effects, etc.)

Error = random error (e.g., lucky/unlucky guessing)

400

What are convergent evidence of validity and discriminant evidence of validity?

Types of validity evidence based on relations with constructs (i.e., construct validity)

Convergent evidence of validity - test scores are strongly, positively associated with scores on tests measuring similar constructs

Discriminant evidence of validity - test scores are unrelated to scores on tests measuring dissimilar constructs

400

What are the 6 assumptions of a psychological test?

1. The test measures what it claims to measure and predicts what it claims to predict.

2. Test scores will typically remain stable over time (test-retest reliability)

3. Individuals understand items in the same way.

4. Individuals will report accurately.

5. Individuals will report honestly.

6. There will be some error. Observed score = True score + error

400

According to the American Psychological Association (APA), what are the 11 ethical principles relevant to assessments?

Bases for Assessments

Use of Assessments

Informed Consent in Assessments

Release of Test Data

Test Construction

Interpreting Assessment Results

Assessment by Unqualified Persons

Obsolete Tests and Outdated Test Results

Test Scoring and Interpretation Services

Explaining Assessment Results

Maintaining Test Security

500

If test scores fit a normal curve, what percentage of scores fall within one standard deviation of the mean? 2 standard deviations? 3 standard deviations?

~68%

~95%

~99.7%
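
These percentages can be checked numerically. The sketch below assumes a standard normal distribution and uses the identity P(|Z| < k) = erf(k / sqrt(2)); the function name is hypothetical.

```python
from math import erf, sqrt

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k) * 100:.1f}%")
```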

500

Four people take a test. The population scores are 6, 6, 14, and 14.

Calculate/identify the mean, median, mode, standard deviation, and variance.

mean = 10; median = 10; modes = 6 and 14; standard deviation = 4; variance = 16
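
These values can be verified with Python's standard `statistics` module. Because the question says these are population scores, the sketch uses the population versions (`pstdev`, `pvariance`) rather than the sample versions.

```python
from statistics import mean, median, multimode, pstdev, pvariance

scores = [6, 6, 14, 14]

print("mean:", mean(scores))
print("median:", median(scores))
print("modes:", multimode(scores))      # every value tied for most frequent
print("variance:", pvariance(scores))   # mean squared deviation from the mean
print("standard deviation:", pstdev(scores))  # square root of the variance
```

Every score is exactly 4 points from the mean of 10, so the squared deviations are all 16, which gives a variance of 16 and a standard deviation of 4.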

500

What is testing bias? What causes it? And what does it lead to?

Testing bias occurs when a group (or groups) of individuals are less likely to perform well on a test for reasons that have nothing to do with the construct being measured. Bias occurs when a test requires knowledge, skills, or abilities that are irrelevant to the construct being measured (e.g., requiring high-level vocabulary on a math test; including culturally specific knowledge on an intelligence test). Bias results in differential prediction (i.e., the same score predicts outcomes differently for certain groups).

500

Six people take a test. Their scores are 14, 8, 8, 10, 10, and 10. The test developer determines that Cronbach's alpha of the test is 0. One of the 6 test-takers scored an 8. What is the 95% confidence interval around his observed score?

The 95% confidence interval is 4 to 12.
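
The answer above does not show its work, so the following is a sketch under stated assumptions: the six scores are treated as population data, the standard error of measurement is SEM = SD × sqrt(1 - reliability), and the interval is the observed score ± 2 SEM (using 1.96 instead of 2 would give roughly 4.08 to 11.92). The function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import pstdev

def confidence_interval(observed, scores, reliability, multiplier=2):
    """Approximate 95% confidence interval around an observed score."""
    sd = pstdev(scores)                # population SD of the test scores
    sem = sd * sqrt(1 - reliability)   # standard error of measurement
    return observed - multiplier * sem, observed + multiplier * sem

low, high = confidence_interval(8, [14, 8, 8, 10, 10, 10], reliability=0)
print(f"95% CI: {low} to {high}")
```

With a reliability of 0, the SEM equals the full standard deviation of the scores (2), so the interval is as wide as it can be: 8 ± 4.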

500

Suppose you know the answers to all of the questions in this jeopardy game, and you conclude you are a genius. Thoroughly evaluate relevant evidence, and explain the quality of the inference.

Evidence based on test content: The content does not representatively capture content relevant to being a genius (it only captures content relevant to psych tests and measurement). It leaves out a lot of things that are important (e.g., verbal reasoning, spatial intelligence, problem-solving). It measures things that are irrelevant (specific knowledge about tests and measurement).

Evidence based on relations with criteria: There is no evidence to suggest that performance on this jeopardy game is associated with genius outcomes (e.g., winning a Nobel Prize, being deemed an expert in your field).

Evidence based on relations with constructs: There is no evidence to suggest that your performance on this jeopardy game is associated with your performance on an IQ test or another measure of genius-ness.

There is no evidence to suggest you are a genius. (You might be, but your score on this jeopardy game is irrelevant.)