(Validity)
Reliability
Vocabulary
Assessment & Testing
Scoring
100

Predictive Validity

The predictions made by the test are confirmed by later behavior

Example: SAT scores are used to predict a college student's first-year GPA (forecasting a future outcome)


100

A measure of reliability that is useful for interpreting an individual's test score. It helps determine the range within which an individual's true score probably falls (the precision of the test)


Standard Error of Measurement 
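As a rough illustration, the standard error of measurement is commonly computed as SEM = SD x sqrt(1 - reliability). The sketch below assumes that formula; the function name and the sample numbers (SD of 15, reliability of 0.91, observed score of 100) are hypothetical and not taken from any real test.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Return SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test: standard deviation of 15, reliability coefficient of 0.91.
sem = standard_error_of_measurement(15.0, 0.91)

# Band of one SEM on either side of a hypothetical observed score of 100.
observed = 100.0
band = (observed - sem, observed + sem)
print(round(sem, 2), band)
```

A higher reliability coefficient gives a smaller SEM, so the band around an individual's score gets narrower.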

100

The four measures of variability

Range- the highest score minus the lowest score in a set

Inclusive Range- the highest score minus the lowest score, plus 1 (inclusive: everyone's included)

Standard Deviation- the square root of the variance; shows how far scores typically fall from the mean

Variance- the square of the standard deviation
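The four measures can be computed in a few lines. This is a minimal sketch using a made-up set of five scores; it assumes the population formulas (dividing by N), whereas a sample-based calculation would divide by N - 1 instead.

```python
import statistics

# Hypothetical set of test scores, for illustration only.
scores = [60, 70, 75, 80, 90]

score_range = max(scores) - min(scores)   # highest minus lowest
inclusive_range = score_range + 1         # range plus 1
variance = statistics.pvariance(scores)   # population variance
sd = statistics.pstdev(scores)            # population standard deviation

print(score_range, inclusive_range, variance, sd)
```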

100

This type of test measures the effects of learning or a set of experiences (What someone knows) 

Example: Standardized testing 

Achievement Test

100

Percentile 

A value below which a specified percentage of cases fall

Example: If a student scores in the 90th percentile, their score was higher than 90% of the other students who took the test
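One simple way to see this is to count how many scores fall below a given score. The sketch below assumes the "percent of cases strictly below" definition; other conventions exist, and the helper name and score list are hypothetical.

```python
def percent_below(scores, score):
    """Return the percentage of scores strictly below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)


# Hypothetical class of ten test scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# Eight of the ten scores are below 95.
rank = percent_below(scores, 95)
print(rank)
```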

200

Concurrent Validity

The results of the test are compared with the results of other tests (could be past or present)

Example: Comparing students' past test scores in a class to the current class test scores

200

A formula used to estimate how reliable the test would be if it had not been cut in two. Reducing the length of a test in turn reduces the measured reliability, and this formula accounts for that

Spearman-Brown formula 
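A minimal sketch of the correction, assuming the usual form r_full = (k * r) / (1 + (k - 1) * r), where k is how many times longer the full test is than the part that was correlated (k = 2 for split halves). The function name and the half-test correlation of 0.60 are hypothetical.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Estimate reliability of a test lengthened by length_factor."""
    return (length_factor * r_part) / (1.0 + (length_factor - 1.0) * r_part)


# Hypothetical correlation between the two halves of a test.
r_half = 0.60
r_full = spearman_brown(r_half)
print(round(r_full, 2))
```

The corrected value is higher than the half-test correlation, which reflects the point that a shorter test understates reliability.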

200

Standard Deviation

example: two classes of students taking a test. 

Class A: The average score is 75, and the standard deviation is 5. This means most students scored between 70 and 80 (75 +/- 5). 

Class B: The average score is also 75, but the standard deviation is 20. This means the scores are more spread out, with some students scoring much higher or lower than 75
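To put a number on "most students", the sketch below assumes both classes' scores follow a normal distribution (an assumption, not something stated in the example) and computes the share of scores expected between 70 and 80 in each class.

```python
from statistics import NormalDist

# Assumed normal models for the two classes in the example.
class_a = NormalDist(mu=75, sigma=5)
class_b = NormalDist(mu=75, sigma=20)


def share_between(dist, low, high):
    """Return the proportion of the distribution between low and high."""
    return dist.cdf(high) - dist.cdf(low)


share_a = share_between(class_a, 70, 80)  # within one SD of the mean
share_b = share_between(class_b, 70, 80)  # within a quarter SD of the mean
print(round(share_a, 3), round(share_b, 3))
```

Under that assumption, roughly two thirds of Class A falls between 70 and 80, compared with about one fifth of Class B.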

200

This type of test is often referred to as an ability test. It measures the effects of general learning and is used to predict future performance

Example: Career Ability Placement Survey 

Aptitude test

200

The function of a normal curve 

The normal curve divides the scores into six parts of equal width (one standard deviation each), three above the mean and three below the mean. It is a symmetrical distribution of scores with an equal number of scores above and below the midpoint
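The six parts are equal in width but not in the share of scores they hold. As a sketch, the standard normal curve can be cut at each standard deviation from -3 to +3 and the area of each part computed; the variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Boundaries of the six parts, in standard deviation units.
edges = [-3, -2, -1, 0, 1, 2, 3]

# Proportion of scores that falls in each part.
areas = [z.cdf(hi) - z.cdf(lo) for lo, hi in zip(edges, edges[1:])]

for lo, hi, area in zip(edges, edges[1:], areas):
    print(f"{lo:+d} to {hi:+d} SD: {area:.2%}")
```

The two parts next to the mean each hold about 34% of scores, the next two about 14% each, and the outer two about 2% each, mirrored on both sides.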

300

Construct Validity

A test has construct validity to the extent that it measures some hypothetical construct.

examples: 

300

This type of reliability uses the same instrument on both occasions; you are testing the same group twice

Stability 

300

Correlation Coefficient 

A statistical index that shows the relationship between two sets of numbers (ranges from -1.00 to +1.00). -1.00 is a perfect negative correlation and +1.00 is a perfect positive correlation. It does not tell you anything about cause and effect
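A minimal sketch of one common correlation coefficient, Pearson's r, written out by hand. The data sets are made up so that one pair rises together perfectly and the other moves in exactly opposite directions.

```python
import math


def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical data for illustration.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # goes up as hours go up
falling = [10, 8, 6, 4, 2]   # goes down as hours go up

r_pos = pearson_r(hours, rising)
r_neg = pearson_r(hours, falling)
print(round(r_pos, 2), round(r_neg, 2))
```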

300

The three types of "referencing" in assessment 

Norm Referenced- comparing individuals to others (CPCE, NCE); how you compare to others is more important than what you know

Criterion Referenced- comparing an individual's performance to a predetermined criterion (cut-off score)

Example: Driver's exam, where there's an established cut-off score

Ipsatively interpreted- compares the results on the test within the individual

Example: comparing how an individual did on their first exam and second exam

300

Stanine 

"Standard Nine": converts a distribution of scores into nine points (1 to 9, with 5 being in the middle). Used to categorize student performance and provide a normalized way to compare scores
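One common approximation, assumed here rather than taken from the card, converts a z-score to a stanine by doubling it, adding 5, rounding to the nearest whole number, and clipping the result to the 1 to 9 range. The function name is hypothetical.

```python
import math


def stanine_from_z(z: float) -> int:
    """Approximate a stanine from a z-score: round(2z + 5), clipped to 1..9."""
    rounded = math.floor(2.0 * z + 5.5)  # round half up
    return max(1, min(9, rounded))


# An average score lands in the middle stanine.
print(stanine_from_z(0.0))

# Very low and very high scores are clipped to the end points.
print(stanine_from_z(-3.0), stanine_from_z(3.0))
```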

400

Content Validity

The extent to which a measurement instrument (like a test or questionnaire) accurately and adequately measures the specific content or construct it is designed to assess.

Example: When creating a test to measure someone's knowledge of basic arithmetic operations, content validity would ensure that the test questions cover all the essential topics, such as addition, subtraction, multiplication, and division, and do not include irrelevant mathematical topics


400

Alternate forms of the same test are administered to the same group, and the correlation between the two is calculated (the forms test the same skills, just with different questions)

equivalence 

400

The Coefficient of Determination / Coefficient of Nondetermination

The coefficient of determination represents the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. Conversely, the coefficient of nondetermination represents the proportion of variance that is not explained by the model (it represents the error variance)
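For the simple case of one predictor, the coefficient of determination is the square of the correlation coefficient, and the coefficient of nondetermination is whatever is left over. The sketch below assumes that simple case; the function names and the correlation of 0.80 are hypothetical.

```python
def coefficient_of_determination(r: float) -> float:
    """Proportion of variance explained: r squared."""
    return r * r


def coefficient_of_nondetermination(r: float) -> float:
    """Proportion of variance not explained: 1 minus r squared."""
    return 1.0 - r * r


# Hypothetical correlation between two measures.
r = 0.80
explained = coefficient_of_determination(r)
unexplained = coefficient_of_nondetermination(r)
print(round(explained, 2), round(unexplained, 2))
```

The two values always add up to 1, so a correlation of 0.80 leaves about a third of the variance unexplained.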

400

Tests are typically considered power based or speed based:

Power based- no time limits, or very generous ones

Speed based- timed; emphasis is placed on speed & accuracy

Examples: 

400

Z Score 

Indicates how many standard deviations a data point is away from the mean of the distribution

example: if a test score of 87 has a z-score of 1.75 when the mean is 80 and the standard deviation is 4, it means the score is 1.75 standard deviations above the mean
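The arithmetic behind the example is z = (score - mean) / standard deviation. A minimal sketch using the numbers from the example; the function name is illustrative.

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Return how many standard deviations a score is from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (score - mean) / sd


# Numbers from the example: score 87, mean 80, standard deviation 4.
z = z_score(87, 80, 4)
print(z)
```

A score below the mean gives a negative z-score, and a score equal to the mean gives 0.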

500

Face validity 

refers to the degree to which a test or measurement appears, on the surface, to measure what it is intended to measure

example: if you create a survey to measure customer satisfaction and it includes questions about product usage and service experiences, people might say it has high face validity because the questions seem to relate directly to what you're trying to measure.

500

"Split-half method": you split the test into two halves and correlate the scores on the two halves

Internal Consistency 

500

Semantic Differential 

A scale that asks individuals to report where they fall between two opposite (dichotomous) poles.

Example: A scale with one end that says "very good" and another that says "very bad"

500

These tests present a relatively unstructured task or stimulus. The person projects thought processes, needs, and anxieties.

Example: The Rorschach ink blot test

Projective test 

500

T- Score 

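A standardized score like a Z-Score. In a t-test, it is used when the population standard deviation is unknown and is estimated from the sample standard deviation. In test scoring, a T-Score usually rescales the Z-Score to a mean of 50 and a standard deviation of 10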
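In assessment scoring, a T-score is commonly defined as a z-score rescaled to a mean of 50 and a standard deviation of 10, that is, T = 50 + 10z. The sketch below assumes that scoring convention (not the t statistic from hypothesis testing); the function name is hypothetical.

```python
def t_score_from_z(z: float) -> float:
    """Rescale a z-score to a T-score with mean 50 and SD 10."""
    return 50.0 + 10.0 * z


# A score 1.75 standard deviations above the mean.
t = t_score_from_z(1.75)
print(t)
```

Rescaling this way avoids negative numbers and decimals for most scores, which is one reason the convention is popular on score reports.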