Statistical Knowledge for Psychological Testing Interpretation

Measures of Central Tendency & Variability

Psychometric Properties

Slightly Advanced Statistics

Surprise

Psychometric Properties Part 2

100

The value that appears most often in a data set

Mode

100

Type of criterion-related validity that refers to how accurately a person's current test score can be used to estimate what the criterion score will be at a later date. Prediction about a person's performance can be made in another place or with another task.

Predictive Validity

100

A bell-shaped curve which shows the probability distribution of a continuous random variable. The total area under the ___ logically represents the sum of all probabilities for a random variable.

Normal Distribution/Curve

100

Type of test that measures a person's performance in comparison to the performance of a larger group; usually a statistically selected group of test takers, typically of the same age or grade level, who have already taken the test

Norm-referenced tests

100

Type of test reliability where results are consistent when more than one person rates the test.

Inter-rater Reliability

200

The difference between the lowest value and the highest value

Range

200

The degree to which an instrument measures the constructs it intends to measure

Validity

200

A type of calculated standardized score which helps you compare scores easily. A score of 50 represents the mean. A difference of 10 from the mean indicates a difference of one standard deviation.

T-Scores
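
As a rough illustration of the clue above, a T-score can be sketched as a linear rescaling of a z-score to a mean of 50 and a standard deviation of 10. The function names `z_to_t` and `raw_to_t` are hypothetical, and the normative mean and standard deviation are assumed to be known in advance.

```python
def z_to_t(z: float) -> float:
    """Rescale a z-score (mean 0, SD 1) to a T-score (mean 50, SD 10)."""
    return 50 + 10 * z


def raw_to_t(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw score to a T-score, given assumed normative mean and SD."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return z_to_t((raw - norm_mean) / norm_sd)


# A score one standard deviation above the normative mean maps to T = 60.
t_example = raw_to_t(115, norm_mean=100, norm_sd=15)
```

Under these assumptions, every 10 T-score points corresponds to one standard deviation, which is what makes scores from different tests easy to compare.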

200

A test that measures a person's development of skills or performance in terms of absolute levels of mastery. A criterion level to which a person is compared.

Criterion-Referenced Tests

200

Characteristic of a test or measure having to do with replicability.

Reliability

300

The middle number in a list of numbers ordered from lowest to highest

Median

300

The extent to which a test is subjectively viewed as covering the concept it intends to measure; the transparency or relevance of a test as it appears to test participants

Face Validity

300

A range of values defined so that there is a specified probability (commonly 90% or 95%) that the true value lies within it

Confidence Intervals

300

A standard score that has a mean of "0" and a standard deviation of 1.

Z-Score
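
A minimal sketch of the z-score described above, assuming the mean and standard deviation of the reference group are already known. The function name `z_score` is illustrative only.

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Express a raw score as the number of standard deviations from the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (raw - mean) / sd


# A raw score of 130 on a scale with mean 100 and SD 15 sits 2 SDs above the mean.
z_example = z_score(130, mean=100, sd=15)
```

A raw score equal to the mean gives a z-score of 0, and scores below the mean give negative values.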

300

Measure of reliability gained by comparing scores from one half of the test with scores from the other half. Ex: Odd items compared with even items.

Split-Half Reliability
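
One way to sketch the odd/even comparison from the clue above is to total each person's odd-numbered and even-numbered items and correlate the two sets of totals. The Spearman-Brown step-up at the end is an addition not mentioned in the clue; it is a common adjustment for the fact that each half is only half as long as the full test. The function names and the data layout (one list of item scores per person) are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores per person (assumed layout)."""
    odd_totals = [sum(person[0::2]) for person in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(person[1::2]) for person in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))
```

The odd/even split is only one of many possible splits, so different splits of the same test can yield somewhat different estimates.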

400

The total of all the values, divided by the number of values

Mean
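
The definitions in this category (mode, range, median, mean) can be checked against a small made-up data set using Python's standard `statistics` module. The scores below are invented for illustration.

```python
import statistics

scores = [85, 90, 90, 100, 110]  # hypothetical test scores

mode_value = statistics.mode(scores)      # value that appears most often
range_value = max(scores) - min(scores)   # highest value minus lowest value
median_value = statistics.median(scores)  # middle value of the ordered list
mean_value = sum(scores) / len(scores)    # total divided by number of values
```

With an even number of values, `statistics.median` returns the average of the two middle values rather than a single middle number.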

400

The degree to which scores on a measurement are related to scores on other measurements that have already been established as valid

Concurrent Validity

400

The percentage of scores in a frequency distribution that are equal to or lower than a given score. For example, a test score that is greater than 75% of the scores of people taking the test is said to be at the 75th ___ .

Percentile
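
A minimal sketch of the "equal to or lower than" definition above. Other conventions exist (for example, counting only scores strictly below, or counting half of the ties), so treat this as one possible reading. The function name `percentile_rank` is hypothetical.

```python
def percentile_rank(score, scores):
    """Percentage of scores in `scores` that are equal to or lower than `score`."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


# Three of the four scores are at or below 80, so 80 is at the 75th percentile.
rank_example = percentile_rank(80, [60, 70, 80, 90])
```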

400

Derived score that is standardized by transforming the scores in a distribution so that the mean and standard deviation take predetermined values. Transforms raw scores into sets of scores that have the same mean and standard deviation.

Standardized Score

400

Determined by administering a test (Test A) to a group and then administering a parallel form of the test (Test B) to the same group.

Parallel-Forms Reliability

500

A measure of the amount of variation or dispersion of a set of values. On the low side, it indicates that the values tend to be close to the mean of the set, while on the high side, it indicates that the values are spread out over a wider range.

Standard Deviation
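
To make the "close to the mean" versus "spread out" contrast concrete, the sketch below compares two invented data sets that share the same mean. It uses `statistics.pstdev`, the population standard deviation; `statistics.stdev` would give the sample version, which divides by n - 1 instead of n.

```python
import statistics

tight = [98, 99, 100, 101, 102]  # hypothetical scores clustered near the mean
wide = [70, 85, 100, 115, 130]   # hypothetical scores spread over a wider range

sd_tight = statistics.pstdev(tight)  # small: values sit close to the mean
sd_wide = statistics.pstdev(wide)    # large: values are widely dispersed
```

Both sets have a mean of 100, so the difference between the two results reflects spread alone.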

500

The extent to which the supposed underlying structure of a scale is recoverable in a set of test scores; determined via factor analysis or structural equation modeling

Factorial Validity

500

A measure of how much measured test scores are spread around a “true” score; directly related to a test’s reliability: The larger the ___, the lower the test’s reliability; Used to calculate confidence intervals

Standard Error of Measurement (SEm)
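
The relationship between reliability, SEm, and confidence intervals in the clue above can be sketched with the commonly used formula SEm = SD x sqrt(1 - reliability). The function names are hypothetical, and the default multiplier of 1.96 assumes a 95% interval under a normal distribution of measurement error.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEm = SD * sqrt(1 - reliability); a larger SEm means lower reliability."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


def confidence_interval(observed: float, sd: float, reliability: float,
                        z: float = 1.96) -> tuple:
    """Band around an observed score; z = 1.96 assumes a 95% interval."""
    margin = z * standard_error_of_measurement(sd, reliability)
    return observed - margin, observed + margin


# With SD = 15 and reliability = 0.91, the SEm is about 4.5 points.
low, high = confidence_interval(100, sd=15, reliability=0.91)
```

As reliability approaches 1, the SEm shrinks toward 0 and the interval around the observed score narrows.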

500

Original data that has not been transformed; Without knowing how many questions were on the test or the point value of each question, ___ are impossible to decipher in terms of percentile, grade, or measured progress.

Raw Scores

500

Extent to which a test can be generalized to different times. Expectation that the same score would be attained by a single subject when that subject is given the same test at 2 different times.

Test-Retest Reliability