Test Construction & Interpretation
Reliability
Validity
Bias
Different goals in testing
100

Why do we convert raw scores to standard scores?

A) To make the scores more difficult to interpret.

B) To standardize scores to a specific average and spread.

C) To complicate the process of score comparison.

D) To reduce the accuracy of score interpretation.


B) To standardize scores to a specific average and spread.
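
As an illustration of that answer, here is a minimal sketch of the conversion. The function name `to_standard_scores` and the target scale (mean 100, SD 15) are assumptions chosen for the example, not something the card specifies.

```python
from statistics import mean, pstdev

def to_standard_scores(raw_scores, target_mean=100.0, target_sd=15.0):
    """Convert raw scores to standard scores with a chosen mean and SD.

    The default scale (mean 100, SD 15) is an assumed example scale.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD of this group of scores
    if sd == 0:
        raise ValueError("all raw scores are identical; SD is zero")
    # Step 1: z-score (distance from the group mean in SD units).
    # Step 2: rescale the z-score to the target mean and spread.
    return [target_mean + target_sd * (x - m) / sd for x in raw_scores]

raw = [10, 20, 30, 40, 50]
standard = to_standard_scores(raw)
```

After the conversion, a score can be read directly as "how far above or below average", whatever the raw scale of the original test was.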

100

The correlation coefficient between observed results and true results is also known as the:

A. Standard deviation

B. Reliability coefficient

C. Variance

D. Mean

Correct Answer: B. Reliability coefficient

100

What is the primary purpose of validating a test?

A) To ensure that the test is difficult enough to challenge all participants.

B) To confirm that the test accurately measures what it is intended to measure.

C) To determine the cost-effectiveness of administering the test.

D) To establish a timeline for when the test should be revised.

Correct Answer: B) To confirm that the test accurately measures what it is intended to measure.

100

What is the difference between bias and fairness?

  1. Fairness is an ethical side of testing; bias is a feature of the test.

  2. Fairness is an ethical side of testing; bias is a moral and legal concept.

  3. Fairness is a feature of the test; bias is a moral and legal concept.

  1. Fairness is an ethical side of testing; bias is a feature of the test.

100

What is the primary goal of compensatory education programs? 

A) To stabilize intelligence quotients (IQs). 

B) To lower intelligence quotients (IQs). 

C) To equip children for further learning. 

D) To maintain the status quo of intelligence quotients (IQs). 

C) To equip children for further learning.

200

Which guideline should you follow to ensure clarity in exam questions? 

A) Use complex sentence structures to challenge the test-takers. 

B) Include negative words to test comprehension of double negatives. 

C) Avoid using "sometimes," "rarely," "never," or "always" to prevent ambiguity.

D) Use fancy words and phrases to enhance the difficulty level. 

C) Avoid using "sometimes," "rarely," "never," or "always" to prevent ambiguity.

200

The reliability coefficient is an index of:

A. The average of the observed results

B. The size of the relationship between the observed results and the true results

C. The range of the observed results

D. The difference between the highest and lowest observed results

Correct Answer: B. The size of the relationship between the observed results and the true results

200

Why is construct validity referred to as the "umbrella validity"?

A) Because it ensures that the test questions are comprehensive and cover all possible topics.

B) Because it covers and encompasses all other types of validity.

C) Because it protects the test from being influenced by external factors.

D) Because it provides a clear structure for organizing the test items.

Correct Answer: B) Because it covers and encompasses all other types of validity.


200

Which of the following best describes "bias" in the context of psychological testing?

  • Errors that occur randomly and affect test scores in unpredictable ways.

  • Differences in test scores that are due to the natural variability among individuals.

  • Systematic errors that consistently influence test results in favor of certain groups over others.

  • Variations in test performance due to temporary conditions like fatigue or anxiety.

Systematic errors that consistently influence test results in favor of certain groups over others.


200

In the context of psychological constructs, what does "overt behavior" primarily refer to? 

A) Unconscious thoughts and feelings. 

B) Observable actions or their products. 

C) Physical characteristics. 

D) Genetic predispositions. 

B) Observable actions or their products.

300

What distinguishes local norms from nationwide norms in assessments? 

A) Local norms are based on data from diverse populations across a country. 

B) Local norms focus on comparisons within a specific geographic area or community. 

C) Nationwide norms consider regional variations in education and socioeconomic status. 

D) Nationwide norms are only applicable to urban areas, while local norms apply to rural areas. 

B) Local norms focus on comparisons within a specific geographic area or community.

300

When comparing two versions of a survey for reliability testing, both versions must be:

A. Completely different to test adaptability

B. Identical in every aspect

C. Similar enough to ensure they measure the same constructs

D. Administered to the same group of test-takers

Correct Answer: C. Similar enough to ensure they measure the same constructs

300

What does face validity primarily assess in a test?

A) Whether the test items are easy to understand.

B) Whether the test measures what it claims to measure based on its appearance.

C) Whether the test has been validated across different populations.

D) Whether the test results are consistent over time.

Correct Answer: B) Whether the test measures what it claims to measure based on its appearance.


300

Which source of bias is at work when test-takers score lower because the test is not in their native language, which limits their ability to understand the instructions and content?
 

Cultural source 

Content source 

Language source 

Administration source 

Language source

300

According to the insights from standardized intelligence tests and cumulative learning experiences, what does the concept of readiness in learning emphasize? 

A) The genetic basis of intelligence. 

B) The influence of external factors on cognitive growth. 

C) The idea that prior learning enhances future learning capabilities. 

D) The consistency of intellectual abilities as individuals age. 

C) The idea that prior learning enhances future learning capabilities.

400

In the context of a multiple-choice test, what does the "random guessing model" imply? 

A) Students select answers based on their prior knowledge. 

B) Students guess answers without any strategy or knowledge. 

C) Students collaborate to determine the correct answers. 

D) Students rely on hints provided in the questions. 

B) Students guess answers without any strategy or knowledge.
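
Under that model, the score a student can expect from chance alone follows from simple arithmetic. The sketch below is an illustration only: both function names are hypothetical, and the correction-for-guessing formula (rights minus wrongs divided by options minus one) is a commonly taught adjustment that this card does not itself mention.

```python
def expected_chance_score(n_items, n_options):
    """Expected number correct if every answer is a blind guess."""
    return n_items / n_options

def corrected_score(right, wrong, n_options):
    """Correction for guessing: right - wrong / (options - 1).

    Omitted items count as neither right nor wrong.
    """
    return right - wrong / (n_options - 1)

# A 40-item test with 4 options per item, answered purely by guessing:
chance = expected_chance_score(40, 4)
# A student who got 10 right and 30 wrong on that test:
adjusted = corrected_score(right=10, wrong=30, n_options=4)
```

A student who scores exactly at chance level ends up with a corrected score of zero, which is the point of the adjustment.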

400

The most common way to divide a test for the split-half method is by:

A. Splitting the test in half randomly

B. Grouping questions by topic

C. Splitting it into odd and even questions

D. Dividing it based on test-taker performance

Correct Answer: C. Splitting it into odd and even questions
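
The odd-even split can be sketched as follows. The response matrix is made-up data (1 = correct, 0 = incorrect), and the helper names `pearson_r` and `odd_even_halves` are hypothetical; the half-test correlation obtained this way is usually stepped up afterwards with the Spearman-Brown formula.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def odd_even_halves(responses):
    """Return each person's total on odd-numbered and even-numbered items."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return odd, even

# Made-up data: 5 test-takers, 6 items each.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
odd_totals, even_totals = odd_even_halves(responses)
half_test_r = pearson_r(odd_totals, even_totals)
```

Splitting by odd and even items, rather than first half versus second half, keeps fatigue and increasing item difficulty from landing in only one of the halves.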

400

What does concurrent validity assess in a test?

A) Whether the test results remain consistent over repeated administrations.

B) How well the test predicts future outcomes or behaviors.

C) How well the test matches up with another measure taken at the same time.

D) Whether the test accurately measures what it claims to measure based on its appearance.

Correct Answer: C) How well the test matches up with another measure taken at the same time.


400

Which scenario best exemplifies a rating error based on the definition provided?

A) A teacher mistakenly assigns a higher grade to a student's test paper than deserved.

B) A movie critic rates a film 5 out of 5 stars, matching the general opinion of viewers.

C) A chef receives a lower rating for a dish that most diners find exceptionally delicious.

D) An athlete's performance is rated consistently across different judges' assessments.


Correct Answer: C) A chef receives a lower rating for a dish that most diners find exceptionally delicious.

400

Which method of assessing personality involves obtaining information from individuals other than the person being assessed? 

A) Self-reporting through answering questions or filling out forms. 

B) Observational assessment by psychologists. 

C) Behavioral analysis through structured tasks. 

D) Obtaining reports from parents or teachers. 

D) Obtaining reports from parents or teachers.

500

[Normalization]
Which type of transformation is preferred when the data needs to maintain its original relative spacing and relationships? 

A) Linear transformation 

B) Nonlinear transformation 

C) Both transformations equally 

D) No transformation needed 

A) Linear transformation
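
A small sketch of why the linear option keeps relative spacing intact. The scores are made up, the helper names are hypothetical, and a simple rank conversion stands in here for a nonlinear transformation.

```python
def linear_transform(scores, slope, intercept):
    """Apply y = slope * x + intercept to every score."""
    return [slope * x + intercept for x in scores]

def gap_ratios(scores):
    """Size of each gap between neighbouring scores, relative to the first gap."""
    gaps = [b - a for a, b in zip(scores, scores[1:])]
    return [g / gaps[0] for g in gaps]

raw = [2, 4, 8, 16]

# Linear: every gap is multiplied by the same slope, so ratios are unchanged.
linear = linear_transform(raw, slope=10, intercept=50)

# Nonlinear stand-in: ranks flatten all gaps to 1, losing the original spacing.
ranks = [sorted(raw).index(x) + 1 for x in raw]
```

Comparing `gap_ratios(raw)` with `gap_ratios(linear)` and `gap_ratios(ranks)` shows the difference: the linear version reproduces the original ratios, the rank version does not.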

500

The relationship between test length and measurement error suggests that:

A. Shorter tests are always more reliable

B. Longer tests can help reduce measurement error

C. The length of a test does not impact its reliability

D. Longer tests increase measurement error

Correct Answer: B. Longer tests can help reduce measurement error
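
One common way to put numbers on this is the Spearman-Brown prophecy formula, which the card does not name but which is the standard tool for this question. The sketch below assumes the added items are comparable in quality to the existing ones; the function name is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length.

    length_factor = new length / old length (2 means doubling the test).
    Assumes the added items behave like the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

current = 0.60
doubled = spearman_brown(current, 2)    # test made twice as long
halved = spearman_brown(current, 0.5)   # test cut in half
```

Under these assumptions, doubling the test raises the predicted reliability above the current value, and halving it lowers the prediction.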

500

Which step is essential in establishing construct validity for a test measuring soccer skills?

A) Conducting multiple trials to ensure consistent results.

B) Defining clearly what is meant by "soccer skills."

C) Comparing test scores with those of other sports.

D) Using advanced statistical techniques to analyze test data.

Correct Answer: B) Defining clearly what is meant by "soccer skills."


500

Which of the following describes a "leniency error" in the context of rating scales? 

A general reluctance to give ratings at either the positive or the negative extreme, causing ratings to cluster in the middle. 

An error that occurs due to the rater's tendency to be overly harsh in scoring, marking, or grading. 

An error in rating that arises from the tendency on the part of the rater to be overly generous in scoring, marking, or grading.

A judgment resulting from the intentional or unintentional misuse of a rating scale. 

An error in rating that arises from the tendency on the part of the rater to be overly generous in scoring, marking, or grading.

500

What does the projective hypothesis suggest about how individuals interpret unstructured stimuli? 

A) Interpretations are based on cultural factors. 

B) Interpretations are influenced by external stimuli. 

C) Interpretations are influenced by a person's unique psychological characteristics.

D) Interpretations are consistent across different individuals. 

C) Interpretations are influenced by a person's unique psychological characteristics.
