A Matter of Measurement
Validity Villains
Functional Relation Station
The Right Tool for the Job
Quality Control
100

A behavior analyst records the number of times a target behavior occurs per hour. 

What is Rate?

100

This villain creeps in slowly over long baselines or interventions, making behavior improve (or decline) simply because the learner is growing, developing, or getting tired, not because of the intervention.

What is maturation?

100

This train makes four scheduled stops (Baseline, Intervention, Baseline, Intervention) allowing riders to see whether behavior reliably changes every time the IV comes back into the station.

What is a withdrawal (or ABAB) design?

100

This design allows you to evaluate an intervention without ever removing it, by staggering start times across participants, settings, or behaviors.

What is a multiple baseline design?

100

This type of validity asks whether the independent variable (not some sneaky confound) is responsible for changes in the dependent variable.

What is internal validity? 

200

A BCBA starts a stopwatch when she gives an instruction and stops it when the student begins the assigned task to determine how quickly the student responds.  

What is Latency?

200

This villain strikes when the order of conditions influences behavior, so responding in Condition B is affected simply because it followed Condition A.

What are Sequence Effects (or Carry Over Effects)?

200

This express line switches tracks rapidly, comparing two or more interventions delivered in quick succession to see which one gets riders to improved behavior fastest.

What is an alternating treatments design?

200

A researcher wants to evaluate the effects of an intervention for three behaviors but can only collect probes occasionally. This design allows experimental control without continuous baseline data.

What is a Multiple Probe Design?

200

This measure checks whether two observers recorded behavior consistently and independently.

What is interobserver agreement? 
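One common way to quantify this agreement is interval-by-interval IOA: the number of intervals on which the two observers agree, divided by the total number of intervals, times 100. The sketch below assumes each observer's record is a list of True/False scores, one per interval. The function name and the sample records are hypothetical, made up for illustration.

```python
def interval_ioa(observer_a, observer_b):
    """Interval-by-interval IOA: percent of intervals on which both records match."""
    if len(observer_a) != len(observer_b):
        raise ValueError("Both observers must score the same number of intervals")
    if not observer_a:
        raise ValueError("At least one interval is required")
    # Count intervals where the two observers scored the same thing
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * agreements / len(observer_a)


# Hypothetical records: True = behavior scored as occurring in that interval
obs_a = [True, True, False, True, False, False, True, True, False, True]
obs_b = [True, False, False, True, False, False, True, True, True, True]

print(interval_ioa(obs_a, obs_b))  # 8 of 10 intervals agree -> 80.0
```

Other IOA formulas (total count, exact count per interval, and so on) follow the same pattern of agreements over opportunities.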

300

A BCBA measures how long a student remains seated during independent work to evaluate whether a new reinforcement system increases sustained engagement.

What is Duration? 

300

This villain strikes when multiple interventions lurk too close together, tangling their effects so badly that you can’t tell which treatment is causing the behavior change.

What is Multiple Treatment Interference?

300

This reliable engine strengthens your evidence by repeatedly demonstrating the effect, showing the change wasn’t just a one-time fluke. Usually, you need at least 3 of these to demonstrate a functional relation.

What are replications? 

300

A clinician wants to know which of two communication systems a child prefers when both options are available simultaneously during every session.

What is a Simultaneous Treatments Design?

300

This type of validity looks beyond the study to ask whether results generalize to new people, settings, materials, or times.

What is external validity? 

400

During recess, an observer marks whether a child is engaged in play at the end of each 10-second interval to estimate the proportion of time the child is active. This is an example of this type of measurement system.

What is momentary time sampling?
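The estimate in this clue is simple arithmetic: the number of sampled moments at which the behavior was occurring, divided by the total number of sampled moments, times 100. A minimal sketch follows, assuming one True/False score per 10-second checkpoint. The function name and the two minutes of sample data are hypothetical.

```python
def momentary_time_sample_percent(samples):
    """Percent of sampled moments at which the behavior was occurring."""
    if not samples:
        raise ValueError("At least one sample is required")
    # True counts as 1 and False as 0, so sum() gives the number of "yes" moments
    return 100.0 * sum(samples) / len(samples)


# Hypothetical 2-minute observation: one check at the end of each 10-second interval
engaged = [True, True, False, True, True, False,
           True, True, True, False, True, True]

print(momentary_time_sample_percent(engaged))  # 9 of 12 checks -> 75.0
```

The result is an estimate of time engaged, not a direct duration measure, because behavior between checkpoints is never observed.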

400

This villain arises when the person collecting data expects a particular outcome and the data end up reflecting what they hope to see rather than what actually happened.  

What is expectancy (or observer) bias? 

400

This train doesn’t jump between stations—it gradually raises or lowers performance goals, showing a functional relation when behavior climbs (or descends) step-by-step with each new target.

What is a Changing Criterion Design?

400

If the target behavior is not reversible, this type of design is not an option. 

What is an ABAB (or withdrawal) design? 

400

Parents, teachers, or participants rate whether an intervention is acceptable, practical, and likely to be used.

What is social validity? 

500

A supervisor counts how many steps of a 12-step handwashing routine a client completes independently during each trial.

What is a task analysis with percent-correct measurement?
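For the 12-step routine in this clue, percent correct is the number of steps completed independently divided by 12, times 100. The sketch below shows that calculation across several trials. The function name and the per-trial counts are hypothetical, and only the 12-step total comes from the clue.

```python
TOTAL_STEPS = 12  # the handwashing routine in the clue has 12 steps


def percent_independent(steps_independent, total_steps=TOTAL_STEPS):
    """Percent of task-analysis steps completed independently on one trial."""
    if not 0 <= steps_independent <= total_steps:
        raise ValueError("Independent steps must be between 0 and the total")
    return 100.0 * steps_independent / total_steps


# Hypothetical counts of independent steps on three successive trials
trials = [6, 9, 12]
percents = [percent_independent(n) for n in trials]

print(percents)  # [50.0, 75.0, 100.0]
```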

500

This sneaky villain slowly pulls observers off course over time, making data collection increasingly inaccurate as they stray from the clear operational definition they started with.

What is observer drift?

500

This train goes further than expected and involves testing whether behavior improvements continue across new settings, new people, or new materials that weren't involved in the initial intervention. 

What is generalization? 

500

This type of design is typically used to display the results of a Functional Analysis. 

What is a Multielement Design?

500

This term refers to providing a clear, observable, measurable description of a behavior so anyone can score it accurately.

What is an operational definition?