That's a Red Flag
Methods Matchmaker
Cause or Coincidence
Measure Twice, Analyze Once
Critique This: Mini-Abstract Edition
100

A clinic sends a survey using only a patient portal, which not all patients routinely use. This limitation affects the sample that is reached.

What is coverage error?

100

A researcher quietly observes how often nurses sanitize their hands during normal work hours. This research design collects data without intervention.

What is observational research?

100

A team conducts interviews about why some patients recover more quickly than others and concludes that certain behaviors cause better outcomes, even though they only collected perceptions and experiences. However, as we know, this type of research cannot establish causation.

What is qualitative research?

100

A patient survey includes the item, ‘How frustrated are you with the wait times?’ Respondents are asked to choose how strongly they agree with that statement. This type of question can influence how people answer.

What is a leading question?

100

A study compares average post-treatment pain scores between two therapy groups. To do this, the authors run a simple linear regression with group coded as 0/1 and conclude there is no difference. Students decide whether the conclusion is trustworthy and why.

What is not trustworthy due to using the incorrect analysis?

200

A glucose-monitoring study loses 38% of participants, mostly those with worsening symptoms, skewing outcomes.

What is attrition (attrition bias)?

200

A researcher wants to know whether discharge location (home vs rehab) is related to fall history (yes vs no). This type of analysis would be most appropriate to examine the relationship.

What is a chi-square test (of independence)?
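
For review, the calculation behind this answer can be sketched by hand. This is a minimal illustration, not a full analysis: the counts below are hypothetical (made up for this example), and the helper name `chi_square_statistic` is an assumption introduced here. It computes the Pearson chi-square statistic without a continuity correction.

```python
# Hypothetical 2x2 counts, invented for illustration only.
# Rows = discharge location (home, rehab); columns = fall history (yes, no).
observed = [[12, 38],
            [25, 25]]


def chi_square_statistic(table):
    """Pearson chi-square statistic (no continuity correction) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


chi2 = chi_square_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {chi2:.2f}, df = {dof}")
```

With these made-up counts the statistic is about 7.25 on 1 degree of freedom, which exceeds the conventional .05 critical value of 3.84, so the two categorical variables would be judged related.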

200

Patients who attended more rehab sessions improved more, but those who felt better early were more likely to keep attending. This threat to causal inference limits interpretability.

What is directionality?

200

A study compares three groups using ANOVA but incorrectly reports Cohen’s d as the effect size. This effect size is more appropriate for comparing multiple groups.

What is eta-squared?
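
As a quick refresher, eta-squared is the between-group sum of squares divided by the total sum of squares. The sketch below uses three small hypothetical groups (the scores and group labels are invented for illustration) to show the arithmetic.

```python
# Hypothetical outcome scores for three groups, invented for illustration only.
groups = {
    "A": [2, 3, 4],
    "B": [4, 5, 6],
    "C": [6, 7, 8],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)

# Between-group sum of squares: group size times squared distance of each group mean
ss_between = sum(
    len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
    for scores in groups.values()
)

# Total sum of squares: squared distance of every score from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

eta_squared = ss_between / ss_total
print(f"eta-squared = {eta_squared:.2f}")
```

Unlike Cohen's d, which describes the standardized difference between two means, this ratio summarizes how much of the total variability is associated with group membership across all groups at once.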

200

A clinic conducts an observational study examining changes in daily step counts over the course of rehabilitation. Participants wore validated accelerometers, and activity was recorded continuously using the same device and procedures for everyone. The authors conclude that the step-count data accurately represent patients’ real activity levels. Students decide whether the conclusion is trustworthy and why.

What is trustworthy due to consistent measurement procedures? (or due to standardized measurement)

300

A study recruits only people who volunteer after seeing a clinic flyer, making the sample unrepresentative. This type of sampling bias threatens generalizability.

What is self-selection?

300

A project examines cultural norms within one surgical team to understand error-reporting behavior.

What is ethnography?

300

Depression scores improve right after switching to telehealth, but the clinic also changed intake procedures that same week. This threat makes timing misleading.

What is history?

300

A researcher compared three treatment groups using three separate t-tests and reported two significant findings. This statistical problem inflates the false-positive rate.

What is the multiple comparisons problem (multiple t-tests)?
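
The size of the inflation can be sketched with simple arithmetic. This example assumes each test uses alpha = .05 and, as a simplification, that the three tests are independent (pairwise t-tests on the same groups are not strictly independent, so treat the number as approximate). The Bonferroni adjustment shown is one common correction, not the only option.

```python
alpha = 0.05   # per-test significance level (assumed)
n_tests = 3    # three pairwise t-tests among three groups

# Chance of at least one false positive across all tests,
# under the simplifying assumption that the tests are independent
familywise_error = 1 - (1 - alpha) ** n_tests

# Bonferroni correction: divide alpha by the number of tests
bonferroni_alpha = alpha / n_tests

print(f"Familywise error rate: {familywise_error:.3f}")
print(f"Bonferroni per-test alpha: {bonferroni_alpha:.4f}")
```

Under these assumptions the chance of at least one false positive rises from 5% to roughly 14%, which is why a single omnibus test (such as ANOVA) followed by corrected post hoc comparisons is usually preferred.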

300

A study randomly assigns patients to either standard PT or a new fatigue-management protocol. Baseline fatigue levels are equivalent, and treatment procedures are delivered consistently across clinicians. The authors conclude the new protocol leads to greater fatigue reduction. Students decide whether the conclusion is trustworthy and why.

What is trustworthy due to random assignment (or due to strong internal validity)?

400

A clinical trial reports post-treatment results but never shows whether the treatment and control groups started out similar. This missing design feature is a major red flag.

What is baseline equivalence?

400

A researcher wants to understand what factors are associated with developing a rare postoperative complication. They recruit two groups of patients: those who experienced the complication and those who did not, then look backward through their records to compare prior exposures. This type of study design best describes the approach used.

What is case-control?

400

A study finds an association between mobility and pain and concludes that improving mobility will reduce pain, even though the study was non-experimental. This principle warns against drawing that conclusion.

What is ‘correlation does not imply causation’?

400

In a regression model examining what influences discharge readiness, age, number of comorbidities, and functional status at admission are used to estimate the outcome. These variables serve this role in a regression model.

What are predictor variables?

400

In a qualitative project exploring how patients’ confidence changes during rehabilitation, researchers conduct a single set of end-of-program interviews and conclude that confidence increases over time. Students decide whether the conclusion is trustworthy and why.

What is not trustworthy due to using the wrong qualitative temporal design?
(or due to cross-sectional design instead of longitudinal)

500

A qualitative study makes strong claims but provides no documentation of participant characteristics, no details about the setting, and no explanation of selection decisions. This undermines which component of qualitative rigor?

What is transferability? 

500

A researcher administers a symptom-severity survey by phone. Several patients ask the interviewer to repeat or rephrase items, and the interviewer occasionally explains questions in their own words to help. Responses vary noticeably depending on who conducted the call, leading to inconsistency and this type of error.

What is measurement error?

500

A study compares two treatment pathways to test whether an early-imaging approach for knee injuries causes faster return-to-activity than conservative assessment. However, patients in the imaging pathway began with different injury severity, and clinicians varied in how closely they followed the protocol. This quality of quantitative rigor is threatened, weakening the causal claim.

What is internal validity?

500

A study reports r = .50 for the relationship between strength and functional reach. The authors state that strength accounts for 50% of the performance. In order to justify that type of interpretation, the authors should have reported this statistic instead.

What is r-squared (r²)?
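
The arithmetic behind this clue is short enough to check directly. Using the correlation reported in the clue, squaring r gives the proportion of variance shared by the two variables:

```python
r = 0.50                      # correlation reported in the clue
r_squared = r ** 2            # proportion of shared variance
percent_explained = r_squared * 100

print(f"r = {r:.2f}, r-squared = {r_squared:.2f} ({percent_explained:.0f}% of variance)")
```

So a correlation of .50 corresponds to 25% of the variance, not 50%.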

500

A study concludes that ‘higher engagement causes better outcomes,’ but engagement was measured with a single, unvalidated item that does not fully represent the construct. Students must determine trustworthiness and the reason.

What is not trustworthy due to (poor) construct validity?
