Miscellaneous
Calculations
Study Design
Interpretations
Bias
100

Name one way to prevent confounding in the design phase of a study.

Randomization, restriction, matching


EXTRA: What are ways to adjust for confounding in the analysis phase?

Stratified analysis, multivariable analysis, standardization

100

Your friend runs a logistic regression and finds a coefficient of 3.70 using the summary() command in R. They tell you “the odds ratio is 3.70!” What key step did they forget? What is the odds ratio?

Exponentiate! 

OR = e^3.70 ≈ 40.45
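
As a quick check, the exponentiation can be done directly in R. This is a minimal sketch using only the coefficient quoted in the question; the variable names are illustrative.

```r
# Coefficient reported by summary() for the logistic model (log-odds scale)
log_odds_ratio <- 3.70

# Exponentiate to move from the log-odds scale to the odds ratio scale
odds_ratio <- exp(log_odds_ratio)
round(odds_ratio, 2)  # approximately 40.45
```

With a fitted model object (say, a hypothetical `fit` returned by `glm(..., family = binomial)`), `exp(coef(fit))` applies the same step to every coefficient at once.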

100

In 2001, a study examined the physical examination records of the entire incoming freshman class at University of Minnesota and gave questionnaires in order to assess whether their recorded blood pressures were associated with stress. What type of study is this?

A cross-sectional study

100

Describe the difference between clinical significance and statistical significance. 

Statistical significance pertains to whether a result is unlikely to be due to chance alone, judged by comparing the p-value to a chosen significance level (is our result more extreme than what we would expect if the null hypothesis were true?).


Clinical significance pertains to whether a given result is meaningful in practice; it relates to the effect size.

100

Can we correct bias by increasing sample size?

No!

Unlike confounding (which can be controlled for in the analysis), bias CANNOT be fixed (only prevented). Systematic error is NOT dependent on sample size (it will not decrease with a larger sample size).
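
A small simulation can make this concrete. The sketch below assumes a hypothetical blood pressure cuff that reads 5 units too high on every measurement; the true mean, standard deviation, offset, and seed are all invented for illustration.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

true_mean <- 120   # hypothetical true mean systolic BP
offset    <- 5     # hypothetical systematic error added to every reading

# Observed mean for a sample of size n measured with the miscalibrated cuff
observed_mean <- function(n) {
  mean(rnorm(n, mean = true_mean, sd = 15) + offset)
}

error_small <- observed_mean(100) - true_mean
error_large <- observed_mean(100000) - true_mean

# Both errors stay near the offset; the larger sample only
# estimates the wrong value more precisely
c(n_100 = error_small, n_100000 = error_large)
```

Random error shrinks as the sample grows, but the error built into every measurement does not.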

200

What is a 95% confidence interval?

A 95% confidence interval means that if we were to take 100 different samples from the same population and compute a 95% confidence interval for each sample, we would expect approximately 95 of the 100 confidence intervals to contain the true mean value (μ).
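
The repeated-sampling idea can be illustrated with a short simulation. This sketch assumes a normally distributed population with an invented mean and standard deviation, and it uses 1,000 samples rather than 100 so the proportion is more stable.

```r
set.seed(2)  # arbitrary seed for reproducibility

true_mu   <- 50     # hypothetical true population mean
n_samples <- 1000   # number of repeated samples
n_per     <- 30     # observations per sample

# For each sample, check whether its 95% CI contains the true mean
covered <- replicate(n_samples, {
  x  <- rnorm(n_per, mean = true_mu, sd = 10)
  ci <- t.test(x, conf.level = 0.95)$conf.int
  ci[1] <= true_mu && true_mu <= ci[2]
})

coverage <- mean(covered)  # proportion of intervals that captured true_mu
coverage
```

The proportion should land close to 0.95, though any single run will vary a little.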

200

The study investigators report a crude odds ratio of 2.29 (95% CI 1.40-3.75) for the association between history of incarceration and development of heart disease. After adjusting for the risk factors listed in Table 1, the adjusted odds ratio was 3.46 (95% CI 2.04-5.88). Calculate the magnitude of confounding by this collection of risk factors (covariates). Is there joint confounding present by this collection of risk factors?

Magnitude of confounding = [(ORcrude – ORadjusted) / ORadjusted] * 100%

= [(2.29 – 3.46) / 3.46] * 100% = -33.8%


|-33.8%| > 10%

Yes, joint confounding is present.
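
The same arithmetic can be checked in R. This is a minimal sketch using the two odds ratios from the question and the 10% threshold applied in the answer; the variable names are illustrative.

```r
or_crude    <- 2.29
or_adjusted <- 3.46

# Percent difference between crude and adjusted estimates,
# relative to the adjusted estimate
pct_confounding <- (or_crude - or_adjusted) / or_adjusted * 100
round(pct_confounding, 1)    # about -33.8

# Compare the absolute value with the 10% threshold
abs(pct_confounding) > 10    # TRUE
```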


200

What statistical test would you use to test the association between a categorical exposure and categorical outcome?

Chi-squared test

200

Use the following R output to write the equation predicting birth weight (birth_wt) using gestational age (gest_age).

> reg<-lm(birth_wt~gest_age)

> summary(reg)

Call:

lm(formula = birth_wt ~ gest_age)

Residuals:

   Min     1Q Median     3Q    Max 

-958.6 -157.2    7.7  201.0  667.8 

Coefficients:

            Estimate Std. Error t value Pr(>|t|)    

(Intercept) -4020.05    1263.05  -3.183  0.00618 ** 

gest_age      180.46      32.82   5.498 6.13e-05 ***

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 414.4 on 15 degrees of freedom

Multiple R-squared:  0.6683,    Adjusted R-squared:  0.6462 

F-statistic: 30.23 on 1 and 15 DF,  p-value: 6.13e-05


birth_wt = -4020.05 + 180.46*gest_age
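
Once the equation is written down, it can be used for prediction. The sketch below hard-codes the two coefficients from the output; the gestational age of 38 is a hypothetical input, and the units of both variables are not stated in the output.

```r
# Coefficients copied from the summary(reg) output
intercept <- -4020.05
slope     <- 180.46

# Predicted birth weight for a given gestational age
predict_birth_wt <- function(gest_age) {
  intercept + slope * gest_age
}

predict_birth_wt(38)  # 2837.43
```

If the fitted object `reg` is still available, `predict(reg, newdata = data.frame(gest_age = 38))` should give the same value up to rounding of the coefficients.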

200

What type of bias occurs when the probability of being included in the study is related to both the exposure and outcome?

(Differential) Selection bias

300

What are the 3 properties of a confounder?

Confounding is a distortion of a measure of association that occurs when a risk factor for the outcome is not evenly distributed between exposure groups.


Three Properties of a Confounder:

(1) Predictor of the outcome in unexposed

(2) Associated with the exposure

(3) NOT an intermediate on the causal pathway between exposure and disease

300

Between 2011-2014, newborns of individuals who used illicit drugs during pregnancy had 2.59 times the odds of developing gastroschisis compared to newborns of individuals who did not use illicit drugs. 

How would you interpret this odds ratio as a percent?

159% increased odds among those in the exposed group
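
The conversion from an odds ratio to a percent change is (OR – 1) * 100%. A minimal check in R, with an illustrative variable name:

```r
odds_ratio <- 2.59

# Percent increase in odds relative to the unexposed group
pct_increase <- (odds_ratio - 1) * 100
pct_increase  # 159
```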

300

What type of study uses an odds ratio (OR) as the measure of association?

Case-control study

300

Identify what is WRONG with the following interpretation.


WRONG: In a population of women living in Kenya in 2016, the risk of ovarian cancer among those who used hormone therapy was 1.4 times higher than the risk of those who didn't use hormone therapy.

Remove 'higher' from interpretation.


CORRECT: In a population of women living in Kenya in 2016, the risk of ovarian cancer among those who used hormone therapy was 1.4 times the risk of those who didn't use hormone therapy.

300

Describe 3 ways to prevent interviewer bias.

Masking/blinding

Use quality questionnaires

Train interviewers

400

Define effect measure modification (EMM).

A phenomenon where the strength of association varies according to the level of a 3rd variable (interaction). NOT A TYPE OF ERROR OR BIAS.

EMM is identified by comparing stratum-specific estimates (eyeball test).

400

This data comes from a cross-sectional study on the association between smoking and heart disease. What is the appropriate measure of association? What does it equal?

             Smoker    Non-smoker    TOTAL
Heart Dx        350           500      850
No Dx          6650         11500    18150
TOTAL          7000         12000    19000

Prevalence Ratio (PR) = (350/7,000) / (500/12,000) = 1.2

BONUS extra 100 points: Can you interpret this measure?
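
The prevalence ratio calculation can be reproduced in R from the counts in the table. This is a minimal sketch; the variable names are illustrative.

```r
# Counts from the 2x2 table
cases_smokers    <- 350
total_smokers    <- 7000
cases_nonsmokers <- 500
total_nonsmokers <- 12000

# Prevalence of heart disease in each exposure group
prev_smokers    <- cases_smokers / total_smokers        # 0.05
prev_nonsmokers <- cases_nonsmokers / total_nonsmokers  # about 0.0417

# Prevalence ratio: exposed prevalence divided by unexposed prevalence
prevalence_ratio <- prev_smokers / prev_nonsmokers
prevalence_ratio  # 1.2
```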

400

Match the appropriate statistical test to test the association between each combination of exposure and outcome variable types.

Statistical tests

(1) Chi-squared test

(2) Linear regression

(3) Logistic regression


Exposure & Outcome Combinations

(A) Exposure = categorical , Outcome = continuous

(B) Exposure = categorical, Outcome = categorical

(C) Exposure = continuous, Outcome = dichotomous

(A -- 2) Use linear regression for a categorical exposure and a continuous outcome

(B -- 1) Use a Chi-squared test for a categorical exposure and a categorical outcome

(C -- 3) Use logistic regression for a continuous exposure and a dichotomous outcome

400

Investigators are studying the association between CD4 count (cells per microliter) and whether or not patients use vitamin supplementation among Massachusetts senior citizens in 2022. They also collect data on coinfection with other infectious diseases. Their analysis produces the following model:

CD4 = 501.41 + 12.67 * Supplements – 30.23 * Coinfection

Interpret the slope of “Supplements” 

Among Massachusetts senior citizens in 2022, those who used vitamin supplementation had mean CD4 counts 12.67 cells per microliter greater than those who did not use supplements, after adjusting for coinfection with other infectious diseases.
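
One way to see why the slope is read as an adjusted difference is to compare predictions at a fixed coinfection level. The sketch below assumes both predictors are coded 0 = no and 1 = yes, which the question does not state explicitly; the function name is illustrative.

```r
# Model from the question; supplements and coinfection assumed coded 0/1
predict_cd4 <- function(supplements, coinfection) {
  501.41 + 12.67 * supplements - 30.23 * coinfection
}

# Difference between supplement users and non-users, holding coinfection fixed
diff_no_coinfection   <- predict_cd4(1, 0) - predict_cd4(0, 0)
diff_with_coinfection <- predict_cd4(1, 1) - predict_cd4(0, 1)

c(diff_no_coinfection, diff_with_coinfection)  # both 12.67
```

The difference is the same in both coinfection groups because the model has no interaction term.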

400

The true risk ratio was 2.0 while the observed risk ratio was 1.0. The investigators were concerned that loss to follow-up may have impacted the study results. Assuming they were correct, and that those lost to follow-up in each arm had similar demographic and baseline characteristics, what impact would this bias have on the observed study results (including the direction of the bias)?

Towards the null; downwards (the observed risk ratio of 1.0 underestimates the true risk ratio of 2.0)

500

Match the statistical hypotheses to the test.

Hypotheses

(1) H0: µ=100 / HA: µ≠100

(2) H0: µd=0 / HA: µd≠0

(3) H0: µ=100 / HA: µ>100


Tests

(A) Paired t-test, 2-tailed

(B) One sample t-test, 1-tailed

(C) One sample t-test, 2-tailed

(1 -- C) H0: µ=100 / HA: µ≠100 -- One sample t-test, 2-tailed

(2 -- A) H0: µd=0 / HA: µd≠0 -- Paired t-test, 2-tailed

(3 -- B) H0: µ=100 / HA: µ>100 -- One sample t-test, 1-tailed
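
In R, all three tests map onto t.test() with different arguments. The sketch below uses simulated data (the sample sizes, means, and seed are invented) purely to show which arguments correspond to which hypotheses.

```r
set.seed(3)  # arbitrary seed for reproducibility

# Hypothetical data: one sample, plus paired before/after measurements
x      <- rnorm(25, mean = 103, sd = 10)
before <- rnorm(25, mean = 100, sd = 10)
after  <- before + rnorm(25, mean = 2, sd = 5)

# (1) H0: mu = 100 vs HA: mu != 100 (one sample, 2-tailed)
test_1 <- t.test(x, mu = 100, alternative = "two.sided")

# (2) H0: mu_d = 0 vs HA: mu_d != 0 (paired, 2-tailed)
test_2 <- t.test(after, before, paired = TRUE, alternative = "two.sided")

# (3) H0: mu = 100 vs HA: mu > 100 (one sample, 1-tailed)
test_3 <- t.test(x, mu = 100, alternative = "greater")

c(test_1$p.value, test_2$p.value, test_3$p.value)
```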

500

Investigators are studying the association between systolic blood pressure and miles walked per day, adjusting for heart conditions. Their analysis produces the following model:

Systolic BP = - 0.1 * Miles + 20 * Heart Disease + 120

What is the expected systolic blood pressure for a person who walks 3 miles per day and has no history of heart disease?

Systolic BP = -0.1(3) +20(0) + 120 = 119.7
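
The substitution can be wrapped in a small R function. This sketch assumes heart disease is coded 0 = no history and 1 = history, as the worked answer implies; the function name is illustrative.

```r
# Model from the question; heart_disease assumed coded 0/1
predict_sbp <- function(miles, heart_disease) {
  -0.1 * miles + 20 * heart_disease + 120
}

predict_sbp(miles = 3, heart_disease = 0)  # 119.7
```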

500

What are the key principles that must be followed when selecting controls for a case-control study?

1. The comparison group ("controls") should be representative of the source population that produced the cases.

2. The "controls" must be sampled in a way that is independent of the exposure, meaning that their selection should not be more (or less) likely if they have the exposure of interest.

500

Match the interpretation to the model:

1. “The odds of {outcome} in {exposed} are {e^slope} times the odds in {unexposed}”

2. “For every 1 unit increase in {exposure}, {outcome} is expected to increase by {slope}”

3. “The mean {outcome} is {slope} higher in {exposed} compared to {unexposed}”

4. “For every 1 unit increase in {exposure}, the odds of {outcome} are expected to increase by {(e^slope – 1) * 100}%”


A. Continuous exposure, linear regression

B. Dichotomous exposure, linear regression

C. Continuous exposure, logistic regression

D. Dichotomous exposure, logistic regression


(1 -- D) “The odds of {outcome} in {exposed} are {e^slope} times the odds in {unexposed}” = Dichotomous exposure, logistic regression

(2 -- A) “For every 1 unit increase in {exposure}, {outcome} is expected to increase by {slope}” = Continuous exposure, linear regression

(3 -- B) “The mean {outcome} is {slope} higher in {exposed} compared to {unexposed}” = Dichotomous exposure, linear regression

(4 -- C) “For every 1 unit increase in {exposure}, the odds of {outcome} are expected to increase by {(e^slope – 1) * 100}%” = Continuous exposure, logistic regression
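
For the logistic regression interpretations, the slope is on the log-odds scale, so it is exponentiated before being read as an odds ratio, and the percent change in odds is (e^slope – 1) * 100. A minimal sketch with a hypothetical slope of 0.25:

```r
slope <- 0.25  # hypothetical logistic regression slope (log-odds scale)

# Odds ratio per 1 unit increase in the exposure
odds_ratio <- exp(slope)

# Percent change in odds per 1 unit increase in the exposure
pct_change <- (odds_ratio - 1) * 100

round(c(odds_ratio = odds_ratio, pct_change = pct_change), 2)
```

For the linear regression interpretations no transformation is needed; the slope is read directly in the units of the outcome.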

500

A prospective cohort study was conducted to determine the risk of heart attack among men with varying levels of baldness. Third-year residents in dermatology conducted visual baldness assessments at the start of the study (which was before any heart attacks had occurred). Four levels of baldness were coded: none, minimal, moderate, and severe. The follow-up rate was close to 100%. Which of the following types of bias were surely avoided in this study?

1. Recall bias of exposure information

2. Differential misclassification of exposure

3. Non-differential misclassification of exposure

4. Selection bias (because follow up was close to 100%)

No recall bias, no differential misclassification of exposure, no selection bias. There may, however, have been non-differential misclassification of exposure.
