Statistics I & II Jeopardy Template

Uni/Bivariate

General Stats

Contrast/Regression

Regression II

Mediation & Moderation

100

What is central tendency? What is variance, and how do they explain a single variable?

Central tendency is the mean/median/mode. Variance is the average of the squared differences from the mean in the data set. It explains how much the data differ from their average. The standard deviation is the square root of the variance.
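As a rough illustration of the definitions above, here is a minimal Python sketch. The scores are made up, and the `sample` switch (dividing by n - 1 instead of n) is an added assumption that the card itself does not discuss.

```python
from math import sqrt

def variance(scores, sample=False):
    """Average squared deviation from the mean (divides by n - 1 if sample=True)."""
    n = len(scores)
    mean = sum(scores) / n
    squared_devs = [(x - mean) ** 2 for x in scores]
    return sum(squared_devs) / (n - 1 if sample else n)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, mean = 5
var = variance(scores)             # squared deviations sum to 32, so 32 / 8
sd = sqrt(var)                     # standard deviation = square root of the variance
```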

100

In the SO (statistical outcome) drawing, what are the three key components and how are they described?

Likelihood (probability), AKA p-values based on hypothesis testing; Magnitude (effect size), indicating the strength of the finding; and Precision, AKA confidence intervals, which provide a range of values within which we can expect the true population parameter to fall, based on our sample estimate and the associated t-value.

100

What are important aspects of contrast testing?

Contrasts allow a weighted, focused look at within-group score differences, between-group score differences, or both. The sum of C (contrast weights) should = 0. A contrast provides focused results, whereas ANOVA is an omnibus test of overall group differences and chi-square (diffuse, goodness of fit) tests categorical differences.
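The zero-sum rule for contrast weights can be sketched as below. The group means and the two weight sets are hypothetical, and `contrast_value` is an illustrative helper, not something from the course materials.

```python
def contrast_value(group_means, weights):
    """Weighted sum of group means; the weights must sum to zero."""
    if len(group_means) != len(weights):
        raise ValueError("need one weight per group mean")
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to 0")
    return sum(w * m for w, m in zip(weights, group_means))

group_means = [10.0, 12.0, 17.0]  # hypothetical means for three groups

# Focused question 1: does group 3 differ from group 1?
linear = contrast_value(group_means, [-1, 0, 1])

# Focused question 2: do groups 1 and 2 together differ from group 3?
first_two_vs_last = contrast_value(group_means, [1, 1, -2])
```

Each weight set asks one focused question, which is what separates a contrast from the omnibus F.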

100

What is the key difference between part and partial and why are they important in multiple regression?

Partial correlation controls for other variables in both the predictor and outcome, while part (semipartial) correlation controls for other variables only in the predictor, making it better for assessing unique variance explained in regression. They tell us how much each predictor uniquely accounts for in the regression model.
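For the two-predictor case, both values can be computed from the three pairwise correlations using the standard textbook formulas. The sketch below uses hypothetical correlations (.5, .4, .3), and the function name is illustrative.

```python
from math import sqrt

def part_and_partial(r_y1, r_y2, r_12):
    """Part (semipartial) and partial correlation of predictor 1 with y,
    controlling for predictor 2, from the three pairwise correlations."""
    unique = r_y1 - r_y2 * r_12
    # Part: predictor 2 is removed from predictor 1 only.
    part = unique / sqrt(1 - r_12 ** 2)
    # Partial: predictor 2 is removed from both predictor 1 and y.
    partial = unique / sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))
    return part, partial

# Hypothetical correlations: r(y, x1) = .5, r(y, x2) = .4, r(x1, x2) = .3
part, partial = part_and_partial(0.5, 0.4, 0.3)
unique_variance = part ** 2  # share of total y variance unique to x1
```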

100

What regressions need to be performed in a mediation?

In a mediation, we are attempting to see how a third variable (the mediator) carries the effect of the predictor on the outcome along a mediating path. We regress the outcome on the predictor, the mediator on the predictor, the outcome on the mediator, and the outcome on the predictor with the mediator's influence removed (controlled for).
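A minimal sketch of these regressions in plain Python follows. It assumes the last two steps are estimated together in one model (outcome on predictor and mediator), and the data are made up so that the outcome is an exact function of X and M, which keeps the arithmetic easy to check by hand.

```python
def mean(values):
    return sum(values) / len(values)

def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def slope(y, x):
    """Slope from regressing y on a single predictor x."""
    return cross(x, y) / cross(x, x)

def two_predictor_slopes(y, x1, x2):
    """Slopes from regressing y on x1 and x2 together."""
    det = cross(x1, x1) * cross(x2, x2) - cross(x1, x2) ** 2
    b1 = (cross(x2, x2) * cross(x1, y) - cross(x1, x2) * cross(x2, y)) / det
    b2 = (cross(x1, x1) * cross(x2, y) - cross(x1, x2) * cross(x1, y)) / det
    return b1, b2

X = [1, 2, 3, 4, 5]                    # predictor (made up)
M = [2, 3, 7, 8, 10]                   # mediator (made up)
Y = [x + 2 * m for x, m in zip(X, M)]  # outcome built so c' = 1 and b = 2

c = slope(Y, X)                             # total effect: outcome on predictor
a = slope(M, X)                             # a path: mediator on predictor
c_prime, b = two_predictor_slopes(Y, X, M)  # direct effect and b path
indirect = a * b                            # mediated (indirect) effect
```

With ordinary least squares the total effect splits exactly into direct plus indirect (c = c' + ab), which the example reproduces.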

200

Explain Stem and Leaf and Box Plots.

A stem-and-leaf plot is a visual representation of scores broken into leading digit(s) (stem) and final digit (leaf) that shows the frequency of occurrences; the stem width describes the size of the numbers. A box plot is a box spanning the 25th percentile (lower edge) to the 75th percentile (upper edge) with a line at the median (center), and whiskers at the top and bottom marking the range of non-outlying scores; points beyond the whiskers are the outliers.

200

What are some key flaws of hypothesis tests, and how do they relate to the Popper loop and the Lucky/Unlucky example from class?

Key flaws: real-world differences are rarely exactly 0, and rejecting the null tells us only that, not that the hypothesis is true. The Popper loop describes that we can't definitively prove a hypothesis; we can only continue to refine it and fail to disconfirm it. In the Lucky/Unlucky example there were two studies but only one got p < .05; taken together they would reach p < .05, so the difference may relate more to N sizes.

200

In a between-within contrast, what steps do we take to calculate our contrast?

After running a within contrast (to obtain the contrast scores within each group), we then run a between-group contrast on those scores to study the group differences.

200

In regression diagnostics, what are some key assumptions?

(1) No specification error (3 types: all relationships are linear, no relevant predictors omitted from the model, no irrelevant predictors included in the model), (2) no measurement error, (3) homoscedasticity.

200

For moderation, what is happening and what regressions need to be performed?

Moderation is seeing how an interaction affects the prediction in a model. After calculating the interaction term (X * M), we regress the outcome on the interaction and the two variables. If the moderator is a continuous variable this would be the last step; if it is categorical, we can then split by category and regress the outcome on the predictor within each category (e.g., male only).
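A small sketch of the two pieces involved: building the product term, and reading the fitted model Y-hat = b0 + b1*X + b2*M + b3*(X*M) as a slope for X that changes with M. The coefficients and the 0/1 coding of the moderator are hypothetical, and the function names are illustrative.

```python
def interaction_term(x, m):
    """Product term X * M, computed case by case."""
    return [xi * mi for xi, mi in zip(x, m)]

def simple_slope(b_x, b_interaction, moderator_value):
    """Slope of X on the outcome at a given moderator value: b1 + b3 * M."""
    return b_x + b_interaction * moderator_value

x = [1, 2, 3]
m = [0, 1, 1]                   # hypothetical categorical moderator, coded 0/1
product = interaction_term(x, m)

b1, b3 = 0.5, 0.2               # hypothetical fitted coefficients
slope_when_m_is_0 = simple_slope(b1, b3, 0)
slope_when_m_is_1 = simple_slope(b1, b3, 1)
```

If b3 is near zero the two slopes coincide, which is the case of no moderation.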

300

What are covariance and standardized covariance and how do they differ?

Covariance is the sum of the cross-products of the deviations from the means, divided by n (sum of xy / n, where x and y are deviation scores), and expresses the linear relationship between the two variables. It IS scale-dependent. Standardized covariance (AKA Pearson's r) is a correlation; it is scale-free and requires standardizing the scores (z-scores). It can be obtained by dividing the covariance by the product of the two variables' SDs.
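The scale-dependence point can be checked with a short sketch. The data are made up, and the covariance divides by n (the population form), which is an assumption about the convention in use. Rescaling x by 10 changes the covariance but leaves r alone.

```python
from math import sqrt

def covariance(x, y):
    """Mean cross-product of deviations from the means (divides by n)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = sqrt(covariance(x, x))
    sd_y = sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

x = [1, 2, 3, 4, 5]  # made-up scores
y = [2, 4, 5, 4, 5]

cov_xy = covariance(x, y)
r_xy = pearson_r(x, y)

x_rescaled = [10 * v for v in x]  # same variable in different units
cov_rescaled = covariance(x_rescaled, y)
r_rescaled = pearson_r(x_rescaled, y)
```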

300

What are the typical steps in hypothesis testing?

Articulate the null (typically set to 0). Calculate the t-value: (statistic - value of the parameter under H0) / standard error. Identify the underlying population parameter (if unknown, we can use the Cummins calculator and df to obtain the p-value from the t score). Confidence intervals around the scale-dependent effects give a precise range of plausible values for the parameter.
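The t-value and confidence-interval steps can be sketched for a one-sample mean. The sample values are hypothetical, the null is set to 100 rather than 0 purely for illustration, and the critical t of 2.064 (df = 24, two-tailed .05) is taken from a standard t table rather than computed.

```python
from math import sqrt

def t_value(statistic, null_value, standard_error):
    """(statistic - value of the parameter under H0) / standard error."""
    return (statistic - null_value) / standard_error

def confidence_interval(statistic, standard_error, t_critical):
    """Range of plausible parameter values around the sample estimate."""
    margin = t_critical * standard_error
    return statistic - margin, statistic + margin

n, sample_mean, sample_sd = 25, 103.0, 15.0  # hypothetical sample
se = sample_sd / sqrt(n)                      # standard error of the mean
t = t_value(sample_mean, 100.0, se)
low, high = confidence_interval(sample_mean, se, 2.064)  # table value, df = 24
```

Here the interval contains the null value of 100, which agrees with the small t.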

300

What is the difference between standardized and unstandardized coefficients in regression?

Unstandardized (b) is the change in y for a one-unit change in x in the original units, a real-world comparison. Standardized (B), or beta, reflects the z of y and the z of x and provides a scaled comparison across the model's predictors.
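The two coefficients are linked by the standard deviations: beta = b * (SD of x / SD of y). A tiny sketch with hypothetical numbers:

```python
def standardized_beta(b, sd_x, sd_y):
    """Convert an unstandardized slope to a standardized one."""
    return b * sd_x / sd_y

# Hypothetical values: y rises 2 raw units per raw unit of x,
# x has SD = 3 and y has SD = 12.
beta = standardized_beta(2.0, 3.0, 12.0)
```

Read this as: a one-SD increase in x goes with a half-SD increase in y.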

300

What are two critical components to assess for multicollinearity, and why should we look at this?

Multicollinearity occurs when two or more predictors in a regression model are highly correlated with each other, meaning they contain overlapping or redundant information. This inflates the standard errors of the coefficients and makes the individual slopes unstable. Two critical indicators are tolerance (TOL; should be above .2) and the variance inflation factor (VIF; should be under 5).
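Both indicators come from the same quantity: the R^2 obtained when one predictor is regressed on all the other predictors. Tolerance is 1 - R^2 and VIF is 1 / tolerance. The sketch below uses the common rules of thumb (tolerance below .2 or VIF above 5 flags a problem); the function names and R^2 values are hypothetical.

```python
def tolerance(r_squared_with_other_predictors):
    """Share of a predictor's variance NOT shared with the other predictors."""
    return 1 - r_squared_with_other_predictors

def vif(r_squared_with_other_predictors):
    """Variance inflation factor: the reciprocal of tolerance."""
    return 1 / tolerance(r_squared_with_other_predictors)

def flags_multicollinearity(r_squared, min_tolerance=0.2, max_vif=5.0):
    """True when either rule of thumb is violated."""
    return tolerance(r_squared) < min_tolerance or vif(r_squared) > max_vif

redundant = flags_multicollinearity(0.85)  # tolerance about .15, VIF about 6.7
distinct = flags_multicollinearity(0.36)   # tolerance about .64, VIF about 1.6
```

Because VIF is just 1 / tolerance, the two cutoffs (.2 and 5) describe the same threshold.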

300

When would we want to use a logistic regression?

Logistic regression provides the probability of an outcome when that outcome is categorical or binary (pass/fail, etc.), whereas linear regression provides a linear model for predicting continuous variables.
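How a logistic model turns a linear prediction into a probability can be sketched as follows. The intercept and slope are hypothetical (think of hours studied predicting pass/fail), not fitted values.

```python
from math import exp

def predicted_probability(intercept, slope, x):
    """Logistic function: converts the linear prediction (log-odds) to a probability."""
    log_odds = intercept + slope * x
    return 1 / (1 + exp(-log_odds))

b0, b1 = -3.0, 0.5  # hypothetical coefficients

p_at_6 = predicted_probability(b0, b1, 6)    # log-odds of 0
p_at_10 = predicted_probability(b0, b1, 10)  # log-odds of 2
odds_ratio = exp(b1)  # multiplicative change in the odds per one-unit increase in x
```

The predicted probability always stays between 0 and 1, which a straight line fit to a pass/fail outcome cannot guarantee.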

400

What does a bivariate scatterplot demonstrate, and where are the x and y axes typically?

A bivariate plot graphs the relationship between two variables measured on a single sample of subjects, allowing you to see the degree and pattern of the relation between the two variables quickly. The X axis (AKA abscissa or predictor) is the horizontal (left-right) and the Y axis (AKA ordinate or outcome) is the vertical (up-down).

400

Outside of the contrast model, describe the difference in process when conducting a standard assessment of between vs. within.

In a standard between design we look at the mean difference between groups and t-test it to obtain significance. In a standard within design (D-bar), we obtain the mean of the differences between each subject's scores and test that mean for significance.
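The within (D-bar) half can be sketched as a paired t computed from difference scores. The pre and post scores are made up, and the p-value lookup (df = n - 1) is left out.

```python
from math import sqrt

def paired_t(first, second):
    """t for the mean difference (D-bar) across subjects measured twice."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = sum(diffs) / n
    sd_diffs = sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
    se = sd_diffs / sqrt(n)  # standard error of D-bar
    return d_bar, d_bar / se, n - 1

pre = [10, 12, 9, 11, 13]    # made-up scores, same five subjects
post = [12, 14, 10, 14, 15]
d_bar, t, df = paired_t(pre, post)
```

Each subject serves as their own control, so only the spread of the difference scores enters the standard error.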

400

What is the difference between R and R^2 in multiple regression?

R is the correlation between a target variable and its linear prediction and indicates the strength of the model (multiple R ranges from 0 to 1, so it does not carry direction). R^2 is the proportion of the variance explained by the model and is referred to as "the coefficient of determination" because it tells us how much of the outcome variable's variation is determined by the predictors.
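Both definitions can be checked on a small made-up data set. The sketch uses a single predictor to keep the fitting step short; the same definitions of R and R^2 apply with more predictors.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]  # made-up predictor
y = [2, 4, 5, 4, 5]  # made-up outcome
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least-squares line and its predictions.
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * a for a in x]

# R^2: proportion of outcome variance explained by the model.
ss_total = sum((b - mean_y) ** 2 for b in y)
ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))
r_squared = 1 - ss_residual / ss_total

# R: correlation between the outcome and its linear prediction.
mean_p = sum(predicted) / n
multiple_r = (sum((b - mean_y) * (p - mean_p) for b, p in zip(y, predicted))
              / sqrt(ss_total * sum((p - mean_p) ** 2 for p in predicted)))
```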

400

Homoscedasticity is

when there are relatively equal variances in our data (meaning the data don't scatter unequally around certain portions of our model). Unequal variance is known as heteroscedasticity.

400

What is the type of testing done to determine significance in mediation?

We need the regression values for the a and b arms as well as their standard errors to conduct a Sobel test; this provides our p-value and overall SS for the mediation.
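A minimal sketch of the Sobel calculation, using the common first-order formula z = ab / sqrt(b^2 * SEa^2 + a^2 * SEb^2) and a two-tailed p from the normal distribution. The path values and standard errors are hypothetical.

```python
from math import erfc, sqrt

def sobel_test(a, se_a, b, se_b):
    """Sobel z and two-tailed p for the indirect effect a * b."""
    indirect = a * b
    se_indirect = sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    p_two_tailed = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - normal CDF of |z|)
    return z, p_two_tailed

# Hypothetical a and b paths with their standard errors.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```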

500

How do we differentiate parameters and statistics? How do we also differentiate inferential from descriptive statistics?

Parameters speak to the population (usually unknown), whereas a statistic describes a drawn sample. Descriptives summarize the data in hand through numerical calculations, graphs, and tables; inferentials are predictions about a population drawn from the sample data.

500

What are some complaints in Kline (2004) about NHST?

p < .05 doesn’t indicate effect. | p > .05 doesn’t indicate no effect. | Nil or 0 effect may not always be meaningful. | The p-value doesn’t prove the data is true; it estimates how likely the result is under random conditions (the null hypothesis). | Emphasis on dichotomous thinking. | Neglects CI and Res.

500

In multiple regression, what is our significance test, what can we t-test, and how do we assess effect size?

F is the overall test used for significance; we can only t-test our coefficients. Effect size is our R^2, or we can get effect sizes from our coefficients' t scores as well.
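Both ideas can be sketched with standard formulas: the overall F computed from R^2, and an effect-size r recovered from a coefficient's t via r = sqrt(t^2 / (t^2 + df)). All numbers are hypothetical, and k denotes the number of predictors.

```python
from math import sqrt

def f_from_r_squared(r_squared, n, k):
    """Overall F for a regression with k predictors and n cases."""
    df_model, df_error = k, n - k - 1
    f = (r_squared / df_model) / ((1 - r_squared) / df_error)
    return f, df_model, df_error

def effect_size_r_from_t(t, df):
    """Effect-size r recovered from a t score and its degrees of freedom."""
    return sqrt(t ** 2 / (t ** 2 + df))

# Hypothetical model: R^2 = .50 with 2 predictors and 23 cases.
f, df_model, df_error = f_from_r_squared(0.5, n=23, k=2)

# Hypothetical coefficient test: t = 2.5 on the same 20 error df.
r_effect = effect_size_r_from_t(2.5, df_error)
```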

500

Explain partialling and partial slopes.

Partialling is separating the predicted values from their residuals (what's left over between the actual and expected scores). Partial slopes are our b values controlling for the other predictors, showing each predictor's contribution to the model.

500

What is the best way to describe the difference between mediation and moderation?

Mediation explains how or why one variable affects another — it adds a middle step in the relationship.
Moderation explains when or for whom one variable affects another — it changes the strength or direction of the relationship. Mediation is about the pathway; moderation is about the conditions.