Intro to Stats
Measures of Centrality & Variability & Freq Dist
Probability & Sampling
z scores & the normal curve
Hypothesis Testing
100

What is a sample, and why are samples more commonly studied than populations?


A sample is a subset of the overall population. It is usually impossible to study an entire population (we rarely have the access, time, or money).

100

When creating histograms, the x-axis typically represents _____, while the y-axis represents _____

The bar height is...


values or intervals; frequencies

The frequency of each score/interval is represented by the height of the bars.
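
A minimal sketch of the idea, assuming a small set of made-up scores (the data and the helper name `frequency_table` are illustrative only): each distinct value goes on the x-axis, and the bar length is its frequency.

```python
from collections import Counter

# Hypothetical scores, invented for illustration.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6]

def frequency_table(values):
    """Map each distinct value to how often it occurs, sorted by value."""
    return dict(sorted(Counter(values).items()))

# Text histogram: one row per value, bar length equals the frequency.
for value, freq in frequency_table(scores).items():
    print(f"{value} | {'#' * freq}")
```

The tallest bar belongs to the most frequent score, which is also the mode.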

100

What is the difference between random assignment and random sampling: when do we use each and what do they do to improve our research? 


Random assignment deals with how participants in your sample are assigned to levels of the independent variable, whereas random sampling (random selection) deals with how you choose participants for your study.

Random assignment (RA) --> controls extraneous variables

Random sampling (RS) --> generalizability, low sampling error
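
The two steps can be sketched in code. This is only an illustration: the population of 1,000 ID numbers, the sample size of 20, and the function names are all assumptions, not part of the original card.

```python
import random

def random_sample(population, n, rng):
    """Random sampling: every member has an equal chance of being chosen."""
    return rng.sample(population, n)

def random_assignment(participants, rng):
    """Random assignment: shuffle the sample, then split it into two groups."""
    shuffled = participants[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(1)                 # seeded so the run is repeatable
population = list(range(1, 1001))      # hypothetical participant IDs
sample = random_sample(population, 20, rng)
treatment, control = random_assignment(sample, rng)
print(len(treatment), len(control))
```

Sampling decides who is in the study; assignment decides which condition each sampled person gets.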

100

Imagine that a population of scores is normally distributed. If a statistician draws samples from the population, which of the sampling distributions is more likely to be normal in shape?

one with a sample size of 5 |  20 | 100 | All are equally likely to be normally distributed since the population is normally distributed.


 

sample of 100 b/c a distribution of scores from a normally distributed population is more likely to approximate a normal curve if the sample size is larger.


100

What is the difference between the null and research hypotheses?

The research hypothesis is a statement that usually assumes that there is a difference between populations. 

The null hypothesis is a statement that usually assumes that there is NO difference between populations or the difference is in the opposite direction of what we would expect.


200

What is the difference between descriptive and inferential statistics and what is an example of each?  

Inferential statistics allow researchers to draw conclusions about populations, whereas descriptive statistics simply organize and summarize data.

Descriptive: mean, median, mode, SD if only describing a sample

Inferential: t-statistic

200

Data that have not been transformed or analyzed are referred to as _____ scores.


raw scores

200

The extent to which researchers are able to apply findings in one situation to other situations is called _____.


external validity or the generalizability of research  

200

What does it mean to say that the normal curve is symmetric?

The normal curve is very important in inferential statistics because:


Exactly 50% of scores fall below the mean and 50% fall above the mean.

The normal curve may be translated into percentages, which allows direct comparisons of scores on different measures.

200

In a study designed to investigate the link between sugar and hyperactivity, a researcher assigns one group of children to eat a high-sugar snack and another group of children to eat a snack with no sugar. What is the research hypothesis for this study?

BONUS: What are the IV and DV?

BONUS 2: What are the three assumptions of parametric hypothesis tests? Under what conditions is it permissible to proceed if assumptions are violated?


There is a difference in the hyperactivity levels of the two groups.



The population is normally distributed. Participants are selected randomly from the population. The dependent variable is a scale (interval or ratio) variable.

(It is OK to proceed if the data are not clearly nominal or ordinal.)



300

What are the differences between interval, ratio, nominal, and ordinal variables?

BONUS: How do discrete vs. continuous variables factor in?

interval: numbers are evenly spaced

ratio: numbers are evenly spaced, and 0 means a true absence of the thing

Both interval and ratio are scale observations.

nominal: categories or names of things

ordinal variables: have a rank order

300

What is the difference between positive and negative skew and a normal distribution?

In a distribution that is negatively skewed, most of the scores fall toward the right side of the distribution and the tail points to the left. The reverse holds for positive skew.

A normal distribution is symmetrical and unimodal.


300

Why are true random samples rarely used? What is used instead?


A researcher rarely has access to the entire population.

Convenience samples.

300

To what percentile does a z score of −1.00 roughly correspond? What about a z score of +1.00?


16th (50-34)

 84th (34+50)
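
The 50 − 34 and 34 + 50 shortcuts can be checked against the standard normal curve. A small sketch using Python's standard library; the function name `z_to_percentile` is just an illustrative choice.

```python
from statistics import NormalDist

def z_to_percentile(z: float) -> float:
    """Percentage of a standard normal distribution that falls below z."""
    return NormalDist(mu=0.0, sigma=1.0).cdf(z) * 100

print(round(z_to_percentile(-1.0)))  # roughly the 16th percentile
print(round(z_to_percentile(1.0)))   # roughly the 84th percentile
```

The "34" in the shortcut is the rounded percentage of scores between the mean and one standard deviation on either side.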

300

1) In hypothesis testing, which hypothesis are we testing? 

2) The phrase statistically significant means that the research:

3) Define: The alpha level, | critical region 

1) The null hypothesis. A researcher's decision regarding whether to reject the null hypothesis is based on the probability that one would see the observed group differences if there were no effect of the independent variable.

2) The result was unlikely to have occurred by chance, so we can reject the null.

3) The alpha level is the probability used to determine the critical values.

The null hypothesis is rejected when the test statistic falls in the critical region.

The probability that researchers usually adopt to determine whether a result is extreme is .05
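
A sketch of how alpha, the critical values, and the critical region fit together. It assumes a two-tailed z test, which the card does not specify; the function name and the example z values are illustrative.

```python
from statistics import NormalDist

def two_tailed_z_test(z, alpha=0.05):
    """Return (critical value, p value, reject-null decision) for a z statistic.

    Assumes a two-tailed test, so alpha is split evenly between the two tails.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    critical_value = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject_null = abs(z) > critical_value  # z lands in the critical region
    return critical_value, p_value, reject_null

for z in (1.0, 2.5):
    crit, p, reject = two_tailed_z_test(z)
    print(f"z={z}: critical=±{crit:.2f}, p={p:.3f}, reject null={reject}")
```

With alpha at .05 the critical values come out near ±1.96, so a z of 2.5 falls in the critical region and a z of 1.0 does not.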





400

OPTIONS:

1) What is the difference between reliability and validity?

2) What is the difference between predictor and outcome variables (and between independent and dependent variables)?

1) 

A reliable measure is consistent. A valid one measures what it is supposed to measure.

2) A predictor variable is manipulated or observed in order to determine its effect on the outcome/ dependent variable. We use IV/DV when talking about an experiment.

400

1) When should we use each measure of centrality?

2) What are the symbols for the mean and the standard deviation in the population vs. the sample?

If our data are nominal, we should use the mode as our measure of central tendency.

If data are normally distributed, use the mean.

If data are skewed, use the median.

Parameters --> mu and sigma; statistics --> M and SD
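
A quick illustration of why the choice matters, using made-up data (both lists below are hypothetical): one extreme score pulls the mean but not the median, and nominal data only support the mode.

```python
from statistics import mean, median, mode

# Hypothetical, positively skewed scores: one extreme value on the right.
skewed_scores = [1, 2, 2, 3, 50]

# Hypothetical nominal data: category labels with no numeric meaning.
pets = ["dog", "cat", "dog", "fish", "dog"]

print("mean:", mean(skewed_scores))      # pulled toward the outlier
print("median:", median(skewed_scores))  # stays near the typical score
print("mode (nominal):", mode(pets))     # most frequent category
```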

400

What is the law of large numbers?

If someone flips a standard coin, the probability of it landing on heads is the same as the probability of it landing on tails (i.e., 0.50). However, the observed proportion of heads is more likely to be close to 0.50 the more often the coin is flipped. That is, over a large number of trials the results are more likely to approximate the expected probability.
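
A small simulation sketch of the coin example. The seed and the flip counts are arbitrary choices for illustration, and exact proportions will vary with the seed.

```python
import random

def proportion_heads(n_flips, rng):
    """Flip a fair coin n_flips times and return the proportion of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

rng = random.Random(42)  # seeded so the run is repeatable
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_heads(n, rng))
```

Short runs can land far from 0.50, while long runs tend to settle close to it.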

400

The z statistic indicates:

In words, the standard error is defined as:


the number of standard errors a sample mean is from the population mean.

the population standard deviation divided by the square root of the sample size.
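
Both definitions translate directly into a short calculation. The numbers in the example (population mean 100, population SD 15, sample of 25 with a mean of 106) are hypothetical.

```python
import math

def standard_error(sigma, n):
    """Population standard deviation divided by the square root of n."""
    return sigma / math.sqrt(n)

def z_statistic(sample_mean, mu, sigma, n):
    """Number of standard errors the sample mean is from the population mean."""
    return (sample_mean - mu) / standard_error(sigma, n)

# Hypothetical values chosen for easy arithmetic.
print(standard_error(15, 25))         # 3.0
print(z_statistic(106, 100, 15, 25))  # 2.0
```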


400

A(n) _____ error is considered to be worse than a(n) _____ error because it is more likely to lead people to action.

Explain each type.


Type I: a true null hypothesis is rejected; false positive

Type II: you fail to reject the null hypothesis when the null hypothesis is false; false negative

500

What is a confounding variable and how does random assignment help mitigate the risk of them?

A variable that changes along with the predictor so that its effects cannot be distinguished from the predictor variable's effects.

Random assignment distributes extraneous variables across groups.



500

A researcher measures the amount of food consumed by each dog in their lab. They find that the mean amount is 14 oz. The sum of squared deviations is 250. What is the variance? SD? Why is SD preferred?

The standard deviation (SD) is most commonly used to get a sense of how far the typical score of a distribution differs from the mean. In computing the SD, why is it necessary to square the deviations from the mean for each score?

25 & 5

Without squaring, the positive and negative deviations cancel out, so the deviations always sum (and average) to zero.
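
A worked sketch of both answers. Note the assumption: the question does not say how many dogs were measured, so the code assumes N = 10, which is the value that makes 250 / N equal the stated variance of 25. The four scores in the second half are invented to show the deviations cancelling.

```python
import math

def variance_from_ss(sum_of_squares, n):
    """Variance as the sum of squared deviations divided by N."""
    return sum_of_squares / n

# ASSUMPTION: N = 10 dogs (not stated in the question; implied by the answer).
variance = variance_from_ss(250, 10)
sd = math.sqrt(variance)
print(variance, sd)  # 25.0 5.0

# Hypothetical scores with a mean of 14 oz.
scores = [9, 14, 19, 14]
mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]
print(sum(deviations))                 # raw deviations cancel to 0.0
print(sum(d ** 2 for d in deviations)) # squared deviations do not cancel
```

The SD is in the original units (oz), while the variance is in squared units.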

500

Define illusory correlation and confirmation bias.

An illusory correlation is when one perceives an association between variables where none exists.

When a person pays attention to evidence that confirms what they already believe and ignores evidence that would disconfirm their beliefs, they are demonstrating confirmation bias.

500

How do the mean and spread of a sampling distribution differ from the population of scores?

As sample size increases how do they change?

The mean of the sampling distribution is the same as the population mean of the individual scores.

The spread of the sampling distribution (the standard error) is smaller than the population standard deviation.

As sample size increases, the mean of the sampling distribution is unchanged but the SE decreases.
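
A simulation sketch of this pattern, assuming a hypothetical normal population with a mean of 100 and SD of 15 (these values, the seed, and the number of repetitions are arbitrary choices). The simulated spread is compared with sigma divided by the square root of n.

```python
import math
import random
from statistics import mean, pstdev

def sample_means(n, reps, mu, sigma, rng):
    """Draw `reps` samples of size n and return the mean of each sample."""
    return [sum(rng.gauss(mu, sigma) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(3)  # seeded so the run is repeatable
MU, SIGMA = 100, 15     # hypothetical population parameters
results = {n: sample_means(n, 2000, MU, SIGMA, rng) for n in (4, 25, 100)}

for n, means in results.items():
    expected_se = SIGMA / math.sqrt(n)
    print(f"n={n}: mean of means={mean(means):.2f}, "
          f"SD of means={pstdev(means):.2f}, sigma/sqrt(n)={expected_se:.2f}")
```

The mean of the means should stay near 100 for every sample size, while the spread shrinks as n grows.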


500

What is a distribution of means?

What is the central limit theorem?

A distribution composed of many means that are calculated from all possible samples of a given size, all taken from the same population, is called a distribution of means.

The fact that a distribution of sample means is more normally distributed than a distribution of individual scores, even when the original population is not normal, is a principle demonstrated by the central limit theorem.
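
A simulation sketch of the central limit theorem. It assumes a strongly skewed (exponential) population and uses skewness as a rough index of non-normality; the population, sample size of 30, seed, and function names are all illustrative choices.

```python
import random

def skewness(values):
    """Population skewness: third central moment over SD cubed (0 if symmetric)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

def distribution_of_means(n, reps, rng):
    """Means of `reps` samples of size n from a skewed exponential population."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(7)  # seeded so the run is repeatable
individual_scores = distribution_of_means(1, 5000, rng)  # n = 1: raw scores
means_of_30 = distribution_of_means(30, 5000, rng)

print("skew of individual scores:", skewness(individual_scores))
print("skew of means (n = 30):  ", skewness(means_of_30))
```

The means of samples of 30 should show much less skew than the individual scores, even though every score came from the same skewed population.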

