What is the difference between the sample and the population? What about the parameter and a statistic?
The population is the set of all elements of interest in a study; a sample is a subset of the population.
A parameter is a numerical measurement describing some characteristic of a population.
A statistic is a numerical measurement describing some characteristic of a sample.
Does correlation imply causation? What does this mean?
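No. Correlation only shows that two variables tend to move together. It does not show that one causes the other, because a third factor or coincidence could explain the relationship.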
Mean, median, and mode are all measures of _______. Range, standard deviation, and variance are measures of ______. Z-scores are measures of _______________.
Center, Variability, relative location
Define: probability, experiment, sample space
probability: a numerical measure, between 0 and 1, of the likelihood that an event will occur
experiment: a process that generates well-defined outcomes
sample space: the set of all possible outcomes of an experiment
What is a random variable?
A numerical description of the outcome of an experiment
What are descriptive and inferential statistics?
Descriptive statistics: data are summarized and presented in a form that is easy for a reader to understand (graphical, tabular, or numerical)
Statistical inference: data from a sample are used to make estimates and test hypotheses about the larger population, which is useful when collecting data from the whole population is impractical.
What is a frequency distribution? When might we use it?
A tabular summary of the data showing the number of items in each category/class. We use it to condense a large data set so that patterns in it are easy to see.
Find the mean, median, and mode for the following set of data representing the age of students in a college course.
18, 24, 20, 21, 21, 22, 22, 19, 18, 23, 18, 19, 20
Mean=20.38
Median=20
Mode=18
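The three answers above can be checked with a short Python sketch. It uses only the standard library statistics module, and the variable names are illustrative.

```python
import statistics

ages = [18, 24, 20, 21, 21, 22, 22, 19, 18, 23, 18, 19, 20]

mean_age = statistics.mean(ages)      # sum of the values divided by the count
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most frequent value

print(round(mean_age, 2), median_age, mode_age)  # 20.38 20 18
```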
Complement of event A: all possible outcomes that are not in A
What is the difference between a discrete random variable and a continuous random variable?
A continuous random variable can take on ANY numerical value within an interval or collection of intervals (for example, time or weight).
A discrete random variable takes on distinct, separate values, either a finite number of them or an infinite sequence such as 0, 1, 2, ... (for example, a count of items).
What is random sampling and why is it important? Give an example of random sampling being used and random sampling not being used.
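Random sampling draws a sample so that each unit in the population has the same chance of being included. It is important because it guards against selection bias, so conclusions drawn from the sample can be generalized to the population. Used: picking student ID numbers with a random number generator. Not used: surveying only the students sitting in the front row.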
On the board, show what a bar graph, histogram, and stem and leaf display look like.
:)
What is a weighted mean? When might we use the weighted mean?
Multiply each x value by its corresponding weight (w), then divide the sum of those products by the sum of the weights.
Example: GPA, where each course grade is weighted by its credit hours.
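As a rough illustration of the GPA case, here is a minimal Python sketch. The grades and credit hours are made up for the example and do not come from the course.

```python
# Hypothetical transcript: (grade points, credit hours) for each course.
courses = [(4.0, 3), (3.0, 4), (2.0, 1)]

# Weighted mean = sum of (value * weight) divided by the sum of the weights.
weighted_sum = sum(x * w for x, w in courses)
total_weight = sum(w for _, w in courses)
gpa = weighted_sum / total_weight

print(gpa)  # 3.25
```

The plain mean of the three grades would be 3.0; the weighted mean is higher because the A carries more credit hours than the C.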
The _____ of an event is a probability obtained with the additional information that some other event has already occurred.
conditional probability
A measure of central location for a discrete random variable is
the expected value or mean
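The expected value is the sum of each value times its probability. A small Python sketch, assuming as an example the number of heads in two flips of a fair coin (this distribution is chosen for illustration only):

```python
# Hypothetical discrete distribution: number of heads in two fair coin flips.
distribution = {0: 0.25, 1: 0.5, 2: 0.25}

# The probabilities of a valid distribution sum to 1.
total_probability = sum(distribution.values())

# Expected value: sum of x * f(x) over every possible value x.
expected_value = sum(x * p for x, p in distribution.items())

print(expected_value)  # 1.0
```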
A magician develops a coin that is supposed to land on heads every single time it is flipped. In a test run where the magician flips the coin 100 times, it lands on heads 56 times. Does the result have statistical significance? What about practical significance?
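No and no. 56 heads in 100 flips is within the range a fair coin produces by chance, so the result is not statistically significant, and a coin that lands on heads only 56% of the time is of no practical use for the trick.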
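One way to check the answer on statistical significance is to ask how often an ordinary fair coin would give 56 or more heads in 100 flips. The sketch below assumes a one-sided test and the usual 0.05 cutoff, neither of which is stated in the question.

```python
from math import comb

n, heads = 100, 56

# Probability that a fair coin gives 56 or more heads in 100 flips
# (exact binomial tail; each of the 2**n flip sequences is equally likely).
p_value = sum(comb(n, k) for k in range(heads, n + 1)) / 2**n

# z-score of 56 heads under the fair-coin model (mean 50, standard deviation 5).
z = (heads - n * 0.5) / (n * 0.5 * 0.5) ** 0.5

print(round(z, 2), p_value > 0.05)  # 1.2 True
```

A p-value above 0.05 means a result like this is not unusual for a fair coin.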
What is a relative frequency distribution, and what is a cumulative frequency distribution? When might we use cumulative?
Relative: shows the proportion of items belonging to a class
Cumulative: shows the number of data items with values less than or equal to the upper class limit. (ticket prices, number values in ranges)
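As an illustration, the three kinds of distribution can be built for the student age data used earlier in this set. This is a sketch that treats each age as its own class; the variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

ages = [18, 24, 20, 21, 21, 22, 22, 19, 18, 23, 18, 19, 20]

freq = Counter(ages)                        # frequency: count per class
values = sorted(freq)
counts = [freq[v] for v in values]
relative = [c / len(ages) for c in counts]  # proportion of items per class
cumulative = list(accumulate(counts))       # items with value <= this class

for v, c, r, cum in zip(values, counts, relative, cumulative):
    print(v, c, round(r, 3), cum)
```

The last cumulative count always equals the total number of items, and the relative frequencies always sum to 1.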
Give the formulas for variance and standard deviation.
:)
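For reference, a sketch of the standard textbook formulas in Python, applied to the student age data from earlier in this set. The population variance divides the sum of squared deviations by N, the sample variance divides by n - 1, and the standard deviation is the square root of the variance.

```python
import math
import statistics

ages = [18, 24, 20, 21, 21, 22, 22, 19, 18, 23, 18, 19, 20]

n = len(ages)
mean = sum(ages) / n

# Sum of squared deviations from the mean.
squared_deviations = sum((x - mean) ** 2 for x in ages)

population_variance = squared_deviations / n    # divide by N
sample_variance = squared_deviations / (n - 1)  # divide by n - 1
sample_std_dev = math.sqrt(sample_variance)     # square root of the variance

# Cross-check against the standard library implementations.
print(math.isclose(sample_variance, statistics.variance(ages)))
print(math.isclose(population_variance, statistics.pvariance(ages)))
```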
Describe the difference between the union of two events and the intersection of two events
union: belongs to A or B or both
intersection: belongs to A AND B
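Python sets give a quick way to see the difference. The two events below are made-up outcomes of a single die roll, chosen only for illustration.

```python
# Hypothetical events for one roll of a die.
A = {1, 2, 3, 4}  # roll is 4 or less
B = {3, 4, 5}     # roll is 3, 4, or 5

union = A | B         # outcomes in A or B or both
intersection = A & B  # outcomes in both A and B

print(sorted(union))         # [1, 2, 3, 4, 5]
print(sorted(intersection))  # [3, 4]
```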
What is a Bernoulli trial?
A trial that can result in only two possible outcomes (success or failure)
Define each of the 4 scales of measurement
Nominal scale: data consists of labels or names where the order of the labels is not meaningful
Ordinal scale: data consists of labels or names where the order of the labels is meaningful
Interval scale: numeric data where the differences between values are meaningful (no natural zero starting point)
Ratio scale: the data have the properties of the interval scale and the ratio of 2 values is meaningful (there is a natural zero starting point)
Scatter diagram: a graphical representation of the relationship between two quantitative variables.
Trendline: a line that approximates that relationship.
We make a scatter diagram to summarize the association between two numerical variables.
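Define Chebyshev's theorem
A fact that applies to all possible data sets, whatever their distribution. It gives the minimum proportion of the measurements that must lie within a given number of standard deviations of the mean: at least 1 - 1/z^2 of the values lie within z standard deviations, for any z greater than 1 (at least 75% within two, at least 89% within three).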
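The bound in Chebyshev's theorem is 1 - 1/z^2 for z standard deviations (z greater than 1). A minimal sketch that computes the bound and compares it with the student age data from earlier in this set; the function name chebyshev_bound is made up for the example.

```python
import statistics

def chebyshev_bound(z):
    """Minimum proportion of values within z standard deviations (z > 1)."""
    return 1 - 1 / z**2

ages = [18, 24, 20, 21, 21, 22, 22, 19, 18, 23, 18, 19, 20]
mean = statistics.mean(ages)
std_dev = statistics.pstdev(ages)  # population standard deviation

# Actual proportion of ages within 2 standard deviations of the mean.
within_two = sum(abs(x - mean) <= 2 * std_dev for x in ages) / len(ages)

print(chebyshev_bound(2))                # 0.75
print(within_two >= chebyshev_bound(2))  # True
```

The theorem only guarantees a minimum, so the actual proportion in a given data set is often much higher than the bound.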
Explain the counting rule, combinations, and permutations
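counting rule: a way to count the outcomes of a multiple-step experiment without listing them; if the steps have n1, n2, ..., nk possible outcomes, the experiment has n1 x n2 x ... x nk outcomes
combinations: the number of ways to select x items from n distinct items when the order of selection does not matter
permutations: the number of ways to select x items from n distinct items when the order of selection matters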
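The three ideas can be tried out with Python's math module (Python 3.8 or later). The numbers below are made up for illustration: a coin flip followed by a die roll, and choosing 2 items out of 5.

```python
import math

# Counting rule: a coin flip (2 outcomes) followed by a die roll (6 outcomes).
total_outcomes = math.prod([2, 6])

# Combinations: ways to choose 2 of 5 items when order does not matter.
combinations = math.comb(5, 2)

# Permutations: ways to choose 2 of 5 items when order matters.
permutations = math.perm(5, 2)

print(total_outcomes, combinations, permutations)  # 12 10 20
```

There are twice as many permutations as combinations here because each pair of items can be arranged in 2 orders.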
Describe the rule of thumb
The vast majority of values should lie within two standard deviations of the mean