Basic Probability Concepts
Probability Distributions
Statistical Methodology
Statistical Inference
Potpourri
100
It is the science concerned with “the collection, organization, analysis, interpretation, and presentation of data.”
What is Statistics?
100
It is a numerical description of the outcome of an experiment. It is a function that assigns a numerical value to every possible outcome in a sample space.
What is a random variable?
100
TRUE OR FALSE: Descriptive Statistics is the process of drawing conclusions about unknown characteristics of a population from which data were taken.
What is FALSE?
100
The Central Limit Theorem (CLT) is associated with which probability distribution?
What is the standard normal distribution (Z)?
100
___________ is a hypothesis-testing methodology for drawing conclusions about equality of means of multiple populations.
What is an Analysis of Variance or ANOVA?
200
FILL IN THE BLANK: __________ is a process that results in some outcome. ____________ is the likelihood that an outcome occurs.
What are experiment and probability?
200
Which probability distribution will I use to calculate the probability of having 3 buyers or more for a product out of a random sample of 10 people, knowing that 40% of the population are buyers of this product?
What is the binomial distribution?
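A minimal sketch of the calculation behind this clue, assuming the 10 people are sampled independently and each is a buyer with probability 0.4. The helper names `binomial_pmf` and `prob_at_least` are illustrative, not from any library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k_min, n, p):
    """P(X >= k_min), computed as 1 minus the probability of fewer than k_min successes."""
    return 1.0 - sum(binomial_pmf(k, n, p) for k in range(k_min))

# 3 or more buyers out of 10 people when 40% of the population are buyers
p_three_or_more = prob_at_least(3, 10, 0.4)
print(round(p_three_or_more, 4))  # 0.8327
```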
200
TRUE OR FALSE: The mean is the middle value (or 50th percentile) when the data are arranged from smallest to largest.
What is FALSE?
200
A ____________________ is the distribution of a statistic for all possible samples of a fixed size.
What is a sampling distribution?
200
__________ is a measure of the linear relationship between two variables, X and Y, and is measured with a statistic that ranges from −1 to +1.
What is correlation?
300
FILL IN THE BLANK: An __________ is a collection of one or more outcomes from a sample space. Two events are ______________________ if they have no outcomes in common.
What are event and mutually exclusive?
300
Which probability distribution will I use to calculate the probability of having 3 customers or more arriving at the bank within the next 15 minutes, knowing that on average 2 customers arrive at the bank every 15 minutes?
What is the Poisson distribution?
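The bank example can be worked through as follows. This is a sketch, assuming arrivals follow a Poisson process with a mean of 2 per 15-minute window; `poisson_pmf` is an illustrative helper name.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

mean_arrivals = 2.0  # average customers per 15 minutes, from the clue

# P(X >= 3) = 1 - [P(0) + P(1) + P(2)]
p_three_or_more = 1.0 - sum(poisson_pmf(k, mean_arrivals) for k in range(3))
print(round(p_three_or_more, 4))  # 0.3233
```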
300
Sampling is the basis for statistical applications. Name at least three of the most common sampling schemes.
What are a) Simple Random Sampling, b) Stratified Sampling, c) Systematic Sampling, d) Cluster Sampling, e) Judgment Sampling?
300
Hypothesis testing involves drawing inferences about two contrasting propositions (hypotheses) relating to the value of a population parameter. One is assumed to be true in the absence of contradictory data; the other must be true if the first is rejected. Name these two contrasting propositions.
What are 1) the null hypothesis, 2) the alternative hypothesis?
300
A regression analysis that involves a single independent variable is called ___________ . A regression analysis that involves several independent variables is called ____________.
What are 1) simple regression, 2) multiple regression?
400
For calculating probabilities, if two events A and B are not mutually exclusive, then P(A or B) = _________________ .
What is P(A) + P(B) – P(A and B)?
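A small check of the addition rule on a hypothetical example: one roll of a fair six-sided die, with A = "even" and B = "greater than 3". The events and the `prob` helper are assumptions chosen for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))                  # faces of a fair die
A = {x for x in sample_space if x % 2 == 0}      # even: {2, 4, 6}
B = {x for x in sample_space if x > 3}           # greater than 3: {4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# A and B overlap on {4, 6}, so the overlap must be subtracted once
by_rule = prob(A) + prob(B) - prob(A & B)
by_counting = prob(A | B)
print(by_rule, by_counting)  # 2/3 2/3
```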
400
A curve that characterizes outcomes of a continuous random variable is called a __________________, and is described by a mathematical function f(x).
What is a probability density function?
400
__________ refers to the peakedness or flatness of a histogram; ___________ describes the lack of symmetry of data.
What are kurtosis and skewness?
400
A sample of Alzheimer's patients is tested to assess the amount of time spent in stage IV sleep. It has been hypothesized that individuals suffering from Alzheimer's disease may spend less time per night in the deeper stages of sleep. The number of minutes spent in stage IV sleep is recorded for sixty-one patients. The sample produced a mean of 48 minutes (S = 14 minutes) of stage IV sleep over a 24-hour period. Compute a 95 percent confidence interval for these data. What does this information tell you about a particular individual's (an Alzheimer's patient's) stage IV sleep? The standard error of the mean is 1.807392228. t = 2.000. Confidence interval at 95 percent: 44.4 < population mean < 51.6
What is: We are 95 percent sure that the population mean for the number of minutes an Alzheimer's patient will spend in stage IV sleep in a 24-hour period is somewhere between 44.4 minutes and 51.6 minutes. There is a 5 percent chance that the population mean for stage IV sleep in Alzheimer's patients is less than 44.4 minutes or more than 51.6 minutes.
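The interval can be reproduced with a few lines of arithmetic. This is a sketch using only the figures in the clue. Note one assumption: the stated standard error, 1.807392228, matches S / sqrt(n − 1); the more common S / sqrt(n) gives roughly 1.7925 and rounds to the same interval.

```python
from math import sqrt

n = 61            # number of patients
mean = 48.0       # sample mean, minutes of stage IV sleep
s = 14.0          # sample standard deviation, minutes
t_crit = 2.000    # two-sided 95% critical value for 60 degrees of freedom, as given

# Assumption: the clue's standard error corresponds to S / sqrt(n - 1)
se = s / sqrt(n - 1)
margin = t_crit * se

lower, upper = mean - margin, mean + margin
print(round(se, 6))                      # 1.807392
print(round(lower, 1), round(upper, 1))  # 44.4 51.6
```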
400
An alternative approach to comparing a test statistic to a critical value in hypothesis testing is to compare the level of significance (alpha) to this probability.
What is a p-value or observed significance level?
500
Two events A and B are independent if P(A | B) = _________.
What is P(A)?
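A quick illustration of the independence condition, again on a hypothetical fair die: A = "even" and B = "4 or less". These events are assumptions picked so that knowing B occurred does not change the probability of A.

```python
from fractions import Fraction

sample_space = set(range(1, 7))   # faces of a fair die
A = {2, 4, 6}                     # even
B = {1, 2, 3, 4}                  # 4 or less

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b, prob(A))  # 1/2 1/2
```

Because P(A | B) equals P(A), the two events are independent; equivalently, P(A and B) = P(A) × P(B).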
500
The Poisson distribution is related to which continuous probability distribution?
What is the exponential distribution?
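One way to see the link: if events arrive as a Poisson process, the waiting time until the next event is exponentially distributed, so "no events in an interval of length t" and "waiting time exceeds t" have the same probability. The sketch below checks this numerically; the rate and interval length are hypothetical values.

```python
from math import exp, factorial, isclose

rate = 2.0   # hypothetical: average events per unit of time
t = 1.0      # hypothetical: length of the interval

# Poisson: probability of exactly 0 events in an interval of length t
expected_count = rate * t
poisson_zero = exp(-expected_count) * expected_count**0 / factorial(0)

# Exponential: probability the waiting time T exceeds t, i.e. 1 - CDF(t)
exponential_cdf = 1.0 - exp(-rate * t)
exponential_survival = 1.0 - exponential_cdf

print(isclose(poisson_zero, exponential_survival))  # True
```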
500
Name two techniques used in predictive statistics.
What are 1) correlation analysis and 2) regression analysis?
500
Name the steps of a hypothesis test.
What are: 1. Formulate the hypotheses to test. 2. Select a level of significance. 3. Determine a decision rule on which to base a conclusion. 4. Collect data and calculate a test statistic. 5. Apply the decision rule to the test statistic and draw a conclusion.
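The five steps can be sketched as a two-sided one-sample z-test. Everything numeric here is a made-up illustration (hypothesized mean 50, known population standard deviation 10, a sample of 25 with mean 54); a z-test is used rather than a t-test only because the standard library provides the normal distribution.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: formulate the hypotheses. H0: mu = 50, H1: mu != 50 (hypothetical)
mu_0 = 50.0

# Step 2: select a level of significance
alpha = 0.05

# Step 3: decision rule. Reject H0 if |z| exceeds the two-sided critical value
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 4: collect data and calculate the test statistic (hypothetical sample summary)
n, sample_mean, sigma = 25, 54.0, 10.0
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Step 5: apply the decision rule and draw a conclusion
reject_h0 = abs(z) > z_crit

# Equivalent p-value approach: reject H0 when the p-value is below alpha
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(z, round(z_crit, 2), reject_h0, round(p_value, 4))
```

With these numbers the test statistic is 2.0, which exceeds the critical value of about 1.96, so H0 is rejected at the 5 percent level; the p-value of roughly 0.0455 leads to the same conclusion.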
500
f(x) = (e^(-lambda) * lambda^x) / x! Name this probability distribution.
What is the Poisson distribution?