A means of altering the data to achieve the conditions/structure necessary to utilize particular summaries or models.
What is re-expression?
1. Nobody can guess the outcome in advance.
2. Outcomes are equally likely.
What is it about random selection that makes it seem fair?
The entire group of individuals or instances about whom we hope to learn, but examining all of them is usually impractical, if not impossible.
What is the population?
Any systematic failure of a sampling method to represent its population. It is almost impossible to recover from.
What is Bias?
A sample in which each set of n elements in the population has an equal chance of selection.
The standard method of utilizing randomization to make the sample representative of the population of interest.
What is a simple random sample (SRS)?
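A minimal sketch of drawing an SRS in Python, assuming a hypothetical sampling frame of 50 numbered individuals. `random.sample` selects without replacement, so every set of n individuals has the same chance of being chosen. The frame, sample size, and seed are illustrative assumptions, not part of the cards.

```python
import random

# Hypothetical sampling frame: individuals numbered 1..50 (illustrative only).
frame = list(range(1, 51))
n = 5

rng = random.Random(1)       # seeded only so the sketch is repeatable
srs = rng.sample(frame, n)   # without replacement; every set of n is equally likely
print(srs)
```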
2. Make the scatter in a scatterplot more consistent (not fan shaped).
3. Make the distribution of a variable (histogram) more symmetric.
4. Make the spread across different groups (boxplots) more similar.
What are several reasons to consider a re-expression?
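As a rough illustration of reason 3, the sketch below takes logs of a made-up right-skewed data set and compares a moment-based skewness measure before and after. The data values and the `skewness` helper are assumptions chosen for the example.

```python
import math
import statistics

def skewness(values):
    """Moment-based skewness: 0 for a perfectly symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical right-skewed data (each value doubles the previous one).
data = [1, 2, 4, 8, 16, 32, 64, 128, 256]
logged = [math.log10(v) for v in data]

print(skewness(data))    # clearly positive: long right tail
print(skewness(logged))  # essentially zero: symmetric after taking logs
```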
We know what outcomes could happen, but not which particular values will happen. Outcomes that we cannot predict but that nonetheless have a regular distribution in very many repetitions.
What is a random event/phenomenon?
A (representative) subset of a population, examined in hope of learning about the population is a _______(1).
A study that asks questions of a _(1)_ drawn from some population in the hope of learning something about the entire population (Polls) is a ___________.
What is a sample?
What is a sample survey?
______________ is often the best use of time and resources when sampling or surveying.
What is reducing biases?
The natural tendency of randomly drawn samples to differ from each other.
What is sampling variability?
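Sampling variability can be seen by drawing several samples from one population and comparing their means. This is a sketch under assumed values: the population, sample size, and seed are all hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # seeded only so the sketch is repeatable

# Hypothetical population of 1,000 values (illustrative only).
population = [rng.gauss(50, 10) for _ in range(1000)]
mu = statistics.fmean(population)  # fixed for this population

# Five independent simple random samples of size 25, one mean per sample.
sample_means = [statistics.fmean(rng.sample(population, 25)) for _ in range(5)]

print(round(mu, 2))
print([round(m, 2) for m in sample_means])  # the means differ from sample to sample
```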
Orders the effects that re-expressions have on the data. A good starting point is ______. If all else fails try ____________.
What is the Ladder of Powers good for?
What is taking logs?
try whacking the data with two logs (log x and log y).
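The cards do not list the rungs of the ladder, so the sketch below assumes the convention common in introductory texts: powers 2, 1, 1/2, 0 (meaning log), -1/2, and -1, with a minus sign on negative powers to preserve the order of the data. The `re_express` helper and the data are hypothetical.

```python
import math

def re_express(y, power):
    """Apply one rung of the ladder to a positive value y.

    Assumed convention (not stated in the cards): power 0 means log,
    and negative powers get a minus sign so the order of the data is kept.
    """
    if y <= 0:
        raise ValueError("this sketch only handles positive values")
    if power == 0:
        return math.log10(y)
    value = y ** power
    return value if power > 0 else -value

# Hypothetical data, re-expressed on each rung from 2 down to -1.
data = [1, 10, 100, 1000]
for power in (2, 1, 0.5, 0, -0.5, -1):
    print(power, [round(re_express(y, power), 3) for y in data])
```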
A sequence of random outcomes that model a situation, often difficult to collect data on and with a mathematical answer hard to calculate.
Models random events by using random numbers to specify event outcomes with relative frequencies that correspond to the true real-world relative frequencies we are trying to model.
An artificial representation of a random process used to study its long-term properties.
What is a simulation?
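A small sketch of the idea: random digits stand in for event outcomes, with digits assigned so that the simulated relative frequency matches the one being modeled. The 70% success rate and the digit assignment are assumptions made up for the example.

```python
import random

rng = random.Random(7)  # seeded only so the sketch is repeatable

# Assumed real-world relative frequency: 70% success (hypothetical),
# modeled by letting digits 0-6 out of 0-9 count as a success.
SUCCESS_DIGITS = range(7)

def one_outcome():
    """Draw one random digit and report whether it represents a success."""
    return rng.randrange(10) in SUCCESS_DIGITS

runs = 10_000
successes = sum(one_outcome() for _ in range(runs))
print(successes / runs)  # long-run relative frequency, near 0.7
```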
This is any summary calculated from the (sampled) data, while that is a key number in a mathematical model used to represent reality.
This is a statistic. They are written in Latin letters.
That is a parameter. They are written in Greek letters.
The best defense against bias is ______ (stirring to make sure that on average the sample looks like the rest of the population).
What is Randomization?
The precision of the statistics of a sample depends on _______ not ___________.
What is the sample size (soup spoon)?
What is its fraction of the larger population?
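One way to check this claim is to simulate it. The sketch below, under assumed population sizes and a made-up trait held by exactly half the population, compares the spread of sample proportions for the same sample size drawn from a small and a large population, then for a larger sample.

```python
import random
import statistics

rng = random.Random(3)  # seeded only so the sketch is repeatable

def spread_of_sample_proportions(pop_size, n, reps=2000):
    """SD of the sample proportion across many SRSs of size n.

    Hypothetical population: individuals 0..pop_size-1, and exactly the
    even-numbered half have the trait, so the true proportion is 0.5.
    """
    props = []
    for _ in range(reps):
        sample = rng.sample(range(pop_size), n)
        props.append(sum(i % 2 == 0 for i in sample) / n)
    return statistics.pstdev(props)

small_pop = spread_of_sample_proportions(10_000, n=100)     # 1% of population
large_pop = spread_of_sample_proportions(1_000_000, n=100)  # 0.01% of population
bigger_n = spread_of_sample_proportions(1_000_000, n=400)   # still only 0.04%

print(round(small_pop, 3), round(large_pop, 3), round(bigger_n, 3))
```

The first two spreads should come out about the same even though the sampled fractions differ a hundredfold, while the third is noticeably smaller because n is larger.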
1. Can't straighten scatterplots that turn around.
2. Can't re-express negative data values with a square root (add a constant to shift values above 0).
3. Minimal effect on data values far from 1-100 (subtract a constant to shift).
4. Can't unify multiple modes.
What are limitations of re-expression?
The most basic situation in a simulation in which something happens at random [random happening] is a _______.
What is a component?
A numerically valued attribute of a model for a population, often unknowable and estimated from sampled data is a _______(1).
Statistics computed from a ______ sample accurately reflect the corresponding _(1)_.
What is a population parameter?
This type of bias occurs when individuals can choose on their own whether to participate in the sample. Always yields invalid samples.
This type of bias occurs when the sample is comprised of individuals readily available. Always yields a non-representative sample.
What is Voluntary response bias?
What is Convenience bias?
A list of individuals, which clearly defines but may not be representative of the entire population, from which the sample is drawn.
as compared to ...
A sample that consists of the entire population.
What is a sampling frame?
What is a census?
When discussing the accuracy or confidence of the linear regression model be sure to comment on both the appropriateness of _____________ and success of _____________.
What is the appropriateness of the model as indicated by the residual plot?
What is the success of the model as indicated by R²?
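Both checks can be computed directly from a fitted line. The sketch below uses made-up data and the usual least-squares formulas: the residuals are what a residual plot would display, and R² is one minus the ratio of residual variation to total variation.

```python
import statistics

# Hypothetical data (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed - predicted; plot these against x to judge appropriateness.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# R^2 = 1 - (residual variation / total variation) measures success.
sse = sum(r ** 2 for r in residuals)
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst

print(round(slope, 2), round(intercept, 2), round(r_squared, 4))
```

A high R² alone is not enough: a residual plot with a bend or a fan shape would still signal that a linear model is inappropriate.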
An individual result of a component [result of random happening] is an ______ and the sequence of several components representing events that we are pretending will take place is a _____(2).
The result of each _(2)_ with respect to what we were interested in is the ____________.
(2) What is a trial?
What is the response variable?
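The vocabulary above maps onto code fairly directly. In this sketch the scenario is hypothetical: three equally likely prize cards in cereal boxes, with the goal of finding how many boxes it takes to collect all three. Comments mark the component, outcome, trial, and response variable.

```python
import random

rng = random.Random(11)  # seeded only so the sketch is repeatable

CARDS = ("A", "B", "C")  # hypothetical prizes, assumed equally likely

def run_trial():
    """One trial: open boxes until all three cards have been collected."""
    seen = set()
    boxes = 0
    while len(seen) < len(CARDS):
        outcome = rng.choice(CARDS)  # component: opening one box; outcome: its card
        seen.add(outcome)
        boxes += 1
    return boxes  # response variable: number of boxes opened in this trial

responses = [run_trial() for _ in range(5000)]
print(sum(responses) / len(responses))  # average boxes needed across trials
```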
This corresponds to, and thus estimates, a population parameter.
What is a sample statistic?
This type of bias is when individuals from a subgroup of the population are selected less often than they should be.
This type of bias is when a large fraction of those sampled will not or cannot respond.
This type of bias is when respondents' answers might be affected by survey design, such as question wording or interviewer behavior.
What is Undercoverage bias?
What is Nonresponse bias?
What is Response bias?
C_________: These samples randomly select among heterogeneous subgroups that each resemble the population at large, making our sampling tasks more manageable.
Sy________: These samples can work, when there is no relationship between the order of the sampling frame and the variables of interest, and are often the least expensive method of sampling. But we still want to start them randomly.
What are Stratified samples?
What are Cluster samples?
What are Systematic samples?
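A minimal sketch of a systematic sample with a random start, assuming a hypothetical sampling frame of 100 numbered individuals and an interval of 10. The frame, interval, and seed are illustrative only.

```python
import random

rng = random.Random(5)  # seeded only so the sketch is repeatable

# Hypothetical sampling frame of 100 numbered individuals (illustrative only).
frame = list(range(1, 101))
k = 10  # take every 10th individual

start = rng.randrange(k)      # random starting position among the first k
systematic = frame[start::k]  # then every k-th individual after it
print(systematic)
```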