Ch 1: Stats Starts Here
Ch 2: Displaying & Describing Categorical Data
Ch 3: Displaying & Summarizing Quantitative Data
Ch 4: Understanding & Comparing Distributions
Ch 5: The Standard Deviation as a Ruler & the Normal Model
100

Information about an individual in a database.

What is a Record.

100

A _______ ___ ______ displays the conditional distribution of a categorical variable within each category of another variable.  

What is a Segmented Bar Chart.

100

Regions of your data that have no values.

What is a Gap.

100

A _______ displays data that change over time to show long-term patterns and trends.

DAILY DOUBLE!!!!!

What is a Timeplot.

100

A value found by subtracting the mean and dividing by the standard deviation. 

What is a Standardized Value.
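
A minimal Python sketch of this answer, for anyone who wants to check a standardized value by hand. The data and the helper name `standardize` are made up for illustration and are not from the text.

```python
import statistics

def standardize(values):
    """Return z-scores: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - mean) / sd for v in values]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5
z_scores = standardize(scores)
```

A value equal to the mean gets a z-score of 0; values above the mean are positive and values below are negative.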

200

Someone who answers, or responds to, a survey.

What is a Respondent.

200

In a statistical display, each data value should be represented by the same amount of area.

What is the Area Principle.

200

A hump or local high point in the shape of the distribution of a variable. The apparent location of ______ can change as the scale of a histogram is changed.

What is a Mode.

200

The place in the distribution of a variable that you’d point to if you wanted to attempt the impossible by summarizing the entire distribution with a single number. Measures of ______ include the mean and median.

What is the Center.

200

In a scatterplot, you must choose a role for each variable. Assign to the y-axis the ______ variable that you hope to predict or explain. Assign to the x-axis the ________ or predictor variable that accounts for, explains, predicts, or is otherwise responsible for the y-variable.

What are the Response Variable (y-variable) and the Explanatory Variable (x-variable).

300

A ________ holds information about the same characteristic for many cases. (What).

What is a Variable.

300

The methods in this chapter are appropriate for displaying and describing categorical data. Be careful not to use them with quantitative data.


What is the Categorical Data Condition.

300

The ___ __ _ _______ are the parts that typically trail off on either side. Distributions can be characterized as having long tails (if they straggle off for some distance) or short tails (if they don’t).

DAILY DOUBLE!!!!!




What are the Tails of a Distribution.

300

________ are extreme values that don’t appear to belong with the rest of the data. They may be unusual values that deserve further investigation, or they may be just mistakes; there’s no obvious way to tell. Don’t delete ______ automatically—you have to think about them. _______ can affect many statistical analyses, so you should always be alert for them.

What are Outliers.

300

A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two.

What is a Lurking Variable.

400

The cases we actually examine in seeking to understand the much larger population.

What is a Sample.

400

The _________ of a variable gives

■ the possible values of the variable and

■ the relative frequency of each value.

What is the Distribution of a Variable.

400

A numerical summary of how tightly the values are clustered around the center. Measures of ______ include the IQR and standard deviation. 

What is Spread.
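
A short Python sketch of the two measures of spread named in the clue, assuming a small made-up data set. Note that `statistics.quantiles` uses its own default quartile rule, which may differ slightly from the method taught in class.

```python
import statistics

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # made-up data

# Quartiles via the standard library's default ("exclusive") method
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1                # interquartile range: spread of the middle half
sd = statistics.stdev(data)  # sample standard deviation
```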

400

Applying a simple function (such as a logarithm or square root) to the data can make a skewed distribution more symmetric or equalize spread across groups.

What is Re-expressing or Transforming.
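
A small Python illustration of re-expression, using made-up, strongly right-skewed values. Comparing the mean and median before and after taking logarithms is only a rough check for symmetry, not a formal test.

```python
import math
import statistics

raw = [1, 10, 100, 1000, 10000]  # made-up, right-skewed data

# Re-express with a base-10 logarithm
logged = [math.log10(v) for v in raw]

# Gap between mean and median before and after re-expression
gap_raw = statistics.mean(raw) - statistics.median(raw)
gap_logged = statistics.mean(logged) - statistics.median(logged)
```

For these values the mean sits far above the median in the raw data, while the two agree after the log re-expression.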

400

■ Direction: A positive direction or association means that, in general, as one variable increases, so does the other. When increases in one variable generally correspond to decreases in the other, the association is negative.

■ Form: The form we care about most is straight, but you should certainly describe other patterns you see in scatterplots.

■ Strength: A scatterplot is said to show a strong association if there is little scatter around the underlying relationship.

What is Association.

500

The ________ ideally tells Who was measured, What was measured, How the data were collected, Where the data were collected, and When and Why the study was performed. 

What is the Context.


500

When averages are taken across different groups, they can appear to contradict the overall averages. This is known as _______ ________.

What is Simpson's Paradox.
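
A minimal Python sketch of how this paradox can arise, assuming two hypothetical performers `a` and `b` with made-up success counts in two conditions. All names and numbers are illustrative only.

```python
# (successes, attempts) per condition; all numbers are made up
a = {"day": (90, 100), "night": (10, 20)}
b = {"day": (19, 20), "night": (75, 100)}

def rate(successes, attempts):
    """Success rate within one condition."""
    return successes / attempts

def overall(groups):
    """Success rate pooled over all conditions."""
    total_success = sum(s for s, _ in groups.values())
    total_attempts = sum(n for _, n in groups.values())
    return total_success / total_attempts

# b has the higher rate within every condition...
b_wins_each_group = all(rate(*b[g]) > rate(*a[g]) for g in a)
# ...yet a has the higher pooled rate
a_wins_overall = overall(a) > overall(b)
```

The reversal happens because the two performers have very different numbers of attempts in the easy and hard conditions.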

500

A distribution is _______ if it’s not symmetric and one tail stretches out farther than the other. Distributions are said to be skewed left when the longer tail stretches to the left, and skewed right when it goes to the right.

What is Skewed.

500

When comparing groups with _______:

■ Compare the shapes. Do the boxes look symmetric or skewed? Are there differences between groups?

■ Compare the medians. Which group has the higher center? Is there any pattern to the medians?

■ Compare the IQRs. Which group is more spread out? Is there any pattern to how the IQRs change?

■ Using the IQRs as a background measure of variation, do the medians seem to be different, or do they just vary much as you’d expect from the overall variation?

■ Check for possible outliers. Identify them if you can and discuss why they might be unusual. Of course, correct them if you find that they are errors.


What are Boxplots.

500

The _________ __________ is a numerical measure of the direction and strength of a linear association.

What is the Correlation Coefficient.

Write it out on the board. Explain each variable.
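
One common way to write it is r = sum(z_x * z_y) / (n - 1), where z_x and z_y are the standardized values of each variable and n is the number of cases. The Python sketch below follows that form, assuming sample standard deviations. The data and the function name `correlation` are made up for illustration.

```python
import statistics

def correlation(xs, ys):
    """r = sum(z_x * z_y) / (n - 1), using sample standard deviations."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    z_x = [(x - mean_x) / sd_x for x in xs]  # standardized x values
    z_y = [(y - mean_y) / sd_y for y in ys]  # standardized y values
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)

xs = [1, 2, 3, 4, 5]  # made-up explanatory values
ys = [2, 4, 5, 4, 5]  # made-up response values
r = correlation(xs, ys)
```

The sign of r gives the direction of the association, and values closer to -1 or 1 indicate a stronger linear association.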