Why doesn’t the median make sense with nominal scales?
There is no order with nominal scales.
When do we lose a degree of freedom?
Whenever we use sample data to estimate a population parameter.
Two cards are dealt. What is the probability that the first is an ace AND the second is a king?
Key word: AND. Indicating we are using the multiplication rule formula.
Event A: 1st Ace (4/52)
Event B: 2nd King (4/51)
P(A and B) = P(A) * P(B|A) = 4/52 * 4/51 = 16/2652 ≈ 0.006
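As a quick check on the multiplication rule, the two fractions can be multiplied exactly. This is only a sketch; the variable names are made up for illustration.

```python
from fractions import Fraction

# P(A): the first card is an ace (4 aces among 52 cards)
p_first_ace = Fraction(4, 52)
# P(B|A): the second card is a king once an ace is gone (4 kings among 51 cards)
p_king_given_ace = Fraction(4, 51)

# Multiplication rule: P(A and B) = P(A) * P(B|A)
p_ace_then_king = p_first_ace * p_king_given_ace
print(p_ace_then_king, float(p_ace_then_king))
```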
The mean of a test is 80 and the standard deviation is 25. It is normally distributed. Max got a z-score of -0.60. (That’s negative 0.6) What is his test score?
X=Z*SD+Mean
X=-0.6*25+80 = 65
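The conversion from a z-score back to a raw score can be wrapped in a small helper. The function name `raw_score` is an assumption for illustration, not part of the course material.

```python
def raw_score(z, mean, sd):
    """Convert a z-score back to a raw score using X = Z*SD + Mean."""
    return z * sd + mean

# Max: z = -0.60 on a test with mean 80 and SD 25
max_score = raw_score(-0.60, mean=80, sd=25)
print(max_score)
```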
True or False: The Poisson Distribution is better to use if p=0.5 than if p=0.000005. Explain.
FALSE! The Poisson distribution is better used when P is very small (indicating it is super rare!)
The mean is often more stable than the median, but not always. When is the median more stable?
When data is strongly skewed or when excessive outliers exist!
If I took the variance of a list of numbers and multiplied it by how many numbers there are, what would I get?
Sums of Squares!
SS=Var*N
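A short sketch of this relationship, assuming the variance is the population variance (the sum of squares divided by N). The list of scores is made up for illustration.

```python
from math import isclose
from statistics import pvariance

scores = [1, 2, 2, 3, 7]  # illustrative list, not part of this question
n = len(scores)
mean = sum(scores) / n

# Sums of squares computed directly from the deviations
ss_direct = sum((x - mean) ** 2 for x in scores)

# Sums of squares recovered as (population variance) * N
ss_from_variance = pvariance(scores) * n

print(ss_direct, ss_from_variance, isclose(ss_direct, ss_from_variance))
```

If the variance was computed with N-1 in the denominator instead, multiplying by N-1 (not N) gives back the sums of squares.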
The probability that a new airport will get an award for its design is 0.16, the probability that it will get an award for the efficient use of materials is 0.24, and the probability that it will get both awards is 0.07.
What is the probability that the same airport will get the design award GIVEN that it got the award for the efficient use of materials?
Key word: GIVEN! Indicating we are using our conditional probability equation.
P(A|B)=P(A and B)/P(B)
P(A|B)=0.07/0.24
P(A|B)=0.2917
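The division can be checked in a couple of lines. This is a sketch with illustrative variable names.

```python
p_design = 0.16      # P(A): design award (not needed for this question)
p_materials = 0.24   # P(B): materials award
p_both = 0.07        # P(A and B)

# Conditional probability: P(A|B) = P(A and B) / P(B)
p_design_given_materials = p_both / p_materials
print(round(p_design_given_materials, 4))
```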
It is given that the mean SAT score is 500 with a standard deviation of 100. Amy scored in the 90th percentile rank. What is her score?
Mean=500, SD=100
Z-score for 90th percentile: 1.29 to 1.34 (all acceptable)
X=Z*SD+Mean
X=1.30*100+500 = 630
629~634 ALL ACCEPTABLE!
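The z-table lookup can be imitated in code. This sketch assumes the convention used above: any two-decimal z-score whose cumulative area begins with 0.90 counts as acceptable. It uses `NormalDist` from Python's standard library.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Two-decimal z-scores whose cumulative area starts with 0.90,
# i.e. what we would find when scanning a z-table for 0.90__
z_candidates = [z / 100 for z in range(0, 301)
                if 0.90 <= std_normal.cdf(z / 100) < 0.91]

mean, sd = 500, 100
scores = [z * sd + mean for z in z_candidates]  # X = Z*SD + Mean

print(z_candidates[0], z_candidates[-1])
print(round(scores[0]), round(scores[-1]))
```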
In a certain town, the average person weighs 150 pounds with a standard deviation of 35. If you take a sample of 49 people, what is the probability that you get a sample average of 146 or more?
Mean=150
SD=35
N=49
X=146
SE=SD/Sqrt(N) --> 35/sqrt(49)=5
Z=(X-Mean)/SE --> (146-150)/5 = -0.8 (go to z-table)
146 or more = 0.7881 or 78.81%
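The same steps (standard error, z-score, area above) can be sketched with the standard library instead of a z-table. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, n = 150, 35, 49

se = sd / sqrt(n)        # standard error of the mean
z = (146 - mean) / se    # z-score for a sample average of 146

# "146 or more" is the area ABOVE z, so subtract the area below from 1
p_at_least_146 = 1 - NormalDist().cdf(z)
print(se, z, round(p_at_least_146, 4))
```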
Why is it important to know how many variables an experiment has?
The number of variables affects how the data is to be analyzed
Clara asked twenty people what is the best position they have ever finished a marathon in. Please calculate the most complex yet appropriate measure of central tendency for the following data: [0, 0, 1,1, 2, 2, 2, 3, 3, 5, 5, 5, 6, 7, 7, 7, 7,7 1, 13, 16, 18, 100]
[0, 0, 1,1, 2, 2, 2, 3, 3, 5, 5, 5, 6, 7, 7, 7, 7,7 1, 13, 16, 18, 100]
Mean: 10
Median: 5
Mode: 7
To determine the most appropriate measure of central tendency, we MUST first determine what the scale of measurement is for the variable "position finished in marathon." Because races are often ranked, we can say the variable is Ordinal!
Scale of measurement of variable = Ordinal!
Because the data is ordinal, the mean is not appropriate. WHY? Because the mean is only meaningful when the data is interval/ratio in scale.
In this case, we must choose between the Median and the Mode. The mode is the measure for nominal data, where there is NO ORDER. Our data does have an order, and the mode ignores it, so the mode is not the most complex measure we can use here.
Thus, the Median is best when data is ordinal in scale, but also when data is strongly skewed, or there are excessive outliers... (as we see with the value 100!).
Thus the most appropriate measure of central tendency is the median (5)!
A deck of cards is well shuffled. What is the probability that the top card OR the bottom card is a heart?
Key word: OR. Indicating we are using our addition rule.
Event A: Top Heart (13/52)
Event B: Bottom Heart (13/52)
P(A or B)=P(A) + P(B) - P(A and B)
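P(A or B)= 13/52 + 13/52 - (13/52 * 12/51) = 0.25 + 0.25 - 0.0588 = 0.4412 (once the top card is a heart, only 12 hearts remain among the other 51 cards)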
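The addition rule can be checked with exact fractions. Note that the overlap term uses 12/51, because once the top card is a heart only 12 hearts remain among the other 51 cards. Variable names are illustrative.

```python
from fractions import Fraction

p_top_heart = Fraction(13, 52)
p_bottom_heart = Fraction(13, 52)

# P(A and B): top is a heart AND bottom is a heart
p_both_hearts = Fraction(13, 52) * Fraction(12, 51)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_top_or_bottom = p_top_heart + p_bottom_heart - p_both_hearts
print(p_top_or_bottom, round(float(p_top_or_bottom), 4))
```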
The average height of students in a school is 160cm. with a standard deviation of 10cm. If Zeli’s height is 145cm., what percentile rank does she fall into at her school?
mean=160, SD=10, X=145
Z=(X-Mean)/SD --> Z=(145-160)/10 = -1.5
Look for -1.5 on the z-table to determine the percentile Zeli is in. When we look we see that Z=-1.5 corresponds with 0.0668 or the 6th percentile!
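Instead of the z-table, the area below Zeli's height can be read from a normal distribution object. This is a sketch; `school_heights` is an illustrative name.

```python
from statistics import NormalDist

school_heights = NormalDist(mu=160, sigma=10)

# Percentile rank = proportion of students shorter than 145 cm
proportion_below = school_heights.cdf(145)
print(round(proportion_below, 4))
```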
There is a pipe-making machine. On any given day, it averages 5 errors for every 10,000 feet of pipe. On Tuesday, they are going to make 4,000 feet of pipe. What is the probability that they will have exactly five errors?
In one day, there are typically 5 errors for every 10,000 feet of pipe. However, our interest is only in the 4,000 feet of pipe to be made on Tuesday.
To obtain the mean, we need to know how often one error is made. We can determine this by dividing 10,000/5=2,000. This tells us that there is one error for every 2,000 feet of pipe. Thus, for the 4,000 feet of pipe to be made on Tuesday, we can expect TWO errors.
Mean=2
X=5
Using the Poisson Distribution Formula:
P(X=5) = e^-2 * (2^5/5!) = 0.1353 * 32/120 = 0.036 or 3.6%
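The Poisson calculation can be sketched as a small function. `poisson_pmf` is a helper name chosen for illustration; it simply writes out the formula e^-mean * mean^k / k!.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected count is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)

# Scale the error rate to Tuesday's 4,000 feet of pipe
errors_per_foot = 5 / 10_000
mean_errors = errors_per_foot * 4_000

p_exactly_five = poisson_pmf(5, mean_errors)
print(mean_errors, round(p_exactly_five, 4))
```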
A researcher was interested in what variables affected the time that it took people in a marathon to run up the Pikes Peak Highway (a nineteen-mile-long road that goes up to 14,115 feet). A total of 1,136 people ran in the marathon. Our researcher took a sample of 115 of these people. She recorded how long it took them to reach the summit; she asked them the elevation of the city that they live in; she recorded the country they were from; and she also asked them if they exercise a lot, a moderate amount, some, or never. The average elevation the 115 lived at was 3,105 feet, although the average for the 1,136 people is 3,569 feet.
Please list the variables. State how many categories each variable has (if more than six, write several) and state the scale of measurement for each variable. Additionally, which value (number) is our parameter?
Variables:
1. Time to complete marathon (several categories). Ratio
2. Elevation of the city of residence (several categories). Ratio
3. Country of residence (several categories). Nominal
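4. Frequency of exercise (4 categories). Ordinal
Parameter: 3,569 feet (the average elevation for all 1,136 runners; the sample average of 3,105 feet is a statistic).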
Explain why the standard deviation, besides being a measure of width, can also be thought of as a measure of error.
When guessing or estimating the mean, the standard deviation tells us the average size of error we will make.
In a deck of 52 playing cards, there are 12 face cards, (4 kings, 4 queens, and 4 jacks). If you draw three cards from the deck (without replacement), what is the probability that all three cards are face cards?
12/52 * 11/51 * 10/50 = 1320/132600 ≈ 0.00995 (about 1%)
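As a cross-check, the same answer can be reached by counting combinations: the number of ways to pick 3 of the 12 face cards, divided by the number of ways to pick any 3 of the 52 cards. This is a sketch of an alternative route, not the method the card uses.

```python
from fractions import Fraction
from math import comb

# Sequential multiplication rule (drawing without replacement)
p_sequential = Fraction(12, 52) * Fraction(11, 51) * Fraction(10, 50)

# Counting approach: favourable 3-card hands / all 3-card hands
p_counting = Fraction(comb(12, 3), comb(52, 3))

print(p_sequential, p_counting, round(float(p_sequential), 5))
```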
The mean of a test is 600 and the standard deviation is 200. It is normally distributed. Anthony is in the 55th percentile rank. What is a possible score for him?
mean=600, SD=200
When you see something along the lines of "55th percentile rank", look for a probability that begins with 0.55__ in the large-probabilities section of the z-table. In this case the z-scores that correspond with the 55th percentile are Z=+0.13, +0.14, and +0.15 (any of these are acceptable!)
X=Z*SD+Mean
X=+0.13*200+600 = 626
There is a test. The scores are normally distributed with a mean of 90 with a standard deviation of 40. A sample of 25 people is taken. What is the probability that the sample mean will be between 84 and 100? Assume a normal distribution.
Mean=90, SD= 40, N=25
SE=SD/sqrt(N) --> 40/sqrt(25)=8
Z=(84-90)/8 = -0.75 (0.2266)
Z=(100-90)/8 = +1.25 (0.8944)
Between 84 and 100: 0.8944-0.2266 = 0.6678 or 66.78%
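The "between" calculation is the area below the upper value minus the area below the lower value. A sketch using the sampling distribution of the mean (centered at 90 with the standard error as its spread); the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, n = 90, 40, 25
se = sd / sqrt(n)  # standard error of the mean

# Distribution of sample means for samples of 25 people
sample_means = NormalDist(mu=mean, sigma=se)

p_between = sample_means.cdf(100) - sample_means.cdf(84)
print(se, round(p_between, 4))
```

The result can differ from the z-table answer in the fourth decimal place because the table values are rounded.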
Subjects were randomly assigned to one of the three conditions: (a) playing a highly aggressive video game; (b) playing a mildly aggressive video game; (c) playing a non-aggressive video game. Afterward, all the subjects completed a questionnaire that included a measure of how hostile the subject felt at the moment.
Please list the variables. State how many categories each variable has (if more than six, write several) and state the scale of measurement for each variable
Variables:
1. Aggressiveness of the game (3 categories) - Ordinal
2. Hostility (several) -- Ordinal
You have a sample of five scores. Use this data to estimate the standard deviation of the population. The five scores are: 1, 2, 2, 3, 7
X: 1, 2, 2, 3, 7
x-mean (mean=3): -2, -1, -1, 0, +4
(x-mean)^2 = 4, 1, 1, 0, 16
SS=22
SD=sqrt(SS/(N-1))
SD=sqrt(22/(5-1))
SD=sqrt(5.5)
SD=2.345
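The hand calculation can be checked against the standard library's `stdev`, which divides by N-1 (the estimate of the population SD from a sample). The step-by-step version is a sketch with illustrative names.

```python
from math import sqrt
from statistics import stdev

scores = [1, 2, 2, 3, 7]
n = len(scores)
mean = sum(scores) / n

# Sums of squares: squared deviations from the mean, added up
ss = sum((x - mean) ** 2 for x in scores)

# Divide by N-1 because we are estimating the population SD from a sample
sd_by_hand = sqrt(ss / (n - 1))

print(ss, round(sd_by_hand, 3), round(stdev(scores), 3))
```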
Three cards are dealt. What is the probability that the third card is a heart GIVEN that the second is a jack?
Key word: GIVEN! Indicating we are using our conditional probability equation.
Event A: 2nd Jack | Event B: 3rd Heart
P(3rd Heart | 2nd Jack)
4 jacks in a deck (hearts, diamond, clubs, spades)
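We are told nothing about the first card, so the third card is just a random card from the 51 that are not the second card. The jack in the second position is the jack of hearts 1/4 of the time (12 hearts left among the other 51 cards) and a different jack 3/4 of the time (13 hearts left).
P(3rd Heart | 2nd Jack) = 1/4 * 12/51 + 3/4 * 13/51 = 51/204 = 1/4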
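This answer can be verified by brute force: list every ordered way to deal three cards, keep the deals where the second card is a jack, and count how many of those have a heart third. This is a sketch; the deck representation is an assumption for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

second_is_jack = 0
jack_then_heart = 0
# Every ordered deal of three distinct cards (52 * 51 * 50 of them)
for first, second, third in permutations(deck, 3):
    if second[0] == "J":
        second_is_jack += 1
        if third[1] == "hearts":
            jack_then_heart += 1

# P(3rd Heart | 2nd Jack) = P(both) / P(2nd Jack), using counts
p_heart_given_jack = Fraction(jack_then_heart, second_is_jack)
print(p_heart_given_jack)
```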
A die is rolled 180 times. What is the probability of getting forty or more “sixes”?
Probability of rolling a six =1/6
N=180
X=40
Hm, where do we start? We need to find our expected value (AKA mean) and our Standard Error for a probability.
EV=N*P -->
EV=180*1/6 = 30
SE = Sqrt(N*P(1-P))
SE= sqrt(180*1/6*5/6)
SE=5
Z=(X-Mean)/SE --> (40-30)/5 =2
The area above z=+2 is 0.0228 or 2.28%
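The expected value, standard error, and area above z can be sketched as follows. Like the steps above, this uses the normal approximation without a continuity correction, so it is an approximation rather than the exact binomial probability. Names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 180, 1 / 6   # 180 rolls, probability of a six on each roll
x = 40              # we want forty or more sixes

expected = n * p                 # EV = N*P
se = sqrt(n * p * (1 - p))       # SE = sqrt(N*P*(1-P))
z = (x - expected) / se

# "Forty or more" is the area ABOVE z
p_forty_or_more = 1 - NormalDist().cdf(z)
print(round(expected, 2), round(se, 2), round(z, 2), round(p_forty_or_more, 4))
```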
A sample distribution is a distribution measuring every single individual in the sample.
A sampling distribution is a distribution of sample means.