Previous Midterms
Sampling Distribution
Confidence Intervals and Approximation
Hypothesis Testing
True-False
100

Let Y be the number of books read per month by an individual. Treat Y as a continuous random variable. Assume Y ~ N(6, 2). 

What is P(Y < 8)?

a. Approximately 2.5%

b. Approximately 16%

c. Approximately 84%

d. Approximately 97.5%



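c) Approximately 84%

Taking the standard deviation of Y to be 2, the value 8 is one standard deviation above the mean of 6. About 34% of the probability lies between the mean and one standard deviation above it, so P(Y < 8) ≈ 50% + 34% = 84%.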
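As a quick numerical check of the 84% figure, here is a minimal sketch using Python's standard library. It assumes the 2 in N(6, 2) is the standard deviation, which is what the answer key implies; the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: Y has mean 6 and standard deviation 2.
books = NormalDist(mu=6, sigma=2)

# P(Y < 8) is the normal CDF evaluated at 8, one SD above the mean.
p_below_8 = books.cdf(8)
print(round(p_below_8, 4))  # about 0.84
```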



100

ECO 329 students have the option to come to this SI session or not. 100 students in Ji’s class were informed of the session, with a p chance of them coming to the session. 150 students in Abrevaya’s class were informed of the session too, and there was the same p chance of coming to the session. Assume the two class sections are independent of one another.


Which of the following distributions provides an approximation to the random variable describing the total number (out of 250) students that came to the SI Session? 


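  1. N(100p + 150p, 100p(1-p) + 150p(1-p)) = N(250p, 250p(1-p))

Each class's attendance count is binomial and approximately normal, and because the two classes are independent, their means and variances add.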
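To see how close the normal approximation is, the sketch below compares an exact binomial probability with the N(250p, 250p(1-p)) approximation. The value p = 0.3 and the cutoff of 80 students are hypothetical choices for illustration; the question leaves p unspecified.

```python
from math import comb, sqrt
from statistics import NormalDist

p = 0.3          # hypothetical attendance probability (not given in the question)
n = 100 + 150    # independent classes with the same p combine into Binomial(250, p)

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

exact_prob = binom_cdf(80, n, p)
approx_prob = approx.cdf(80.5)   # 80.5 applies a continuity correction
print(exact_prob, approx_prob)
```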

100

Without calculating, which value do you think is bigger: t_{4,0.05} or t_{10,0.05}?



Without performing calculations, we can determine that t_{4,0.05} is larger than t_{10,0.05}. This follows from the properties of the t-distribution.


The t-distribution becomes closer to the normal distribution as the degrees of freedom increase. In other words, as the sample size grows, the tails of the t-distribution become less pronounced, and the critical values (like those used for confidence intervals or hypothesis testing) decrease.


Since t_{4,0.05} refers to a t-distribution with 4 degrees of freedom and t_{10,0.05} refers to a t-distribution with 10 degrees of freedom, the t_{4,0.05} value will be larger. This is because with fewer degrees of freedom (4 vs. 10), the t-distribution is wider and has heavier tails, leading to larger critical values for a given significance level.



100

Assume that height is normally distributed in the population (represented by the random variable H ~ N(μ, σ^2)). You are interested in estimating the population mean of height. You gather an i.i.d. sample of 14 individuals, where the sample average of height is 64 inches and the sample standard deviation is 4 inches.

Based upon your sample, what is the rejection rule that would be used to test the null hypothesis H0: μ = 65 at a 20% level? You do not need to solve, simply write out the rule.




Reject if | (64-65) / (4/sqrt(14)) | > t_{13,0.1}


Equivalently, do not reject if


-t_{13,0.1} ≤ (64-65) / (4/sqrt(14)) ≤ t_{13,0.1}


100

True or False: If X and Y are two random variables with finite variances, then V(2X - 3Y) must be equal to 4V(X) + 9V(Y).

False. V(aX - bY) = a^2 * V(X) + b^2 * V(Y) - 2*a*b*Cov(X,Y). We cannot assume Cov(X,Y) = 0 because we are not told that X and Y are independent (or even uncorrelated). Thus V(2X - 3Y) = 4V(X) + 9V(Y) - 12*Cov(X,Y), which equals 4V(X) + 9V(Y) only when Cov(X,Y) = 0.
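The variance formula can be checked with a few lines of arithmetic. This is a minimal sketch; the helper name and the input values V(X) = 1, V(Y) = 1, Cov(X,Y) = 0.5 are hypothetical and chosen only to show that the covariance term changes the result.

```python
def var_linear_combo(a, b, var_x, var_y, cov_xy):
    """V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab Cov(X, Y)."""
    return a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

# 2X - 3Y corresponds to a = 2, b = -3.
with_cov = var_linear_combo(2, -3, var_x=1.0, var_y=1.0, cov_xy=0.5)  # 4 + 9 - 6
no_cov = var_linear_combo(2, -3, var_x=1.0, var_y=1.0, cov_xy=0.0)    # 4 + 9

print(with_cov, no_cov)  # 7.0 13.0
```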

200
  • Let Y be the number of books read per month by an individual. Treat Y as a continuous random variable. Assume Y ~ N(6, 2). For what values of c and d is the random variable c + dY distributed as a standard normal variable N(0,1)?

  • c = -3, d = 1/2


- Use the standardizing formula with a standard deviation of 2: (Y - 6)/2 = -3 + (1/2)Y, which has mean 0 and variance 1.
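The standardizing formula (Y - μ)/σ can be rewritten as c + dY with d = 1/σ and c = -μ/σ. The sketch below works this out, assuming the standard deviation of Y is 2 (as the d = 1/2 answer implies); the function name is illustrative.

```python
def standardizing_constants(mu, sigma):
    """Return (c, d) such that c + d*Y = (Y - mu) / sigma."""
    d = 1 / sigma
    c = -mu / sigma
    return c, d

c, d = standardizing_constants(mu=6, sigma=2)

# Mean and standard deviation of c + d*Y when Y has mean 6 and SD 2.
new_mean = c + d * 6
new_sd = abs(d) * 2
print(c, d, new_mean, new_sd)  # -3.0 0.5 0.0 1.0
```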


200

Suppose we have 400 students taking the ECO 329 final. The time it takes each student to complete the exam is given by i.i.d. draws T1, T2, T3, …, T400 from the T ∼ N(105, 7^2) distribution. Let TBar denote the average completion time of the 400 students.


  1. What is E(TBar)?

  2. Input the correct sign: >, <, or = below.

P(104 ≤ TBar ≤ 106) __ P(104 ≤ T1 ≤ 106)



  1. 105. The expected value of the sample average equals the population mean.

  2. P(104 ≤ TBar ≤ 106) > P(104 ≤ T1 ≤ 106). TBar has the same mean as T1 but a much smaller standard deviation (7/sqrt(400) = 0.35 versus 7), so it is far more concentrated around 105 and more likely to fall within one unit of it.
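The two probabilities can be compared directly. This is a minimal sketch using the standard library's normal distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

single = NormalDist(mu=105, sigma=7)               # one student's time, T1
average = NormalDist(mu=105, sigma=7 / sqrt(400))  # TBar, with SD 7/20 = 0.35

p_single = single.cdf(106) - single.cdf(104)
p_average = average.cdf(106) - average.cdf(104)

print(p_single, p_average)  # the second is much larger
```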

200

Suppose you observe a random sample of 20 households and note their sleeping habits. The sample average is 8 hours of sleep per day, with a sample standard deviation of 3 hours per day. What is the 90% confidence interval for the population average, using the t critical value?

8 +/- (t_{19,0.05}) (3/√20)

200

Suppose that a researcher is interested in the poverty rate in Austin. She obtains a random sample of 350 individuals and finds that the proportion of the sample living in poverty is .105. She decides to test the null hypothesis that the poverty rate is .10 against the alternative that it is greater than .10 using a significance level of .05. Assume that the sample size is sufficient to use the Central Limit Theorem if needed. If Phat is the proportion of the sample living in poverty, which of the following statements are true? More than one statement is true.


Note: The z-score(0.05/2) = 1.96.


  1. The null hypothesis should be rejected.

  2. The null hypothesis should not be rejected.

  3. Reject the null hypothesis for all values of Phat < .05

  4. Reject the null hypothesis for all values of Phat < .07

  5. Reject the null hypothesis for all values of Phat > .16

  6. Reject the null hypothesis for all values of Phat > .18



The following are correct:

  • The null hypothesis should not be rejected.

  • Reject the null hypothesis for all values of Phat > .16

  • Reject the null hypothesis for all values of Phat > .18
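The reasoning behind these answers can be sketched numerically. The code below computes the z-statistic under the null and the smallest Phat that would lead to rejection, once with the one-sided 5% critical value and once with the 1.96 value quoted in the note. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, p_hat, n = 0.10, 0.105, 350

# Standard error of Phat computed under the null hypothesis p = 0.10.
se0 = sqrt(p0 * (1 - p0) / n)
z_stat = (p_hat - p0) / se0

z_one_sided = NormalDist().inv_cdf(0.95)   # about 1.645 for a one-sided 5% test
cutoff_one_sided = p0 + z_one_sided * se0  # reject when Phat exceeds this
cutoff_note = p0 + 1.96 * se0              # same rule with the 1.96 from the note

print(z_stat, cutoff_one_sided, cutoff_note)
```

The z-statistic is well below either critical value, so the null is not rejected, and both cutoffs fall below .16, so any Phat above .16 (and therefore above .18) leads to rejection.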


200

Answer both of these correctly:

  • True or False: If X and Y are independent random variables, E(X|Y) = E(X).

  • True or False: If A and B are two mutually exclusive events, both occurring with nonzero probability, it must be true that P(A | B) = P(A).

  • True. If X and Y are independent, the conditional distribution of X given Y is the same as the marginal distribution of X, so E(X|Y) = E(X).

  • False. Mutually exclusive does not mean the same thing as independent. If A and B are mutually exclusive, then P(A and B) = 0, so P(A | B) = P(A and B)/P(B) = 0, which cannot equal P(A) because P(A) is nonzero.

300

Maria, a recent business graduate, is applying for jobs in various tech companies. She applies to 40 different companies. If the probability of receiving a job offer from any given company is 65% and the companies' responses can be considered independent, the random variable associated with Maria’s total number of job offers, denoted by X, is a Binomial(40, 0.65).

Which of the following probabilities is largest?

a. P(3≤X≤5)

b. P(7≤X≤9)

c. P(35≤X≤40)


  • c. P(35≤X≤40)

The expected number of job offers is np = 40 × 0.65 = 26, with a standard deviation of sqrt(40 × 0.65 × 0.35) ≈ 3.0. All three ranges lie in the tails, but 35 to 40 is the closest to 26 (about 3 standard deviations above it), while 3 to 5 and 7 to 9 are more than 5 standard deviations below it. Option c therefore has the largest probability.
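The three probabilities can also be computed exactly from the binomial formula. This is a minimal sketch; the helper name is illustrative.

```python
from math import comb

def binom_range(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

probs = {
    "a": binom_range(40, 0.65, 3, 5),
    "b": binom_range(40, 0.65, 7, 9),
    "c": binom_range(40, 0.65, 35, 40),
}
largest = max(probs, key=probs.get)
print(probs, largest)
```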

300

You collect a random sample of 100 firms (n=100) and find that 20 of the 100 firms were awarded a patent during the year. What is the standard error associated with your estimate (0.2) of the true probability p? (You should be able to simplify your expression to decimal form.)


If 300 additional observations are added to your random sample (yielding 400 total observations) and you still observe 20% of firms awarded patents during the year, how does the new standard error of the estimate of p compare to the original (n=100) one?

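Original standard error: sqrt(0.2*0.8/100) = 0.04


New standard error: sqrt(0.2*0.8/400) = 0.02, which is half the size of the original one. Quadrupling the sample size halves the standard error.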
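Both standard errors follow from the same formula, sqrt(p̂(1 - p̂)/n). A minimal sketch, with an illustrative helper name:

```python
from math import sqrt

def se_proportion(p_hat, n):
    """Standard error of a sample proportion."""
    return sqrt(p_hat * (1 - p_hat) / n)

se_100 = se_proportion(0.2, 100)
se_400 = se_proportion(0.2, 400)
ratio = se_400 / se_100

print(se_100, se_400, ratio)  # roughly 0.04, 0.02, 0.5
```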

300

A survey of 120 students at School X finds an average science score of 75 with a standard deviation of 10. Another survey of 150 students at School Y shows an average of 70 and a standard deviation of 12. Determine the 95% confidence interval for the difference between the population averages at School X and Y.



(75-70) +/- (1.96) √(10^2/120 + 12^2/150)
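Evaluating the interval numerically gives a sense of its size. This is a minimal sketch of the formula above; the variable names are illustrative.

```python
from math import sqrt

mean_x, sd_x, n_x = 75, 10, 120
mean_y, sd_y, n_y = 70, 12, 150

diff = mean_x - mean_y
# Standard error of the difference between two independent sample means.
se_diff = sqrt(sd_x**2 / n_x + sd_y**2 / n_y)
margin = 1.96 * se_diff

lower, upper = diff - margin, diff + margin
print(round(lower, 2), round(upper, 2))
```

The interval is roughly 2.4 to 7.6 points, so it lies entirely above zero.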

300

You and a friend want to know the average price of student laptops bought from the campus computer store. You each conduct your own research by collecting i.i.d. observations from students that have made purchases. You both retrieve a sample of 20 (different) observations. However, your sample average is 550 and your friend’s is 600. But, you both find the same sample standard deviation of 50. If you and your friend both conducted t-tests of the null hypothesis H0: μ = 500, whose p-value is greater, or are they equal to each other?

Your friend’s p-value will be smaller.

p-value = P(|Z| > |t-statistic|), where Z ∼ N(0, 1)


Therefore, this probability is determined by the t-statistic.

Because your friend’s sample average is farther from the null hypothesis value μ = 500 (and the sample size and sample standard deviation are the same), their t-statistic is larger in absolute value. This leaves a lower probability for |Z| > |t-statistic|, so your friend’s p-value is smaller.
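The comparison can be made concrete by computing both t-statistics and the corresponding p-values. This sketch follows the normal-based p-value formula written in the answer above, which is an approximation for n = 20; the helper names are illustrative.

```python
from math import erfc, sqrt

def t_statistic(sample_mean, mu0, sample_sd, n):
    return (sample_mean - mu0) / (sample_sd / sqrt(n))

def two_sided_p_normal(t):
    """P(|Z| > |t|) for Z ~ N(0, 1), using the complementary error function."""
    return erfc(abs(t) / sqrt(2))

t_you = t_statistic(550, 500, 50, 20)
t_friend = t_statistic(600, 500, 50, 20)

p_you = two_sided_p_normal(t_you)
p_friend = two_sided_p_normal(t_friend)
print(t_you, t_friend, p_you, p_friend)
```

Using `erfc` rather than `1 - cdf` keeps the very small tail probabilities from rounding to zero.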



300

True or False: If the p-value of a test is .01 and desired level of significance is .05, the null hypothesis will be rejected

True. When the p-value of a test is less than the desired level of significance, we have evidence to reject the null hypothesis

400

Maria, a recent business graduate, is applying for jobs in various tech companies. She applies to 40 different companies. If the probability of receiving a job offer from any given company is 15% and the companies' responses can be considered independent, the random variable associated with Maria’s total number of job offers, denoted by X, is a Binomial(40, 0.15).





400

The filling machine at a bottling plant is operating correctly when the variance of the fill amount is equal to 0.3 ounces. Assume that the fill amounts follow a normal distribution.  


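P(S^2 > 0.5) = P( (1/29) * Σ(Xi - x̄)^2 > 0.5 ), where S^2 is the sample variance

  • = P( Σ(Xi - x̄)^2 > 0.5*29 )

  • = P( Σ(Xi - x̄)^2 / 0.3 > 0.5*29/0.3 )

= P(W > 48.33), where W ~ χ²_29 (chi-squared with 29 degrees of freedom)

This is the probability that, for a sample of 30 bottles from a correctly operating machine, the sample variance is greater than 0.5.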
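One way to get a feel for this tail probability without chi-squared tables is a small simulation. The sketch below draws many samples of 30 fills from a normal distribution with variance 0.3 and counts how often the sample variance exceeds 0.5. The mean of 0, the seed, and the number of replications are arbitrary choices for illustration; the sample variance does not depend on the mean.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_variance(values):
    """Unbiased sample variance, dividing by n - 1."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

true_sd = 0.3 ** 0.5   # machine operating correctly: variance 0.3
reps = 20000
hits = 0
for _ in range(reps):
    fills = [rng.gauss(0.0, true_sd) for _ in range(30)]
    if sample_variance(fills) > 0.5:
        hits += 1

estimate = hits / reps   # approximates P(chi-squared with 29 df > 48.33)
print(estimate)
```

The estimate should come out small, on the order of one or two percent, which is consistent with 48.33 sitting far in the upper tail of a chi-squared distribution with 29 degrees of freedom.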

400

You have a random sample of 400 companies (n=400). The sample mean of Y is given as 2 million dollars, and the sample standard deviation is unknown. What must be true about the sample standard deviation for the width of the 95% confidence interval for the average grant amount (Y) to be less than 0.5 million dollars? Your answer should be in the form of a condition like "Standard Deviation of Y > [Value]" or "Standard Deviation of Y < [Value]".

Width of the CI = 2(1.96)(SD/√400) < 0.5

1.96 -> z critical value for a 95% interval

SD < (0.5*20)/(2(1.96)) ≈ 2.55
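Solving the width inequality for the standard deviation takes one line of arithmetic. A minimal sketch, with an illustrative helper name:

```python
from math import sqrt

def max_sd_for_width(width, n, z):
    """Largest SD for which 2 * z * SD / sqrt(n) stays below the given width."""
    return width * sqrt(n) / (2 * z)

sd_bound = max_sd_for_width(width=0.5, n=400, z=1.96)
print(round(sd_bound, 3))  # about 2.551 (million dollars)
```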



400

You’re once again investigating if height is normally distributed, but this time you combine random samples with other researchers to get a total of 2000 observations. The standard error of the sample average is now equal to 0.5. 


Using the z-test, do you reject the null hypothesis H0: μ = 65 at the 5% level? What about the 10% level?




Reject at both levels. 


Z-statistic = (64-65) / 0.5, compared in absolute value with the critical value. Note that the standard error is the standard deviation divided by the square root of the sample size.


For 5%: Reject if | (64-65) / 0.5 | > z0.025 


For 10%: Reject if | (64-65) / 0.5 | > z0.05


z0.025 = 1.96

z0.05 = 1.645

(64-65)/0.5 = -2, and |-2| = 2 exceeds both critical values.
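The decision at both levels can be checked by computing the critical values from the standard normal distribution. A minimal sketch, with illustrative variable names:

```python
from statistics import NormalDist

z_stat = (64 - 65) / 0.5   # sample average 64, null value 65, standard error 0.5

std_normal = NormalDist()
crit_5pct = std_normal.inv_cdf(1 - 0.05 / 2)    # two-sided 5% level, about 1.96
crit_10pct = std_normal.inv_cdf(1 - 0.10 / 2)   # two-sided 10% level, about 1.645

reject_5pct = abs(z_stat) > crit_5pct
reject_10pct = abs(z_stat) > crit_10pct
print(z_stat, reject_5pct, reject_10pct)  # -2.0 True True
```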


400


True or False: You want to find the 95% (asymptotic) confidence interval for the population bottom 25% quantile of ECO 329 grades. To obtain this interval, all you need is the sample bottom 25% quantile and the t0.025 critical value.

If true, why?

If false, what is missing?



False. You would also need the standard error of the sample quantile, and you would use the z0.025 critical value instead of the t0.025 critical value, because an asymptotic interval relies on the large-sample normal approximation.
