STA258 Lecture 08
Review
Today we cover large-sample confidence intervals:
Z confidence intervals, and T intervals as well.
#tk Large sample Confidence Intervals are on the test.
Large Sample Confidence Intervals
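$Z = \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} \sim N(0, 1)$
We know $\sigma_{\hat{\theta}} = \sqrt{\operatorname{Var}(\hat{\theta})}$
If $\hat{\theta} = \bar{X}$
Then $\sqrt{\operatorname{Var}(\bar{X})} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$
So we have $\frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$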
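Suppose we have $P(a \le Z \le b) = 1 - \alpha$
What should our $a, b$ be?
Suppose we have a standard normal curve.
The central area is $1 - \alpha$
So each tail has area $\frac{\alpha}{2}$
$b = Z_{\alpha/2}$
Since we have symmetry
$a = -Z_{\alpha/2}$
$P\left(-Z_{\alpha/2} \le \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} \le Z_{\alpha/2}\right) = 1 - \alpha$
We need to isolate $\theta$
$P\left(-Z_{\alpha/2}\,\sigma_{\hat{\theta}} \le \hat{\theta} - \theta \le Z_{\alpha/2}\,\sigma_{\hat{\theta}}\right) = 1 - \alpha$
$P\left(\hat{\theta} - Z_{\alpha/2}\,\sigma_{\hat{\theta}} \le \theta \le \hat{\theta} + Z_{\alpha/2}\,\sigma_{\hat{\theta}}\right) = 1 - \alpha$
So our confidence interval is
$\left[\hat{\theta} - Z_{\alpha/2}\,\sigma_{\hat{\theta}},\ \hat{\theta} + Z_{\alpha/2}\,\sigma_{\hat{\theta}}\right] = \hat{\theta} \pm Z_{\alpha/2}\,\sigma_{\hat{\theta}}$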
So our CI for θ is the point estimator ± our cutoff times standard error.
Example:
We have a sample mean annual household income of 119155.
Assume this is based on a sample of 80 households.
$\sigma = 30000$
We have $X_1, \dots, X_n \overset{iid}{\sim} N(\mu, \sigma^2)$
$\hat{\theta} = \bar{X}$
Then $SE[\bar{X}] = \frac{\sigma}{\sqrt{n}}$
Then CI for $\mu$:
Compute margin of error.
$= Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
We want a 90% CI
$1 - \alpha = 0.9$
$\alpha = 0.1$
$\frac{\alpha}{2} = 0.05$
Look up the upper 5% point on the Z table
We get $1.64 \cdot \frac{\sigma}{\sqrt{n}}$
$1.64 \cdot \frac{30000}{\sqrt{80}} = 5500.72722464948$
CI is
$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$119155 \pm 5500.7272$
$119155 + 5500.7272 = 124655.7272$
$119155 - 5500.7272 = 113654.2728$
If we increase the confidence level, what will happen?
We should get a wider interval.
$1 - \alpha = 0.95$
$\alpha = 0.05$
$\frac{0.05}{2} = 0.025$
Look up that point on the table: $Z_{0.025} = 1.96$
CI
$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$119155 \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$119155 \pm (1.96)\frac{30000}{\sqrt{80}}$
$(1.96)\frac{30000}{\sqrt{80}} = 6574.03985384938$
The margin of error is larger than before, so raising the confidence level widens the interval.
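The comparison of the 90% and 95% intervals can be checked with a short sketch. The helper name `z_interval` is an assumption made for illustration. It uses the exact normal quantile from Python's standard library rather than the rounded table value 1.64, so the 90% margin comes out slightly larger than the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Two-sided z interval for a mean when sigma is known (hypothetical helper)."""
    alpha = 1 - confidence
    # Z_{alpha/2}: the point with upper-tail area alpha/2 under N(0, 1)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin, margin


# Household income example: xbar = 119155, sigma = 30000, n = 80
lo90, hi90, m90 = z_interval(119155, 30000, 80, 0.90)
lo95, hi95, m95 = z_interval(119155, 30000, 80, 0.95)
print(f"90%: ({lo90:.2f}, {hi90:.2f}), margin {m90:.2f}")
print(f"95%: ({lo95:.2f}, {hi95:.2f}), margin {m95:.2f}")
```

With the point estimate, $\sigma$ and $n$ held fixed, the only thing that changes between the two calls is the cutoff, which is why the 95% interval is the wider one.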
#tk Can we optimize the tradeoff? Get the most confidence with the narrowest interval.
Analyzing pharma
We're looking for the sample size $n$ that gives a CI with 0.95 confidence and margin of error $m$.
$1 - \alpha = 0.95$
$\alpha = 0.05$
$\frac{0.05}{2} = 0.025$
$m = 0.005$ as our margin of error
We get $Z_{\alpha/2} = Z_{0.025} = 1.96$
$\sigma = 0.0068$
How can we not know the mean but still know $\sigma$?
From historical data we can estimate $\sigma$
Margin of error
$Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$m = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$n = \left(\frac{Z_{\alpha/2}\,\sigma}{m}\right)^2$
$n = \left(\frac{(1.96)(0.0068)}{0.005}\right)^2 = 7.10542336$
So $n = 8$ (always round up)
Is it realistic to assume that the population follows a normal?
Usually we need $n \ge 30$ for Central Limit Theorem .
With $n = 8$ the CLT does not help: if the population is not normal, the sampling dist of $\bar{X}$ may not be normal either.
Example:
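We have $m = 2$
and $22.5$ as the value for $\sigma$
Find the sample size recommended to estimate the mean.
At 90% confidence:
$1 - \alpha = 0.9$
$\alpha = 0.1$
$\frac{\alpha}{2} = 0.05$
$Z_{\alpha/2} = 1.65$
$n = \left(\frac{Z_{\alpha/2}\,\sigma}{m}\right)^2$
$n = \left(\frac{(1.65)(22.5)}{2}\right)^2 = 344.56640625$
$n = 345$
At 95% confidence:
$1 - \alpha = 0.95$
$\alpha = 0.05$
$\frac{\alpha}{2} = 0.025$
$Z_{\alpha/2} = 1.96$
$n = \left(\frac{(1.96)(22.5)}{2}\right)^2 = 486.2025$
$n = 487$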
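The three sample-size calculations above all follow the same recipe, which can be sketched as below. The function name `sample_size_for_mean` is an assumption for illustration. The cutoffs are passed in as the table values used in the notes (1.96 and 1.65) so the results match the hand calculations.

```python
from math import ceil


def sample_size_for_mean(z, sigma, m):
    """Smallest n whose margin of error z * sigma / sqrt(n) is at most m."""
    # Solve m = z * sigma / sqrt(n) for n, then round up to a whole unit.
    return ceil((z * sigma / m) ** 2)


n_pharma = sample_size_for_mean(1.96, 0.0068, 0.005)  # pharma example, 95%
n_90 = sample_size_for_mean(1.65, 22.5, 2)            # second example, 90%
n_95 = sample_size_for_mean(1.96, 22.5, 2)            # second example, 95%
print(n_pharma, n_90, n_95)  # 8 345 487
```

Rounding up rather than to the nearest integer matters: rounding down would leave the margin of error slightly above the target.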
#tk that's it for test 1.
Confidence Intervals based on the t dist
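What if $\sigma$ is not known?
Estimate it by $S$
$T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{(n-1)}$
$P(|T| < m) = 1 - \alpha$
Find two end points of the interval.
The central area is $1 - \alpha$, so each tail has area $\frac{\alpha}{2}$ and our points are $\pm t_{\alpha/2}$
$P\left(-t_{\alpha/2} \le T \le t_{\alpha/2}\right) = 1 - \alpha$
Always isolate the unknown parameter.
$P\left(-t_{\alpha/2} \le \frac{\bar{X} - \mu}{S / \sqrt{n}} \le t_{\alpha/2}\right) = 1 - \alpha$
$P\left(\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right) = 1 - \alpha$
$CI = \bar{X} \pm t_{\alpha/2}\frac{S}{\sqrt{n}}$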
The lower and upper limits are Random Variables .
Prior to observation, the CI is random.
After an observation, the CI is deterministic.
So "the parameter is in this interval with probability $1 - \alpha$" is not the most accurate interpretation of an observed CI.
If we have 90 percent confidence, about 90 out of 100 repeated samples will give an interval that contains the parameter.
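The repeated-sampling interpretation can be illustrated with a small simulation. This is only a sketch: the population values `MU` and `SIGMA` are made up for the demonstration, and the cutoff 1.860 is the table value $t_{0.05;8}$, which fixes the setup to $n = 9$ and 90% confidence.

```python
import random
from math import sqrt
from statistics import mean, stdev

T_CRIT = 1.860         # t_{0.05; 8} from a t table: n = 9, 90% confidence
MU, SIGMA = 60.0, 6.0  # hypothetical "true" population values, for illustration only
N, TRIALS = 9, 2000

rng = random.Random(0)  # fixed seed so the run is repeatable
hits = 0
for _ in range(TRIALS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    center = mean(sample)
    margin = T_CRIT * stdev(sample) / sqrt(N)
    # mu is fixed; the interval changes from sample to sample.
    if center - margin <= MU <= center + margin:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.3f} of the simulated intervals contain mu")
```

Each individual interval either contains $\mu$ or it does not; the 90% describes how often the procedure succeeds over many samples.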
Normal Population Assumption
For small samples of $n < 15$: the data should follow a normal dist. If you see outliers or skewness, be cautious.
For moderate samples of $15 \le n \le 40$: the data should not show strong skewness or outliers. Make a histogram, boxplot or Q-Q plot to check.
For large samples of $n > 40$: the t procedure is fairly robust to non-normality, unless the data are extremely skewed or contain outliers. Make a histogram, boxplot or Q-Q plot to check.
The reason: as $n$ increases, the t dist approaches the normal dist.
Example:
Ancient air.
We can examine gas trapped inside ancient amber.
It gives us a sample of the atmosphere from the time when the amber was formed.
We have these observations:
n = 9
63.4 65 64.4 63.3 54.8 54.5 50.8 49.2 51.0
Find a 90% CI for the mean nitrogen level.
Our present atmosphere is about 78% nitrogen.
$\bar{X} = 59.58$
$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = 6.2552$
$1 - \alpha = 0.9$
$\alpha = 0.1$
$\frac{\alpha}{2} = 0.05$
$t_{\alpha/2} = t_{0.05;\,8} = 1.860$
$P\left(\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right) = 1 - \alpha$
$P\left(59.58 - (1.860)\frac{6.2552}{\sqrt{9}} \le \mu \le 59.58 + (1.860)\frac{6.2552}{\sqrt{9}}\right) = 0.9$
$CI = 59.58 \pm (1.860)\frac{6.2552}{\sqrt{9}}$
$= 59.58 \pm 3.878$
$= (55.702, 63.458)$
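The arithmetic of the t interval can be reproduced from the summary statistics alone. The helper `t_interval` is a hypothetical name, and the cutoff is passed in from the t table because Python's standard library has no t quantile function.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Two-sided t interval; t_crit is t_{alpha/2; n-1} read from a t table."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin, margin


# Amber example: xbar = 59.58, S = 6.2552, n = 9, t_{0.05; 8} = 1.860
lower, upper, margin = t_interval(xbar=59.58, s=6.2552, n=9, t_crit=1.860)
print(f"margin {margin:.3f}, interval ({lower:.3f}, {upper:.3f})")
```

The same call works for any sample once $\bar{X}$, $S$, $n$ and the matching table cutoff are known.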
Example:
A film-processing company wants to know how many pictures are stored on computers.
Random sample of 10 digital camera owners.
Estimate with 95% confidence the mean number of pictures stored.
Data:
$n = 10$
$\bar{X} = 17.7$
$S = 9.08$
$1 - \alpha = 0.95$
$\alpha = 0.05$
$\frac{\alpha}{2} = 0.025$
$t_{\alpha/2} = t_{0.025;\,9} = 2.262$
Margin of error:
$= t_{0.025;\,9}\frac{S}{\sqrt{n}}$
$2.262\left(\frac{9.08}{\sqrt{10}}\right) = 6.49498943710919$
CI:
$17.7 \pm 6.495 = (11.205, 24.195)$
Assumption of normality is critical here since $n$ is small.
#tk learn Q-Q plots (Normal Q-Q Plot).
Example:
Suppose we have a random sample:
$Y_1, \dots, Y_n \overset{iid}{\sim} \text{Bernoulli}(p)$
How can we estimate $p$?
Flip a coin 100 times, how can you estimate $p$?
It's the number of successes over number of trials.
$S = \sum_{i=1}^{n} Y_i \sim \text{Bin}(n, p)$
$S$ is total number of successes.
$\hat{p} = \frac{S}{n}$
$\hat{p}$ is our proportion of successes. An estimator for $p$.
We need to standardize.
$E[\hat{p}] = E\left[\frac{S}{n}\right] = \frac{1}{n}E[S] = \frac{1}{n}np = p$
$\operatorname{Var}(\hat{p}) = \operatorname{Var}\left(\frac{S}{n}\right) = \frac{1}{n^2}\operatorname{Var}(S) = \frac{1}{n^2}np(1 - p) = \frac{p(1 - p)}{n}$
$SE[\hat{p}] = \sqrt{\operatorname{Var}(\hat{p})} = \sqrt{\frac{p(1 - p)}{n}}$
Based on Central Limit Theorem we have:
$\hat{p} \approx N\left(p, \frac{p(1 - p)}{n}\right) \implies \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} \approx N(0, 1)$
CI for $p$:
$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}$
Example:
In a poll of 800 adults
45% indicated that movies are getting better.
43% indicated that movies are getting worse.
Find a 98% CI for the proportion of all adults who think movies are getting better.
$\hat{p} = 0.45$
$1 - \alpha = 0.98$
$\alpha = 0.02$
$\frac{\alpha}{2} = 0.01$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
$\sigma_{\hat{p}} \approx \sqrt{\frac{0.45(1 - 0.45)}{800}} = 0.0175890590993379$
CI for $p$:
$0.45 \pm Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} \approx 0.45 \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
We need to use the plug-in estimate $\hat{p}$ since we don't know $p$
and we want a result in terms of known quantities.
$0.45 \pm 2.33 \cdot 0.0175890590993379$
$0.45 \pm 0.0409757533251168$
$= (0.409024246674883, 0.490975753325117)$
The upper limit of our CI is below 50%.
So even in the best case for our 98% confidence interval, less than half of adults think movies are getting better.
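The plug-in interval for the poll can be sketched as follows. The function name `proportion_interval` is an assumption, and the exact normal quantile is used in place of the rounded table value 2.33, so the limits differ from the hand calculation in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, confidence):
    """Large-sample z interval for a proportion, using the plug-in standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)  # p_hat replaces the unknown p
    return p_hat - z * se, p_hat + z * se


# Movie poll: p_hat = 0.45, n = 800, 98% confidence
low, high = proportion_interval(0.45, 800, 0.98)
print(f"({low:.4f}, {high:.4f})")
```

The conclusion drawn in the notes depends only on the upper limit staying below 0.5.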
Example:
The growing use of mobile devices raises questions about the intrusion of work into personal life.
158 of 473 employees took work with them on vacation.
a:
What is the point estimate of the population proportion of all employees who take work with them on vacation?
$\hat{p} = \frac{158}{473} = 0.334$
b:
At 0.9 confidence, what is the margin of error for the population proportion of all employees who take work with them on vacation?
$1 - \alpha = 0.9$
$\alpha = 0.1$
$\frac{\alpha}{2} = 0.05$
$Z_{\alpha/2} = 1.65$
$\sqrt{\frac{(0.334)(1 - 0.334)}{473}} = 0.0216860161877937$
Margin of error $= Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 1.65 \cdot 0.0216860161877937 = 0.036$
CI:
$0.334 \pm 0.036 = (0.298, 0.370)$
Example:
Aisha Shariff and Yvette Ye are candidates for mayor.
You are planning a small survey to determine the percentage of voters who will vote for Shariff.
$p$ is the population proportion of voters who will vote for Shariff.
You want to be 95% confident that your estimate of $p$ is within 0.03 of the true value.
How large a sample should you take?
$m = 0.03$
$1 - \alpha = 0.95$
$\alpha = 0.05$
$\frac{\alpha}{2} = 0.025$
$m = Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Solving for $n$: $n = \left(\frac{Z_{\alpha/2}}{m}\right)^2 \hat{p}(1 - \hat{p})$
What is $\hat{p}$? We have not taken the sample yet.
When we don't know the value of $\hat{p}$
we can use 0.5
If you sketch the plot of $p(1 - p)$
it's maximized at $p = 0.5$.
$\left(\frac{1.96}{0.03}\right)^2 (0.5)(1 - 0.5) = 1067.11111111111$
$n = 1068$
Sample size for an interval estimate of a population proportion:
$n = \left(\frac{Z^*}{E}\right)^2 p^*(1 - p^*)$
where $E$ is the desired margin of error and $Z^*$ is the cutoff $Z_{\alpha/2}$.
Planning values $p^*$ can be chosen by:
Sample proportion from a previous sample of the same or similar units
Use a planning value of 0.5 to maximize the required sample size when no reasonable estimate of $p$ is available.
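The sample-size formula and the role of the planning value can be checked with a short sketch. The function name `sample_size_for_proportion` and the alternative planning value 0.3 are assumptions chosen only for illustration.

```python
from math import ceil


def sample_size_for_proportion(z, margin, p_star=0.5):
    """Smallest n with z * sqrt(p_star * (1 - p_star) / n) at most margin."""
    return ceil((z / margin) ** 2 * p_star * (1 - p_star))


# Mayor survey: 95% confidence (table value 1.96), margin 0.03, no prior estimate
n_conservative = sample_size_for_proportion(1.96, 0.03)

# Hypothetical planning value 0.3, e.g. taken from an earlier poll
n_with_prior = sample_size_for_proportion(1.96, 0.03, p_star=0.3)

# p(1 - p) evaluated on a grid peaks at 0.5, which is why 0.5 is the safe choice
grid = [i / 100 for i in range(101)]
p_max = max(grid, key=lambda p: p * (1 - p))
print(n_conservative, n_with_prior, p_max)
```

Any planning value other than 0.5 gives a smaller required $n$, so using 0.5 can only over-sample, never under-sample.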
10% condition
If less than 10% of the population is sampled, the sample observations can be assumed to be independent.
For proportions, instead of the $n \ge 30$ rule for CLT , we check that $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$.
Confidence Intervals for Variance
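We know $\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2_{(n-1)}$ as long as we draw our sample from a normal population.
We want $P\left(? \le \frac{(n - 1)S^2}{\sigma^2} \le ?\right) = 1 - \alpha$
We need to find two points on the chi-square dist, each cutting off a tail of area $\frac{\alpha}{2}$.
Top value is: $\chi^2_{\alpha/2}$ (upper-tail area $\frac{\alpha}{2}$)
Bottom value is: $\chi^2_{1 - \alpha/2}$ (upper-tail area $1 - \frac{\alpha}{2}$)
$P\left(\frac{1}{\chi^2_{1 - \alpha/2}} \ge \frac{\sigma^2}{(n - 1)S^2} \ge \frac{1}{\chi^2_{\alpha/2}}\right) = 1 - \alpha$
$P\left(\frac{(n - 1)S^2}{\chi^2_{1 - \alpha/2}} \ge \sigma^2 \ge \frac{(n - 1)S^2}{\chi^2_{\alpha/2}}\right) = 1 - \alpha$
CI:
$\left[\frac{(n - 1)S^2}{\chi^2_{\alpha/2}},\ \frac{(n - 1)S^2}{\chi^2_{1 - \alpha/2}}\right]$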
Estimate $\sigma^2$ with confidence coefficient 0.9
$n = 3$ means $df = 2$
$\bar{Y} =$
$S^2 = 10.57$
$[3.53, 205.25]$
A very wide interval.
Maybe because of skewness as well as the very small sample.
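The variance interval for this example can be reproduced without a chi-square table, but only because $df = 2$ is a special case: with 2 degrees of freedom the chi-square CDF is $1 - e^{-x/2}$, so the cutoffs have a closed form. The sketch below relies on that fact and does not generalize to other degrees of freedom; the function name is hypothetical. The upper limit comes out slightly above the hand value because tables round the small chi-square point.

```python
from math import log


def chi2_upper_point_df2(tail_area):
    """Point with the given upper-tail area under chi-square with 2 df only.

    For 2 df, P(X > x) = exp(-x / 2), so x = -2 * ln(tail_area).
    """
    return -2 * log(tail_area)


n, s2, alpha = 3, 10.57, 0.10
top = chi2_upper_point_df2(alpha / 2)         # upper-tail area alpha/2
bottom = chi2_upper_point_df2(1 - alpha / 2)  # upper-tail area 1 - alpha/2

# Dividing by the larger cutoff gives the lower limit, and vice versa.
lower = (n - 1) * s2 / top
upper = (n - 1) * s2 / bottom
print(f"({lower:.2f}, {upper:.2f})")
```

The two cutoffs sit very asymmetrically around the centre of the distribution, which is part of why the interval stretches so far to the right.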