One-Population Tests
One Population
- Mean: Z Test (1 & 2 tail), t Test (1 & 2 tail)
- Proportion: Z Test (1 & 2 tail)
One-sample test of proportion
- Z Test of Proportion
- Exact method using Binomial Distribution
Examples
Example 1. You’re an accounting manager. A year-end audit showed 4% of transactions had errors. You implement new procedures. A random sample of 500 transactions had 25 errors. Has the proportion of incorrect transactions changed? Use the 0.05 significance level. H0: p = 0.04 vs. H1: p ≠ 0.04
Example 2. A researcher claims that less than 20% of adults in the U.S. are allergic to an herbal medicine. In a SRS of 25 adults, 3 say they have such an allergy. Does this support the researcher’s claim? Test at the 5% level. H0: p = 0.2 vs. H1: p < 0.2
Binomial Distribution
X ~ Binomial(n, p)
n = number of trials, p = probability of positive outcome
Mean(X) = np, Var(X) = np(1 − p)
$\hat{p} = X/n$ = proportion of positive outcomes in a sample of size n
$E(\hat{p}) = p$ (population proportion), $\mathrm{Var}(\hat{p}) = p(1-p)/n$
By CLT, $\hat{p}$ can be approximated by Normal:
$$\hat{p} \sim N\left(p,\ \frac{p(1-p)}{n}\right) \quad \text{if } np(1-p) \ge 5$$
One-Sample Z Test for Proportion
Hypothesis: H0: p = p0 vs. H1: p ≠ p0
Assumptions
- Two Categorical Outcomes
- # of successes follows Binomial Distribution
- Normal Approximation Can Be Used If $np_0q_0 \ge 5$, where $q_0 = 1 - p_0$
Z-test statistic for proportion
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$
where $p_0$ = Hypothesized population proportion
One-Sample Test of Proportion Example 1
You’re an accounting manager. A year-end audit showed 4% of transactions had errors. You implement new procedures. A random sample of 500 transactions had 25 errors. Has the proportion of incorrect transactions changed? Use the 0.05 significance level.
One-Sample Z Test of Proportion
H0: p = p0 = 0.04
Ha: p ≠ p0 = 0.04
α = .05, n = 500
Check: $np_0q_0 = 500 \times 0.04 \times (1 - 0.04) = 19.2 \ge 5$, so the Z Test can be used.
Critical Value(s): Z = ±1.96 (area .025 in each tail; reject H0 if Z < −1.96 or Z > 1.96)
Test Statistic:
$$Z \approx \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{\frac{25}{500} - 0.04}{\sqrt{0.04\,(1-0.04)/500}} = 1.14$$
Decision: Do not reject at α = .05
Conclusion: There is no evidence proportion has changed from 4%
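The Example 1 calculation can also be checked outside of a table lookup. The following is a minimal Python sketch, not part of the original slides; the function name `one_sample_prop_ztest` is hypothetical, and the two-sided p-value is obtained from the standard normal tail via `math.erfc`.

```python
import math

def one_sample_prop_ztest(x, n, p0):
    """Return (z, two_sided_p) for H0: p = p0 using the normal approximation.

    Hypothetical helper for illustration; it refuses to run when the
    slides' rule n*p0*(1-p0) >= 5 is not met.
    """
    if n * p0 * (1 - p0) < 5:
        raise ValueError("n*p0*(1-p0) < 5: use the exact binomial method instead")
    p_hat = x / n                          # sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)      # standard error under H0
    z = (p_hat - p0) / se
    # Two-sided p-value: P(|Z| >= |z|) for a standard normal Z
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided

# Example 1: 25 errors in 500 transactions, H0: p = 0.04
z, p_value = one_sample_prop_ztest(25, 500, 0.04)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

Since |z| is below 1.96, the p-value is above 0.05, which agrees with the decision not to reject H0.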
One-sample test of Proportion Example 2
A researcher claims that less than 20% of adults in the U.S. are allergic to an herbal medicine. In a SRS of 25 adults, 3 say they have such an allergy. Does this support the researcher’s claim? Test at the 5% level.
Is $np_0q_0 \ge 5$?
25 × 0.2 × 0.8 = 4 < 5, so the Normal approximation cannot be used.
Exact Method using Binomial Distribution: One-sided p-value
If Normal approximation cannot be used, i.e. if $np_0q_0 < 5$, then
Ha: p < p0
one-sided p-value = P(X ≤ x successes in n trials | H0)
$$= \sum_{k=0}^{x} \binom{n}{k} p_0^{k} (1-p_0)^{n-k}$$
EXCEL: BINOMDIST(x,n,p0,TRUE)
Ha: p > p0
one-sided p-value = P(X ≥ x successes in n trials | H0)
$$= \sum_{k=x}^{n} \binom{n}{k} p_0^{k} (1-p_0)^{n-k}$$
EXCEL: 1-BINOMDIST(x-1,n,p0,TRUE)
Exact Method using Binomial Distribution: Two-sided p-value
If Normal approximation cannot be used, i.e. if $np_0q_0 < 5$, then for Ha: p ≠ p0, the two-sided p-value can be calculated by
If $\hat{p} = x/n < p_0$: p-value = 2 × P(X ≤ x successes in n trials | H0)
$$= 2\sum_{k=0}^{x} \binom{n}{k} p_0^{k} (1-p_0)^{n-k}$$
EXCEL: 2*BINOMDIST(x,n,p0,TRUE)
If $\hat{p} = x/n > p_0$: p-value = 2 × P(X ≥ x successes in n trials | H0)
$$= 2\sum_{k=x}^{n} \binom{n}{k} p_0^{k} (1-p_0)^{n-k}$$
EXCEL: 2*(1-BINOMDIST(x-1,n,p0,TRUE))
NOTE: In BINOMDIST, TRUE returns the cumulative probability; FALSE returns the probability mass.
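The exact p-values above can be sketched in Python with only the standard library. This is an illustrative sketch, not the slides' own method of computation (the slides use Excel's BINOMDIST). The names `binom_cdf` and `exact_p_value` are hypothetical, and capping the doubled tail at 1 and using the upper tail when the sample proportion equals p0 are assumptions added here, not rules stated in the slides.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p); same role as BINOMDIST(x, n, p, TRUE)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, x + 1))

def exact_p_value(x, n, p0, alternative):
    """Exact binomial p-value for H0: p = p0 (hypothetical helper)."""
    lower = binom_cdf(x, n, p0)              # P(X <= x | H0)
    upper = 1.0 - binom_cdf(x - 1, n, p0)    # P(X >= x | H0)
    if alternative == "less":
        return lower
    if alternative == "greater":
        return upper
    if alternative == "two-sided":
        # Double the tail on the side of the observed proportion.
        tail = lower if x / n < p0 else upper
        return min(1.0, 2 * tail)            # cap at 1 (assumption)
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

# Example 2: 3 allergic adults out of 25, H0: p = 0.2 vs. Ha: p < 0.2
p_less = exact_p_value(3, 25, 0.2, "less")
print(f"one-sided p-value = {p_less:.3f}")
```

For Example 2 this gives a one-sided p-value of about 0.234, the same quantity as BINOMDIST(3,25,0.2,TRUE).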
One-sample test of Proportion Example 2
n = 25, p0 = 0.2, x = 3
$$\hat{p} = \frac{x}{n} = \frac{3}{25} = 0.12$$
One-sample test of Proportion Example 2 Solution
H0: p = p0 = 0.2
Ha: p < p0 = 0.2
α = .05, n = 25, x = 3
P-value:
$$\sum_{k=0}^{3} \binom{25}{k} 0.2^{k} (1-0.2)^{25-k} = \binom{25}{0} 0.8^{25} + \binom{25}{1} 0.2^{1}\, 0.8^{24} + \binom{25}{2} 0.2^{2}\, 0.8^{23} + \binom{25}{3} 0.2^{3}\, 0.8^{22} = 0.234 > 0.05$$
EXCEL: BINOMDIST(3,25,0.2,TRUE)
Decision: Do not reject at α = .05
Conclusion: There is no evidence Proportion is less than 20%
Review for Hypothesis Testing
One-sample tests for population mean, μ:
- Z-test if σ is known
- T-test if σ is unknown
One-sample test for population proportion, p:
- Z-test if np0q0 ≥ 5
- Exact method using Binomial distribution if np0q0 < 5
Two-sample tests for difference in population means, μ1 – μ2:
Independent samples:
- Z-test if σ’s are known
- T-test with pooled estimate of variance if σ’s are unknown and can be assumed equal
- T-test with unequal variances if σ’s are unknown and cannot be assumed equal
Paired samples:
- Paired Z-test for the difference if σ is known
- Paired T-test for the difference if σ is unknown
Two-sample test for difference in population variances:
- F-test with df1 = n1 – 1 and df2 = n2 – 1
Analysis of Variance (ANOVA)
Multisample Inference
Learning Objectives
Until now, we have considered two groups of individuals and we've wanted to know if the two groups were sampled from distributions with equal population means or medians.
Suppose we would like to consider more than two groups of individuals and, in particular, test whether the groups were sampled from distributions with equal population means.
How to use one-way analysis of variance (ANOVA) to test for differences among the means of several populations (“groups”)
Hypotheses of One-Way ANOVA
H0: μ1 = μ2 = … = μK
- All population means are equal
- No treatment effect (no variation in means among groups)
H1: Not all of the population means are the same
- At least one population mean is different
- There is a treatment effect
- Does not mean that all population means are different (some pairs may be the same)
One-Factor ANOVA
All means are the same: The null hypothesis is true (No treatment effect)
One-Factor ANOVA (continued)
At least one mean is different: The null hypothesis is NOT true (Treatment effect is present)
One-Way ANOVA: Model Assumptions
- The K random samples are drawn from K independent populations
- The variances of the populations are identical
- The underlying data are approximately normally distributed
Basic Idea: partitioning the variation
Suppose there are K groups with $n_1, n_2, \ldots, n_K$ observations.
$y_{ij}$ = j-th observation in i-th group, $\bar{y}$ = overall mean, $\bar{y}_i$ = mean of group i
$$y_{ij} = \bar{y} + (\bar{y}_i - \bar{y}) + (y_{ij} - \bar{y}_i)$$
$(\bar{y}_i - \bar{y})$ = Deviation of group mean from grand mean
$(y_{ij} - \bar{y}_i)$ = Deviation of observations from group mean
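Partitioning the variation
$$y_{ij} - \bar{y} = (y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{y})$$
$y_{ij} - \bar{y}_i$ = Deviation of observations from group mean (within group variability)
$\bar{y}_i - \bar{y}$ = Deviation of group mean from overall mean (between group variability)
[Figure: Response by group (Group 1, Group 2, Group 3), showing the group means $\bar{y}_1$, $\bar{y}_2$, $\bar{y}_3$ and the overall mean $\bar{y}$]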
Partitioning the variation
$$\sum_{i}\sum_{j} (y_{ij} - \bar{y})^2 = \sum_{i}\sum_{j} (y_{ij} - \bar{y}_i)^2 + \sum_{i}\sum_{j} (\bar{y}_i - \bar{y})^2$$
Total variation (total SS) = Variation due to random sampling (within SS) + Variation due to factor (between SS)
Total variation is the sum of Within-group variability and Between-group variability
[Figure: two panels of Response by group (Group 1, Group 2, Group 3), contrasting large and small between-group variability]
- If Between group variability is large and Within group variability is small => reject H0
- If Between group variability is small and Within group variability is large => do not reject H0
Basic Idea of ANOVA
Partition of Total Variation
Total Variation (total SS) = Variation Due to Factor (Between SS) + Variation Due to Random Sampling (Within SS)
d.f.: n – 1 = (k – 1) + (n – k)
Between SS is commonly referred to as:
- Sum of Squares Between
- Sum of Squares Among
- Sum of Squares Explained
- Among Groups Variation
Within SS is commonly referred to as:
- Sum of Squares Within
- Sum of Squares Error
- Sum of Squares Unexplained
- Within-Group Variation
Total Sum of Squares
Total SS = Between SS + Within SS
$$\text{Total SS} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2$$
Where: Total SS = Total sum of squares
k = number of groups (levels or treatments)
$n_i$ = number of observations in group i
$y_{ij}$ = jth observation from group i
$\bar{y}$ = grand mean (mean of all data values)
Total Variation
$$\text{Total SS} = (y_{11} - \bar{y})^2 + (y_{12} - \bar{y})^2 + \cdots + (y_{k n_k} - \bar{y})^2$$
[Figure: Response by group (Group 1, Group 2, Group 3), with the grand mean $\bar{y}$ marked]
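Between-Group Variation
$$\text{Between SS} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{y}_i - \bar{y})^2 = n_1(\bar{y}_1 - \bar{y})^2 + n_2(\bar{y}_2 - \bar{y})^2 + \cdots + n_k(\bar{y}_k - \bar{y})^2$$
[Figure: Response by group (Group 1, Group 2, Group 3), with the group means $\bar{y}_1$, $\bar{y}_2$, $\bar{y}_3$ and the grand mean $\bar{y}$ marked]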
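Within-Group Variation (continued)
$$\text{Within SS} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 = \sum_{i=1}^{k} (n_i - 1)\, S_i^2$$
where $S_i^2$ is the sample variance of group i
[Figure: Response by group (Group 1, Group 2, Group 3), with the group means marked]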
Obtaining the Mean Squares
Within MS = Within SS / (n − k)
Between MS = Between SS / (k − 1)
Total MS = Total SS / (n − 1)
One-Way ANOVA Table
| Source of Variation | df | SS | MS (Variance) | F ratio |
|---|---|---|---|---|
| Between Groups | k - 1 | BSS | BMS = BSS / (k - 1) | F = BMS / WMS |
| Within Groups | n - k | WSS | WMS = WSS / (n - k) | |
| Total | n - 1 | TSS = BSS + WSS | | |

k = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
One-Way ANOVA F Test Statistic
Test statistic
$$F = \frac{\text{Between MS}}{\text{Within MS}} \sim F_{df_1,\, df_2}$$
Degrees of freedom
df1 = k – 1 (k = number of groups)
df2 = n – k (n = sum of sample sizes from all populations)
Interpreting One-Way ANOVA F Statistic
- The F statistic is the ratio of the among-groups estimate of variance to the within-groups estimate of variance
- The ratio must always be positive
- df1 = k - 1 will typically be small
- df2 = n - k will typically be large
- FU is the critical value for α = .05
Decision Rule: Reject H0 if F > FU; otherwise do not reject H0
[Figure: F distribution with the rejection region of area α = .05 to the right of FU]
Example
You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in mean distance?

| Club 1 | Club 2 | Club 3 |
|---|---|---|
| 254 | 234 | 200 |
| 263 | 218 | 222 |
| 241 | 235 | 197 |
| 237 | 227 | 206 |
| 251 | 216 | 204 |
[Figure: Distance (190 to 270) by Club (1, 2, 3), with group means $\bar{Y}_1 = 249.2$, $\bar{Y}_2 = 226.0$, $\bar{Y}_3 = 205.8$ and overall mean $\bar{Y} = 227.0$ marked]
Example
$\bar{Y}_1 = 249.2$, $\bar{Y}_2 = 226.0$, $\bar{Y}_3 = 205.8$, $\bar{Y} = 227.0$
n1 = 5, n2 = 5, n3 = 5, n = 15, k = 3
$$BSS = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$$
$$WSS = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1119.6$$
BMS = 4716.4 / (3 − 1) = 2358.2
WMS = 1119.6 / (15 − 3) = 93.3
$$F = \frac{BMS}{WMS} = \frac{2358.2}{93.3} = 25.275$$
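The sums of squares for the golf club example can be recomputed directly from the raw data. Below is a minimal Python sketch under the definitions given earlier (between SS, within SS, mean squares); the function name `one_way_anova` and the dictionary keys are hypothetical choices made for this illustration.

```python
def one_way_anova(groups):
    """Compute one-way ANOVA sums of squares, mean squares and F.

    `groups` is a list of lists, one list of observations per group.
    Hypothetical helper for illustration only.
    """
    all_values = [y for g in groups for y in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between SS: n_i * (group mean - grand mean)^2, summed over groups
    between_ss = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within SS: squared deviations of observations from their own group mean
    within_ss = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)
    # Total SS: squared deviations of observations from the grand mean
    total_ss = sum((y - grand_mean) ** 2 for y in all_values)

    between_ms = between_ss / (k - 1)
    within_ms = within_ss / (n - k)
    return {
        "between_ss": between_ss, "within_ss": within_ss, "total_ss": total_ss,
        "df1": k - 1, "df2": n - k,
        "between_ms": between_ms, "within_ms": within_ms,
        "F": between_ms / within_ms,
    }

# Golf club distances from the example
club1 = [254, 263, 241, 237, 251]
club2 = [234, 218, 235, 227, 216]
club3 = [200, 222, 197, 206, 204]
result = one_way_anova([club1, club2, club3])
print({name: round(value, 3) for name, value in result.items()})
```

The function also returns the total SS, so the identity Total SS = Between SS + Within SS can be confirmed numerically on the same data.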
Example
- H0: µ1 = µ2 = µ3
- H1: µj not all equal
- α = 0.05
- df1 = 2, df2 = 12
- Critical Value: FU = FINV(0.05,2,12) = 3.89 (reject H0 if F > 3.89)
- Test Statistic: F = BMS / WMS = 2358.2 / 93.3 = 25.275
- Decision: Reject H0 at α = 0.05, since F = 25.275 > 3.89
- Conclusion: There is evidence that at least one µj differs from the rest
ANOVA Table
| Source | SS | DF | MS | F | P-value |
|---|---|---|---|---|---|
| Between | 4716.4 | 2 | 2358.2 | 25.275 | <0.001 |
| Within | 1119.6 | 12 | 93.3 | | |
| Total | 5836.0 | 14 | | | |
EXCEL ANOVA Analysis
EXCEL: Data → Data Analysis → ANOVA: Single Factor
EXCEL ANOVA Analysis Results
Anova: Single Factor
SUMMARY
| Groups | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Column 1 | 5 | 1246 | 249.2 | 108.2 |
| Column 2 | 5 | 1130 | 226 | 77.5 |
| Column 3 | 5 | 1029 | 205.8 | 94.2 |

ANOVA
| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 4716.4 | 2 | 2358.2 | 25.27546 | 4.99E-05 | 3.885294 |
| Within Groups | 1119.6 | 12 | 93.3 | | | |
| Total | 5836 | 14 | | | | |
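The P-value and F crit columns can be checked without Excel in this particular example, because the F distribution has a simple closed-form tail when the numerator degrees of freedom equal 2: P(F > f) = (1 + 2f/df2)^(−df2/2). The sketch below relies on that special case only; it is not a general F-distribution routine, and the function names are hypothetical.

```python
def f_tail_prob_df1_2(f, df2):
    """P(F > f) for an F distribution with (2, df2) degrees of freedom.

    Closed form valid ONLY when df1 = 2 (assumption of this sketch).
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Upper critical value FU with P(F > FU) = alpha, for df1 = 2 only."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

# Golf club example: F = BMS / WMS with df1 = 2, df2 = 12
f_stat = 2358.2 / 93.3
p_val = f_tail_prob_df1_2(f_stat, 12)
f_crit = f_critical_df1_2(0.05, 12)
print(f"F = {f_stat:.5f}, p-value = {p_val:.2e}, F crit = {f_crit:.6f}")
```

For other numerator degrees of freedom this shortcut does not apply, and a statistics library or Excel's FINV and FDIST functions would be the natural choice.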
Comparisons of specific groups in One-way ANOVA
What happens when the null hypothesis is rejected?
- We conclude that the population means are not all equal, but we cannot be more specific than this.
- We often want to conduct additional tests to determine where the differences lie.
- We need to perform post hoc tests to determine between which groups the differences occurred.
If the group variances are homogeneous, use