Post on 08-Apr-2018
8/7/2019 10 -t dist
1/44
Introduction to ProbabilityIntroduction to Probabilityand Statisticsand StatisticsTenth EditionTenth Edition
Chapter 10Inference fromSmall Samples
8/7/2019 10 -t dist
2/44
IntroductionIntroduction When the sample size is small, the
estimation and testing procedures of Chapter 8 are not appropriate.
There are equivalent small sample test andestimation procedures for
, the mean of a normal population 1 2 , the difference between two
population means 2, the variance of a normal populationThe ratio of two population variances.
8/7/2019 10 -t dist
3/44
The Sampling DistributionThe Sampling Distributionof the Sample Meanof the Sample Mean
When we take a sample from a normal population, the sample meanhas a normal distribution for any sample size n, and
has a standard normal distribution. But if is unknown, and we must use s to estimate it, the resulting
statisticis not normalis not normal .
nxz
/ = normal!not is
/ nsx
x
8/7/2019 10 -t dist
4/44
Students t DistributionStudents t Distribution Fortunately, this statistic does have a sampling distribution that is
well known to statisticians, called the Students t distribution,Students t distribution, with nn -1 degrees of freedom.-1 degrees of freedom.
ns
xt
/
=
We can use this distribution to createestimation testing procedures for the populationmean .
8/7/2019 10 -t dist
5/44
Properties of StudentsProperties of Students t t
Shape depends on the sample size n or the
degrees of freedom,degrees of freedom, nn -1.-1. As n increases the shapes of the t and z
distributions become almost identical.
Mound-shapedMound-shaped andsymmetric about 0.More variable thanMore variable than z z ,with heavier tails
8/7/2019 10 -t dist
6/44
Using theUsing the t t -Table-Table Table 4 gives the values of t that cut off certain
critical values in the tail of the t distribution. Index df df and the appropriate tail area aa to find
t t aa ,,the value of t with area a to its right.
For a random sample of size n =10, find a value of t that cutsoff .025 in the right tail.
Row = df = n 1 = 9
t .025 = 2.262
Column subscript = a = .025
8/7/2019 10 -t dist
7/44
Small Sample InferenceSmall Sample Inference
for a Population Meanfor a Population Mean The basic procedures are the same as those usedfor large samples. For a test of hypothesis:
.1on withdistributi-ta
on basedregionrejection aor values- using/
statistictesttheusing
tailedor two one :H versus:HTest
0
a00
=
=
=
ndf
pns
xt
8/7/2019 10 -t dist
8/44
For a 100(1 )% confidence intervalfor the population mean :
.1on withdistributi-ta of tailin the
/2 area off cutsthat of valuetheis where 2/
2/
=
ndf
t t n
st x
Small Sample InferenceSmall Sample Inferencefor a Population Meanfor a Population Mean
8/7/2019 10 -t dist
9/44
ExampleExample
A sprinkler system is designed so that the averagetime for the sprinklers to activate after beingturned on is no more than 15 seconds. A test of 5
systems gave the following times:17, 31, 12, 17, 13, 25
Is the system working as specified? Test using = .05.
specified) as ng(not worki 15:H
specified) as (working 15:H
a
0
>=
8/7/2019 10 -t dist
10/44
ExampleExampleFirst, calculate the sample mean and standard deviation,using your calculator or the formulas in Chapter 2.
387.75
6
1152477
1
)(
167.196115
222
=
=
=
==
=
nn
xx
s
nx
x i
8/7/2019 10 -t dist
11/44
ExampleExampleCalculate the test statistic and find the rejection region for
=.05.
5161 38.16/387.7
15167.19
/
:freedom of Degrees :statisticTest
0
====
=
= ndf nsx
t
Rejection Region: Reject H 0 if t >2.015. If the test statistic falls inthe rejection region, its p-valuewill be less than = .05.
Rejection Region: Reject H 0 if t >2.015. If the test statistic falls inthe rejection region, its p-valuewill be less than = .05.
8/7/2019 10 -t dist
12/44
ConclusionConclusionCompare the observed test statistic to the rejection region,
and draw conclusions.
.015.2 if HReject
:RegionRejection
38.1 :statisticTest
0 >
=
t
t
Conclusion: For our example, t = 1.38 does not fall in therejection region and H 0 is not rejected. There is insufficientevidence to indicate that the average activation time is greater than 15.
15:H 15:H
a
0
>=
8/7/2019 10 -t dist
13/44
Approximating theApproximating thepp-value-value
You can only approximate the p-valuefor the test using Table 4.
Since the observed valueof t = 1.38 is smaller than t .10 = 1.476,
p-value > .10.
8/7/2019 10 -t dist
14/44
The exactThe exact pp-value-value You can get the exact p-valueusing some calculators or a computer.
One-Sample T: TimesTest of mu = 15 vs > 15
95%
Lower
Variable N Mean StDev SE Mean Bound T P
Times 6 19.1667 7.3869 3.0157 13.0899 1.38 0.113
One-Sample T: TimesTest of mu = 15 vs > 15
95%
Lower
Variable N Mean StDev SE Mean Bound T P
Times 6 19.1667 7.3869 3.0157 13.0899 1.38 0.113
p-value = .113 which
is greater than .10 aswe approximatedusing Table 4.
8/7/2019 10 -t dist
15/44
Testing the DifferenceTesting the Difference
between Two Meansbetween Two Means
normal. bemust spopulation twothesmall, are sizes sample theSince
.22
and21
variancesand 2
and 1
meanswith2 and 1 spopulation from
drawn are 2
and 1
size of samples randomtindependen 9,Chapter in As
nn
To test:
H0: 1 2 = D0 versus H a: one of threewhere D 0 is some hypothesized difference,usually 0.
8/7/2019 10 -t dist
16/44
Testing the DifferenceTesting the Difference
between Two Meansbetween Two MeansThe test statistic used in Chapter 9
does not have either a z or a t distribution, and
cannot be used for small-sample inference.We need to make one more assumption, thatthe population variances, although unknown,are equal.
2
22
1
21
21z
ns
ns
xx
+
8/7/2019 10 -t dist
17/44
Testing the DifferenceTesting the Difference
between Two Meansbetween Two MeansInstead of estimating each population varianceseparately, we estimate the common variancewith
2)1()1(
21
222
2112
++=
nnsnsn
s
+
=
21
2
021
11
nns
Dxxt has a t distributionwith n1+n2-2 degreesof freedom.
And the resultingtest statistic,
8/7/2019 10 -t dist
18/44
Estimating the DifferenceEstimating the Difference
between Two Meansbetween Two MeansYou can also create a 100(1- )% confidenceinterval for 1- 2.
2)1()1(
with21
222
2112
++=
nnsnsn
s
+
21
22/21
11)(
nnst xx
Remember the three
assumptions:1. Original populations
normal
2. Samples random andindependent
3. Equal populationvariances.
Remember the three
assumptions:1. Original populations
normal
2. Samples random andindependent
3. Equal populationvariances.
8/7/2019 10 -t dist
19/44
ExampleExample Two training procedures are compared by
measuring the time that it takes trainees toassemble a device. A different group of trainees aretaught using each method. Is there a difference in thetwo methods? Use = .01.
Time toAssemble
Method 1 Method 2
Sample size 10 12
Sample mean 35 31
Sample Std Dev 4.9 4.5
0:H 210 =
+
=
21
2
21
11
0
:statisticTest
nns
xxt
0:H 21a
8/7/2019 10 -t dist
20/44
ExampleExample Solve this problem by approximating the p-
value usingTable 4.
Time toAssemble Method 1 Method 2
Sample size 10 12
Sample mean 35 31
Sample Std Dev 4.9 4.5
99.1
121
101
942.21
3135
:statisticTest
=
+
=
t
942.2120
)5.4(11)9.4(92
)1()1(
:Calculate
22
21
222
2112
=+=+
+=nn
snsns
8/7/2019 10 -t dist
21/44
ExampleExample
value)-(21
)99.1(
)99.1()99.1( :value-
pt P
t P t P p
=>
.025 < ( p-value) < .05
.05 < p-value < .10
Since the p-value is
greater than = .01, H 0 is not rejected. There isinsufficient evidence toindicate a difference inthe population means.
.025 < ( p-value) < .05
.05 < p-value < .10
Since the p-value isgreater than = .01, H 0 is not rejected. There isinsufficient evidence toindicate a difference inthe population means.
df = n1 + n2 2 = 10 + 12 2 = 20df = n1 + n2 2 = 10 + 12 2 = 20
8/7/2019 10 -t dist
22/44
8/7/2019 10 -t dist
23/44
Testing the DifferenceTesting the Difference
between Two Meansbetween Two MeansIf the population variances cannot be assumedequal, the test statistic
has an approximate t distribution with degreesof freedom given above. This is most easilydone by computer.
2
22
1
21
21
ns
ns
xxt
+
1)/(
1)/(
2
22
22
1
21
21
2
2
2
2
1
2
1
+
+
nns
nns
nsnsdf
8/7/2019 10 -t dist
24/44
The Paired-DifferenceThe Paired-DifferenceTestTest
Sometimes the assumption of independentsamples is intentionally violated, resulting in amatched-pairsmatched-pairs or paired-difference testpaired-difference test.By designing the experiment in this way, we caneliminate unwanted variability in the experimentby analyzing only the differences,
d d ii == xx11ii xx22iito see if there is a difference in the twopopulation means, 1 2.
8/7/2019 10 -t dist
25/44
ExampleExample
One Type A and one Type B tire are randomly assignedto each of the rear wheels of five cars. Compare theaverage tire wear for types A and B using a test of hypothesis.
Car 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3
0:H
0:H
21a
210
=
But the samples are notindependent. The pairs of responses are linked becausemeasurements are taken on thesame car.
8/7/2019 10 -t dist
26/44
The Paired-DifferenceThe Paired-Difference
TestTest
.1on withdistributi-ta
on basedregionrejection aor value- theUse. s,difference theof deviation standard andmean
theare and pairs, of number where
/0
statistictesttheusing
0 :Htestwe0:HtestTo d0210
=
=
=
==
ndf
pd
sd n
nsd t
i
d
d
8/7/2019 10 -t dist
27/44
ExampleExampleCar 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3
Difference .4 .4 .5 .6 .5
0:H
0:H
21a
210
=
( ).0837
.48 Calculate
=
=
==
1
22
nnd
d s
n
d d
ii
d
i8.12
5/0837.
048.
/
0
:statisticTest
=
=
=ns
d t
d
8/7/2019 10 -t dist
28/44
ExampleExampleCar Car 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3
Difference .4 .4 .5 .6 .5
Rejection region: Reject H0
if t > 2.776 or t < -2.776.
Conclusion: Since t = 12.8, H 0 is rejected. There is a
difference in the average tirewear for the two types of tires.
8/7/2019 10 -t dist
29/44
Some NotesSome Notes
You can construct a 100(1- )% confidenceinterval for a paired experiment using
Once you have designed the experiment bypairing, you MUST analyze it as a pairedexperiment. If the experiment is not designed as apaired experiment in advance, do not use thisprocedure.
n
st d d 2/
8/7/2019 10 -t dist
30/44
Inference ConcerningInference Concerning
a Population Variancea Population VarianceSometimes the primary parameter of interestis not the population mean but rather thepopulation variance 2. We choose a randomsample of size n from a normal distribution.The sample variance s2 can be used in itsstandardized form:
2
2
2)1(
sn
=
which has a Chi-Square distribution with n - 1degrees of freedom.
8/7/2019 10 -t dist
31/44
Inference ConcerningInference Concerning
a Population Variancea Population VarianceTable 5 gives both upper and lower criticalvalues of the chi-square statistic for a given df .
For example, the value of chi-square that cuts off .05 in the upper tail of thedistribution with df = 5 is 2 =11.07.
8/7/2019 10 -t dist
32/44
Inference ConcerningInference Concerning
a Population Variancea Population Variance
.1on withdistributi square-chi a
on basedregionrejection awith)1(
statistictesttheuse we
tailedor two one :H versus:HtestTo
20
22
a20
20
=
=
=
ndf
sn
2
)2/1(
22
2
2/
2 )1()1(
:interval Confidence
8/7/2019 10 -t dist
33/44
ExampleExampleA cement manufacturer claims that his cementhas a compressive strength with a standarddeviation of 10 kg/cm 2 or less. A sample of n =10 measurements produced a mean and standarddeviation of 312 and 13.96, respectively.
A test of hypothesis:
H0: 2 = 10 (claim iscorrect)
Ha: 2 > 10 (claim is
wrong)
A test of hypothesis:
H0: 2 = 10 (claim iscorrect)
Ha: 2 > 10 (claim iswrong)
uses the test statistic:uses the test statistic:
5.17100
)96.13(910
)1( 22
22 === sn
8/7/2019 10 -t dist
34/44
8/7/2019 10 -t dist
35/44
Approximating theApproximating thep-p- valuevalue
91with)5.17( :value- 2 ==> ndf P p
.025 < p-value < .05Since the p-value is lessthan = .05, H 0 is notrejected. There issufficient evidence toreject the manufacturersclaim.
.025 < p-value < .05Since the p-value is lessthan = .05, H 0 is notrejected. There is
sufficient evidence toreject the manufacturersclaim.
8/7/2019 10 -t dist
36/44
Inference ConcerningInference Concerning
Two Population VariancesTwo Population VariancesWe can make inferences about the ratio of two population variances in the form a ratio.We choose two independent random samplesof size n1 and n2 from normal distributions.If the two population variances are equal, thestatistic
22
21
ss
F =
has an F distribution with df 1 = n1 - 1 and df 2 =
n2 - 1 degrees of freedom.
8/7/2019 10 -t dist
37/44
Inference ConcerningInference Concerning
Two Population VariancesTwo Population VariancesTable 6 gives only upper critical values of theF statistic for a given pair of df 1 and df 2.
For example, the value of F that cuts off .05 in theupper tail of the
distribution with df 1 = 5and df 2 = 8 is F =3.69.
8/7/2019 10 -t dist
38/44
Inference ConcerningInference ConcerningTwo Population VariancesTwo Population Variances
.1 and 1on withdistributi anon basedregionrejection awith
.variancessample twotheof larger theis where
statistictesttheuse wetailedor two one :H versus:HtestTo
2211
2
122
21
a22
210
==
=
=
ndf ndf F
sss
F
12
21
,22
21
22
21
,22
21 1
:interval Confidence
df df df df
F ss
F ss
8/7/2019 10 -t dist
39/44
ExampleExample
An experimenter has performed a labexperiment using two groups of rats. He wantsto test H 0: 1 = 2, but first he wants tomake sure that the population variances areequal. Standard (2) Experimental (1)
Sample size 10 11
Sample mean 13.64 12.42
Sample Std Dev 2.3 5.8
22
21a
22
210 :H versus:H
:y testPreliminar
=
8/7/2019 10 -t dist
40/44
ExampleExampleStandard (2) Experimental (1)
Sample size 10 11
Sample Std Dev 2.3 5.8
22
21a
2
2
2
10
:H
:H
=
36.63.28.5
:statis ticTest
2
2
22
21 ===
ss
F
We designate the sample with the larger standarddeviation as sample 1, to force the test statisticinto the upper tail of the F distribution.
We designate the sample with the larger standarddeviation as sample 1, to force the test statisticinto the upper tail of the F distribution.
8/7/2019 10 -t dist
41/44
ExampleExample
22
21a
2
2
2
10
:H :H
= 36.63.28.5
:statisticTest
2
2
22
21 ===
ss
F
The rejection region is two-tailed, with = .05, but we onlyneed to find the upper critical value, which has /2 = .025 toits right.
From Table 6, with df 1=10 and df 2 = 9, we reject H 0 if F >3.96.
CONCLUSION: Reject H 0. There is sufficient evidence toindicate that the variances are unequal. Do not rely on the
assumption of equal variances for your t test!
The rejection region is two-tailed, with = .05, but we onlyneed to find the upper critical value, which has /2 = .025 toits right.
From Table 6, with df 1=10 and df 2 = 9, we reject H 0 if F >
3.96.
CONCLUSION: Reject H 0. There is sufficient evidence toindicate that the variances are unequal . Do not rely on theassumption of equal variances for your t test!
8/7/2019 10 -t dist
42/44
Key ConceptsKey ConceptsI. Experimental Designs for Small SamplesI. Experimental Designs for Small Samples
1. Single random sample: The sampled populationmust be normal.2. Two independent random samples: Both sampled
populations must be normal.a. Populations have a common variance 2.b. Populations have different variances
3. Paired-difference or matched-pairs design: Thesamples are not independent.
K CK C
8/7/2019 10 -t dist
43/44
Key ConceptsKey ConceptsII. Statistical Tests of SignificanceII. Statistical Tests of Significance
1. Based on the t , F , and 2
distributions2. Use the same procedure as in Chapter 93. Rejection region critical values and significance levels:based on the t, F, and 2 distributions with the appropriate
degrees of freedom4. Tests of population parameters: a single mean, thedifference between two means, a single variance, and theratio of two variances
III. Small Sample Test StatisticsIII. Small Sample Test StatisticsTo test one of the population parameters when the sample sizesare small, use the following test statistics:
8/7/2019 10 -t dist
44/44
Key ConceptsKey Concepts