10 -t dist

  • 8/7/2019 10 -t dist

    1/44

    Introduction to Probability and Statistics, Tenth Edition

    Chapter 10: Inference from Small Samples

  • 8/7/2019 10 -t dist

    2/44

    Introduction

    When the sample size is small, the estimation and testing procedures of Chapter 8 are not appropriate.

    There are equivalent small-sample test and estimation procedures for:

    $\mu$, the mean of a normal population

    $\mu_1 - \mu_2$, the difference between two population means

    $\sigma^2$, the variance of a normal population

    $\sigma_1^2 / \sigma_2^2$, the ratio of two population variances

  • 8/7/2019 10 -t dist

    3/44

    The Sampling Distribution of the Sample Mean

    When we take a sample from a normal population, the sample mean $\bar{x}$ has a normal distribution for any sample size n, and

    $$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

    has a standard normal distribution. But if $\sigma$ is unknown, and we must use s to estimate it, the resulting statistic

    $$\frac{\bar{x} - \mu}{s / \sqrt{n}}$$

    is not normal.

  • 8/7/2019 10 -t dist

    4/44

    Student's t Distribution

    Fortunately, this statistic does have a sampling distribution that is well known to statisticians, called the Student's t distribution, with n - 1 degrees of freedom.

    $$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

    We can use this distribution to create estimation and testing procedures for the population mean $\mu$.

  • 8/7/2019 10 -t dist

    5/44

    Properties of Student's t

    Mound-shaped and symmetric about 0.

    More variable than z, with heavier tails.

    Shape depends on the sample size n or the degrees of freedom, n - 1.

    As n increases, the shapes of the t and z distributions become almost identical.

  • 8/7/2019 10 -t dist

    6/44

    Using the t-Table

    Table 4 gives the values of t that cut off selected areas in the tail of the t distribution.

    Index df and the appropriate tail area a to find $t_a$, the value of t with area a to its right.

    For a random sample of size n = 10, find a value of t that cuts off .025 in the right tail.

    Row = df = n - 1 = 9

    Column subscript = a = .025

    $t_{.025} = 2.262$

  • 8/7/2019 10 -t dist

    7/44

    Small Sample Inference for a Population Mean

    The basic procedures are the same as those used for large samples. For a test of hypothesis:

    Test $H_0: \mu = \mu_0$ versus $H_a$: one or two tailed

    using the test statistic

    $$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

    using p-values or a rejection region based on a t-distribution with df = n - 1.

  • 8/7/2019 10 -t dist

    8/44

    Small Sample Inference for a Population Mean

    For a $100(1 - \alpha)\%$ confidence interval for the population mean $\mu$:

    $$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$

    where $t_{\alpha/2}$ is the value of t that cuts off area $\alpha/2$ in the tail of a t-distribution with df = n - 1.

  • 8/7/2019 10 -t dist

    9/44

    Example

    A sprinkler system is designed so that the average time for the sprinklers to activate after being turned on is no more than 15 seconds. A test of 6 systems gave the following times:

    17, 31, 12, 17, 13, 25

    Is the system working as specified? Test using $\alpha = .05$.

    $H_0: \mu = 15$ (working as specified)

    $H_a: \mu > 15$ (not working as specified)

  • 8/7/2019 10 -t dist

    10/44

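    Example

    First, calculate the sample mean and standard deviation, using your calculator or the formulas in Chapter 2.

    $$\bar{x} = \frac{\sum x_i}{n} = \frac{115}{6} = 19.167$$

    $$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{2477 - \frac{115^2}{6}}{5}} = 7.387$$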
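The two summary statistics above can be checked with a short script. This is only a sketch using Python's standard library; the variable names are illustrative and not part of the original slides.

```python
import math
import statistics

# Activation times (seconds) from the sprinkler example.
times = [17, 31, 12, 17, 13, 25]
n = len(times)

# Sample mean: sum of the observations divided by n.
sum_x = sum(times)
mean = sum_x / n

# Computational formula for the sample standard deviation:
# s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1)
sum_x2 = sum(x * x for x in times)
s = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Cross-check against the library's sample standard deviation.
assert math.isclose(s, statistics.stdev(times))

print(f"n = {n}, sum = {sum_x}, sum of squares = {sum_x2}")
print(f"mean = {mean:.3f}, s = {s:.3f}")
```

The intermediate sums (115 and 2477) match the values that appear in the hand calculation.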

  • 8/7/2019 10 -t dist

    11/44

    Example

    Calculate the test statistic and find the rejection region for $\alpha = .05$.

    Test statistic:

    $$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{19.167 - 15}{7.387 / \sqrt{6}} = 1.38$$

    Degrees of freedom: df = n - 1 = 6 - 1 = 5

    Rejection Region: Reject $H_0$ if t > 2.015. If the test statistic falls in the rejection region, its p-value will be less than $\alpha = .05$.
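As a rough check of the test statistic, the calculation can be sketched in Python. The helper name `one_sample_t` is hypothetical, and the critical value 2.015 is simply copied from the slide rather than computed.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return (t, df) for a test of H0: mu = mu0 from raw data."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, n - 1

times = [17, 31, 12, 17, 13, 25]
t_stat, df = one_sample_t(times, mu0=15)

# Right-tailed critical value for alpha = .05 and df = 5, taken from the slide.
T_CRIT = 2.015
reject_h0 = t_stat > T_CRIT

print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```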

  • 8/7/2019 10 -t dist

    12/44

    Conclusion

    Compare the observed test statistic to the rejection region, and draw conclusions.

    $H_0: \mu = 15$ versus $H_a: \mu > 15$

    Test statistic: t = 1.38

    Rejection region: Reject $H_0$ if t > 2.015.

    Conclusion: For our example, t = 1.38 does not fall in the rejection region and $H_0$ is not rejected. There is insufficient evidence to indicate that the average activation time is greater than 15 seconds.

  • 8/7/2019 10 -t dist

    13/44

    Approximating the p-value

    You can only approximate the p-value for the test using Table 4.

    Since the observed value of t = 1.38 is smaller than $t_{.10} = 1.476$, p-value > .10.
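The table lookup described here amounts to bracketing the observed t between neighbouring critical values. A minimal sketch follows; the function name is hypothetical, and the df = 5 row is an assumption in that only 1.476 and 2.015 appear in the slides, while the other three entries are quoted from a standard t table.

```python
# Upper-tail critical values for df = 5 as (tail area, critical value).
# 1.476 and 2.015 appear in the slides; the remaining entries are assumed
# from a standard t table.
T_ROW_DF5 = [(0.10, 1.476), (0.05, 2.015), (0.025, 2.571),
             (0.01, 3.365), (0.005, 4.032)]

def bracket_upper_tail_p(t_obs, table_row):
    """Bound the right-tail p-value using one row of a t table.

    Returns (lower, upper) meaning lower < p-value < upper.
    None means the table gives no bound on that side.
    """
    lower, upper = None, None
    for area, crit in table_row:   # areas decrease, critical values increase
        if t_obs > crit:
            upper = area           # t is beyond this cutoff, so p < area
        else:
            lower = area           # t is short of this cutoff, so p > area
            break
    return lower, upper

print(bracket_upper_tail_p(1.38, T_ROW_DF5))
```

For the sprinkler statistic t = 1.38 the lower bound is .10 and there is no upper bound, which is the "p-value > .10" statement above.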

  • 8/7/2019 10 -t dist

    14/44

    The exact p-value

    You can get the exact p-value using some calculators or a computer.

    One-Sample T: Times

    Test of mu = 15 vs > 15

                                              95% Lower
    Variable  N     Mean   StDev  SE Mean       Bound     T      P
    Times     6  19.1667  7.3869   3.0157     13.0899  1.38  0.113

    p-value = .113, which is greater than .10, as we approximated using Table 4.
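The "SE Mean" and "95% Lower Bound" columns of the printout can be reproduced approximately from the other columns. This sketch assumes the one-sided bound is computed as mean minus $t_{.05}$ times the standard error, and uses the table value 2.015 from the earlier slide; software uses a more precise critical value, so the last digit may differ slightly.

```python
import math

# Summary values copied from the printout.
n = 6
mean = 19.1667
stdev = 7.3869

# Table value of t with area .05 to its right for df = 5 (from the slides).
T_05_DF5 = 2.015

# Standard error of the mean, then the one-sided 95% lower confidence bound.
se_mean = stdev / math.sqrt(n)
lower_bound = mean - T_05_DF5 * se_mean

print(f"SE Mean = {se_mean:.4f}")
print(f"95% lower bound = {lower_bound:.2f}")
```

Because the lower bound falls below 15, the one-sided interval agrees with the decision not to reject $H_0$.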

  • 8/7/2019 10 -t dist

    15/44

    Testing the Difference between Two Means

    As in Chapter 9, independent random samples of size $n_1$ and $n_2$ are drawn from populations 1 and 2 with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$.

    Since the sample sizes are small, the two populations must be normal.

    To test:

    $H_0: \mu_1 - \mu_2 = D_0$ versus $H_a$: one of three alternatives, where $D_0$ is some hypothesized difference, usually 0.

  • 8/7/2019 10 -t dist

    16/44

    Testing the Difference between Two Means

    The test statistic used in Chapter 9

    $$z \approx \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

    does not have either a z or a t distribution, and cannot be used for small-sample inference.

    We need to make one more assumption, that the population variances, although unknown, are equal.

  • 8/7/2019 10 -t dist

    17/44

    Testing the Difference between Two Means

    Instead of estimating each population variance separately, we estimate the common variance with

    $$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

    And the resulting test statistic,

    $$t = \frac{\bar{x}_1 - \bar{x}_2 - D_0}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

    has a t distribution with $n_1 + n_2 - 2$ degrees of freedom.

  • 8/7/2019 10 -t dist

    18/44

    Estimating the Difference between Two Means

    You can also create a $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$.

    $$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

    with

    $$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

    Remember the three assumptions:

    1. Original populations normal

    2. Samples random and independent

    3. Equal population variances.

  • 8/7/2019 10 -t dist

    19/44

    Example

    Two training procedures are compared by measuring the time that it takes trainees to assemble a device. A different group of trainees are taught using each method. Is there a difference in the two methods? Use $\alpha = .01$.

    Time to Assemble    Method 1    Method 2
    Sample size            10          12
    Sample mean            35          31
    Sample Std Dev         4.9         4.5

    $H_0: \mu_1 - \mu_2 = 0$

    $H_a: \mu_1 - \mu_2 \neq 0$

    Test statistic:

    $$t = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

  • 8/7/2019 10 -t dist

    20/44

    Example

    Solve this problem by approximating the p-value using Table 4.

    Time to Assemble    Method 1    Method 2
    Sample size            10          12
    Sample mean            35          31
    Sample Std Dev         4.9         4.5

    Calculate:

    $$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{9(4.9)^2 + 11(4.5)^2}{20} = 21.942$$

    Test statistic:

    $$t = \frac{35 - 31}{\sqrt{21.942\left(\frac{1}{10} + \frac{1}{12}\right)}} = 1.99$$

  • 8/7/2019 10 -t dist

    21/44

    Example

    p-value: $P(t > 1.99) + P(t < -1.99)$

    $P(t > 1.99) = \frac{1}{2}(p\text{-value})$

    df = $n_1 + n_2 - 2$ = 10 + 12 - 2 = 20

    .025 < ½(p-value) < .05

    .05 < p-value < .10

    Since the p-value is greater than $\alpha = .01$, $H_0$ is not rejected. There is insufficient evidence to indicate a difference in the population means.
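The pooled-variance calculation for the training example can be sketched from the summary statistics alone. The function name `pooled_t` and its argument names are illustrative. The two-tailed critical value 2.845 ($t_{.005}$ with df = 20) is an assumption taken from a standard t table; the slides use the p-value approach instead.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2, d0=0.0):
    """Pooled two-sample t statistic from summary statistics.

    Returns (pooled variance, t, df). Assumes equal population variances.
    """
    df = n1 + n2 - 2
    s2_pooled = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    t = (mean1 - mean2 - d0) / math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return s2_pooled, t, df

# Method 1 and Method 2 summaries from the training example.
s2_pooled, t_stat, df = pooled_t(10, 35, 4.9, 12, 31, 4.5)

# Assumed table value: t with area .005 to its right, df = 20.
T_005_DF20 = 2.845
reject_h0 = abs(t_stat) > T_005_DF20

print(f"s^2 = {s2_pooled:.3f}, t = {t_stat:.2f}, df = {df}")
print(f"reject H0 at alpha = .01: {reject_h0}")
```

The rejection-region check reaches the same decision as the p-value bracket above: $H_0$ is not rejected at $\alpha = .01$.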

  • 8/7/2019 10 -t dist

    22/44

  • 8/7/2019 10 -t dist

    23/44

    Testing the Difference between Two Means

    If the population variances cannot be assumed equal, the test statistic

    $$t \approx \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

    has an approximate t distribution with degrees of freedom given by

    $$df \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$

    This is most easily done by computer.
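Since the unequal-variance statistic and its degrees of freedom are tedious by hand, a small sketch may help. The function name `unequal_variance_t` is hypothetical. Applying it to the training-procedure summaries is purely an illustration; the slides do not run this analysis on those data.

```python
import math

def unequal_variance_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic and its approximate df."""
    v1 = sd1 ** 2 / n1   # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2   # estimated variance of the second sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Illustration only: the training-procedure summaries from the earlier example.
t_stat, df = unequal_variance_t(10, 35, 4.9, 12, 31, 4.5)
print(f"t = {t_stat:.3f}, approximate df = {df:.2f}")
```

The approximate df is generally not an integer. When working from a printed table, a common conservative choice is to round it down.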

  • 8/7/2019 10 -t dist

    24/44

    The Paired-Difference Test

    Sometimes the assumption of independent samples is intentionally violated, resulting in a matched-pairs or paired-difference test.

    By designing the experiment in this way, we can eliminate unwanted variability in the experiment by analyzing only the differences,

    $$d_i = x_{1i} - x_{2i}$$

    to see if there is a difference in the two population means, $\mu_1 - \mu_2$.

  • 8/7/2019 10 -t dist

    25/44

    Example

    One Type A and one Type B tire are randomly assigned to each of the rear wheels of five cars. Compare the average tire wear for types A and B using a test of hypothesis.

    Car        1      2      3      4      5
    Type A   10.6    9.8   12.3    9.7    8.8
    Type B   10.2    9.4   11.8    9.1    8.3

    $H_0: \mu_1 - \mu_2 = 0$

    $H_a: \mu_1 - \mu_2 \neq 0$

    But the samples are not independent. The pairs of responses are linked because measurements are taken on the same car.

  • 8/7/2019 10 -t dist

    26/44

    The Paired-Difference Test

    To test $H_0: \mu_1 - \mu_2 = 0$ we test $H_0: \mu_d = 0$ using the test statistic

    $$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$$

    where n = number of pairs, and $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences, $d_i$.

    Use the p-value or a rejection region based on a t-distribution with df = n - 1.

  • 8/7/2019 10 -t dist

    27/44

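    Example

    Car          1      2      3      4      5
    Type A     10.6    9.8   12.3    9.7    8.8
    Type B     10.2    9.4   11.8    9.1    8.3
    Difference   .4     .4     .5     .6     .5

    $H_0: \mu_1 - \mu_2 = 0$

    $H_a: \mu_1 - \mu_2 \neq 0$

    Calculate:

    $$\bar{d} = \frac{\sum d_i}{n} = .48$$

    $$s_d = \sqrt{\frac{\sum d_i^2 - \frac{(\sum d_i)^2}{n}}{n - 1}} = .0837$$

    Test statistic:

    $$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}} = \frac{.48 - 0}{.0837 / \sqrt{5}} = 12.8$$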

  • 8/7/2019 10 -t dist

    28/44

    Example

    Car          1      2      3      4      5
    Type A     10.6    9.8   12.3    9.7    8.8
    Type B     10.2    9.4   11.8    9.1    8.3
    Difference   .4     .4     .5     .6     .5

    Rejection region: Reject $H_0$ if t > 2.776 or t < -2.776.

    Conclusion: Since t = 12.8, $H_0$ is rejected. There is a difference in the average tire wear for the two types of tires.
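The paired-difference analysis of the tire data can be sketched as follows. Variable names are illustrative, and the critical value 2.776 is copied from the slide's rejection region rather than computed.

```python
import math
import statistics

# Tire wear for the five cars, from the example.
type_a = [10.6, 9.8, 12.3, 9.7, 8.8]
type_b = [10.2, 9.4, 11.8, 9.1, 8.3]

# Analyze only the within-car differences d_i = x_1i - x_2i.
diffs = [a - b for a, b in zip(type_a, type_b)]
n = len(diffs)

d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)          # sample standard deviation of d_i
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

# Two-tailed rejection region from the slide (df = n - 1 = 4).
T_CRIT = 2.776
reject_h0 = abs(t_stat) > T_CRIT

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, t = {t_stat:.1f}")
print(f"reject H0: {reject_h0}")
```

Pairing works well here because the differences vary far less than the raw wear measurements, which vary a lot from car to car.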

  • 8/7/2019 10 -t dist

    29/44

    Some Notes

    You can construct a $100(1 - \alpha)\%$ confidence interval for a paired experiment using

    $$\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$$

    Once you have designed the experiment by pairing, you MUST analyze it as a paired experiment. If the experiment is not designed as a paired experiment in advance, do not use this procedure.
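As an illustration of the interval formula, here is a sketch of a 95% confidence interval for the mean tire-wear difference. The slides do not compute this interval; the summary values and $t_{.025} = 2.776$ (df = 4) are taken from the tire example, and the function name is hypothetical.

```python
import math

def paired_confidence_interval(d_bar, s_d, n, t_crit):
    """Return (lower, upper) for d_bar +/- t_crit * s_d / sqrt(n)."""
    half_width = t_crit * s_d / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width

# Summary values from the tire example: d_bar = .48, s_d = .0837, n = 5.
lower, upper = paired_confidence_interval(0.48, 0.0837, 5, t_crit=2.776)
print(f"95% CI for the mean difference: ({lower:.3f}, {upper:.3f})")
```

The interval lies entirely above 0, which is consistent with rejecting $H_0: \mu_d = 0$ in the two-tailed test.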

  • 8/7/2019 10 -t dist

    30/44

    Inference Concerning a Population Variance

    Sometimes the primary parameter of interest is not the population mean $\mu$ but rather the population variance $\sigma^2$. We choose a random sample of size n from a normal distribution.

    The sample variance $s^2$ can be used in its standardized form:

    $$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$

    which has a Chi-Square distribution with n - 1 degrees of freedom.

  • 8/7/2019 10 -t dist

    31/44

    Inference Concerning a Population Variance

    Table 5 gives both upper and lower critical values of the chi-square statistic for a given df.

    For example, the value of chi-square that cuts off .05 in the upper tail of the distribution with df = 5 is $\chi^2 = 11.07$.

  • 8/7/2019 10 -t dist

    32/44

    Inference Concerning a Population Variance

    To test $H_0: \sigma^2 = \sigma_0^2$ versus $H_a$: one or two tailed

    we use the test statistic

    $$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$

    with a rejection region based on a chi-square distribution with df = n - 1.

    Confidence interval:

    $$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{(1 - \alpha/2)}}$$

  • 8/7/2019 10 -t dist

    33/44

    Example

    A cement manufacturer claims that his cement has a compressive strength with a standard deviation of 10 kg/cm² or less. A sample of n = 10 measurements produced a mean and standard deviation of 312 and 13.96, respectively.

    A test of hypothesis:

    $H_0: \sigma^2 = 100$, that is, $\sigma = 10$ (claim is correct)

    $H_a: \sigma^2 > 100$, that is, $\sigma > 10$ (claim is wrong)

    uses the test statistic:

    $$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{9(13.96)^2}{100} = 17.5$$

  • 8/7/2019 10 -t dist

    34/44

  • 8/7/2019 10 -t dist

    35/44

    Approximating the p-value

    p-value: $P(\chi^2 > 17.5)$ with df = n - 1 = 9

    .025 < p-value < .05

    Since the p-value is less than $\alpha = .05$, $H_0$ is rejected. There is sufficient evidence to reject the manufacturer's claim.
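The chi-square statistic for the cement example, and the bracket on its p-value, can be sketched as below. The two critical values for df = 9 (16.919 for upper-tail area .05 and 19.023 for .025) are assumptions quoted from a standard chi-square table; they are not printed in the slides.

```python
# Summary values from the cement example.
n = 10
s = 13.96            # sample standard deviation
sigma0_sq = 10 ** 2  # hypothesized variance (claimed sigma = 10)

# Standardized form of the sample variance under H0.
chi_sq = (n - 1) * s ** 2 / sigma0_sq

# Assumed table values for df = 9 (upper-tail areas .05 and .025).
CHI2_05_DF9 = 16.919
CHI2_025_DF9 = 19.023

# Right-tailed test at alpha = .05.
reject_h0 = chi_sq > CHI2_05_DF9
p_between_025_and_05 = CHI2_05_DF9 < chi_sq < CHI2_025_DF9

print(f"chi-square = {chi_sq:.2f}, reject H0: {reject_h0}")
print(f".025 < p-value < .05: {p_between_025_and_05}")
```

The statistic lands between the two table values, matching the bracket .025 < p-value < .05, so $H_0$ is rejected at $\alpha = .05$.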

  • 8/7/2019 10 -t dist

    36/44

    Inference Concerning Two Population Variances

    We can make inferences about two population variances in the form of a ratio. We choose two independent random samples of size $n_1$ and $n_2$ from normal distributions.

    If the two population variances are equal, the statistic

    $$F = \frac{s_1^2}{s_2^2}$$

    has an F distribution with $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$ degrees of freedom.

  • 8/7/2019 10 -t dist

    37/44

    Inference Concerning Two Population Variances

    Table 6 gives only upper critical values of the F statistic for a given pair of $df_1$ and $df_2$.

    For example, the value of F that cuts off .05 in the upper tail of the distribution with $df_1 = 5$ and $df_2 = 8$ is F = 3.69.

  • 8/7/2019 10 -t dist

    38/44

    Inference Concerning Two Population Variances

    To test $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a$: one or two tailed

    we use the test statistic

    $$F = \frac{s_1^2}{s_2^2}$$

    where $s_1^2$ is the larger of the two sample variances,

    with a rejection region based on an F distribution with $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$.

    Confidence interval:

    $$\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{df_1, df_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2} \cdot F_{df_2, df_1}$$

  • 8/7/2019 10 -t dist

    39/44

    Example

    An experimenter has performed a lab experiment using two groups of rats. He wants to test $H_0: \mu_1 = \mu_2$, but first he wants to make sure that the population variances are equal.

                      Standard (2)    Experimental (1)
    Sample size           10                11
    Sample mean          13.64             12.42
    Sample Std Dev        2.3               5.8

    Preliminary test:

    $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 \neq \sigma_2^2$

  • 8/7/2019 10 -t dist

    40/44

    Example

                      Standard (2)    Experimental (1)
    Sample size           10                11
    Sample Std Dev        2.3               5.8

    $H_0: \sigma_1^2 = \sigma_2^2$

    $H_a: \sigma_1^2 \neq \sigma_2^2$

    Test statistic:

    $$F = \frac{s_1^2}{s_2^2} = \frac{5.8^2}{2.3^2} = 6.36$$

    We designate the sample with the larger standard deviation as sample 1, to force the test statistic into the upper tail of the F distribution.

  • 8/7/2019 10 -t dist

    41/44

    Example

    $H_0: \sigma_1^2 = \sigma_2^2$

    $H_a: \sigma_1^2 \neq \sigma_2^2$

    Test statistic:

    $$F = \frac{s_1^2}{s_2^2} = \frac{5.8^2}{2.3^2} = 6.36$$

    The rejection region is two-tailed, with $\alpha = .05$, but we only need to find the upper critical value, which has $\alpha/2 = .025$ to its right.

    From Table 6, with $df_1 = 10$ and $df_2 = 9$, we reject $H_0$ if F > 3.96.

    CONCLUSION: Reject $H_0$. There is sufficient evidence to indicate that the variances are unequal. Do not rely on the assumption of equal variances for your t test!
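The preliminary variance check can be sketched in a few lines. The helper `f_ratio` is hypothetical; it places the larger sample variance in the numerator, as the slides describe, and the critical value 3.96 is copied from the slide.

```python
def f_ratio(sd_a, n_a, sd_b, n_b):
    """Return (F, df1, df2) with the larger sample variance in the numerator."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Standard group: s = 2.3, n = 10. Experimental group: s = 5.8, n = 11.
f_stat, df1, df2 = f_ratio(2.3, 10, 5.8, 11)

# Upper critical value with area .025 to its right, from the slide.
F_CRIT = 3.96
reject_h0 = f_stat > F_CRIT

print(f"F = {f_stat:.2f}, df1 = {df1}, df2 = {df2}, reject H0: {reject_h0}")
```

Swapping the samples inside the helper means the caller does not need to know in advance which group is more variable.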

  • 8/7/2019 10 -t dist

    42/44

    Key Concepts

    I. Experimental Designs for Small Samples

    1. Single random sample: The sampled population must be normal.

    2. Two independent random samples: Both sampled populations must be normal.
       a. Populations have a common variance $\sigma^2$.
       b. Populations have different variances.

    3. Paired-difference or matched-pairs design: The samples are not independent.

  • 8/7/2019 10 -t dist

    43/44

    Key Concepts

    II. Statistical Tests of Significance

    1. Based on the t, F, and $\chi^2$ distributions

    2. Use the same procedure as in Chapter 9

    3. Rejection region: critical values and significance levels based on the t, F, and $\chi^2$ distributions with the appropriate degrees of freedom

    4. Tests of population parameters: a single mean, the difference between two means, a single variance, and the ratio of two variances

    III. Small Sample Test Statistics

    To test one of the population parameters when the sample sizes are small, use the following test statistics:

  • 8/7/2019 10 -t dist

    44/44

    Key Concepts