QUANT SEM 4


  • 7/28/2019 QUANT SEM 4

    1/26

    QUANTITATIVE DATA ANALYSIS

    HARSHAD BAJPAI


    A statistic is a number summarizing a set of values.

    Simple or univariate statistics summarize values of one variable.

    Effect or outcome statistics summarize the relationship between values of two or more variables.

    Simple statistics for numeric variables

    Mean: the average

    Standard deviation: the typical variation

    Standard error of the mean: the typical variation in the mean with repeated sampling

    Multiply by √(sample size) to convert to standard deviation.
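
As a minimal sketch of the simple statistics for a numeric variable, the Python below computes the mean, the standard deviation, and the standard error of the mean, then multiplies the standard error by the square root of the sample size to get back to the standard deviation. The data values are made up for illustration, and the sample (n - 1) form of the standard deviation is an assumption.

```python
import math
import statistics

# Hypothetical sample values, made up for illustration.
values = [4.0, 6.0, 8.0, 10.0, 12.0]

n = len(values)
mean = statistics.mean(values)   # the average
sd = statistics.stdev(values)    # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(n)          # standard error of the mean

# Multiplying the standard error by sqrt(n) recovers the standard deviation.
sd_from_sem = sem * math.sqrt(n)

print(f"mean={mean:.2f}, sd={sd:.3f}, sem={sem:.3f}, sd_from_sem={sd_from_sem:.3f}")
```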


    Simple statistics for nominal variables

    Frequencies, proportions, or odds.

    Can also use these for ordinal variables.
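
A short sketch of these statistics for a nominal variable, assuming a made-up list of yes/no responses. The odds of a level are taken here as p / (1 - p), where p is the proportion of that level.

```python
from collections import Counter

# Hypothetical nominal data, made up for illustration.
responses = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]

frequencies = Counter(responses)   # count of each level
total = len(responses)
proportions = {level: count / total for level, count in frequencies.items()}

# Odds of a level = p / (1 - p). This would divide by zero if one level
# made up the whole sample, which is not the case for this data.
odds = {level: p / (1 - p) for level, p in proportions.items()}

for level in frequencies:
    print(level, frequencies[level], proportions[level], round(odds[level], 3))
```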

    Effect statistics are those that depict the relationship between two or more variables, or how one affects the other, e.g. correlation coefficient, regression, etc.

    Test statistics are used to test hypotheses.


    Common descriptive statistics

    Count (frequencies)

    Percentage

    Mean

    Mode

    Median

    Range

    Standard deviation

    Variance

    Ranking
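
The descriptive statistics listed above can be sketched with Python's standard `statistics` module. The scores are hypothetical, and the sample (n - 1) forms of the variance and standard deviation are an assumption.

```python
import statistics

# Hypothetical scores, made up for illustration.
scores = [12, 15, 15, 18, 20, 22, 15, 30]

count = len(scores)                        # count (frequency)
mean = statistics.mean(scores)
mode = statistics.mode(scores)             # most frequent value
median = statistics.median(scores)         # middle value of the sorted data
value_range = max(scores) - min(scores)    # range
sd = statistics.stdev(scores)              # sample standard deviation
variance = statistics.variance(scores)     # sample variance (sd squared)

# Percentage of scores equal to the mode.
pct_mode = 100 * scores.count(mode) / count

# Ranking: scores ordered from highest to lowest.
ranking = sorted(scores, reverse=True)

print(count, mean, mode, median, value_range, pct_mode)
print(ranking)
```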


    Model: numeric vs numeric, e.g. body fat vs sum of skinfolds

    Model or test: linear regression

    Effect statistics:

    slope and intercept = parameters

    correlation coefficient or variance explained (= 100 × correlation²) = measures of goodness of fit

    Other statistics:

    typical or standard error of the estimate = residual error = best measure of validity (with criterion variable on the Y axis)
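
The quantities on this slide can be sketched with an ordinary least-squares fit. The skinfold and body-fat numbers below are invented for illustration, and dividing the residual sum of squares by n - 2 for the standard error of the estimate is the usual convention, assumed here.

```python
import math

# Hypothetical data (made up): sum of skinfolds (mm) and body fat (%).
skinfolds = [20.0, 30.0, 40.0, 50.0, 60.0]
body_fat = [10.0, 14.0, 17.0, 22.0, 25.0]

n = len(skinfolds)
mean_x = sum(skinfolds) / n
mean_y = sum(body_fat) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in skinfolds)
syy = sum((y - mean_y) ** 2 for y in body_fat)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(skinfolds, body_fat))

# Effect statistics: the parameters of the fitted line.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Goodness of fit: correlation coefficient and variance explained (%).
r = sxy / math.sqrt(sxx * syy)
variance_explained = 100 * r ** 2

# Standard error of the estimate: typical size of the residual error.
residuals = [y - (intercept + slope * x) for x, y in zip(skinfolds, body_fat)]
see = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"r={r:.4f}, variance explained={variance_explained:.2f}%, SEE={see:.3f}")
```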


    Correlation

    Correlation is a statistical tool that studies the relationship between two variables.

    Correlation analysis involves the various methods and techniques used for studying and measuring the extent of the relationship between the two variables.

    Two variables are said to be correlated if a change in one of the variables is accompanied by a change in the other variable.


    Types of Correlation

    There are two important types of correlation.

    They are

    (1) Positive and Negative correlation and

    (2) Linear and Non Linear correlation.


    Positive and Negative Correlation

    If the values of the two variables deviate in the same direction, i.e. if an increase (or decrease) in the values of one variable results, on average, in a corresponding increase (or decrease) in the values of the other variable, the correlation is said to be positive.


    Examples

    Some examples of series with positive correlation are:

    (i) Heights and weights;

    (ii) Household income and expenditure;

    (iii) Price and supply of commodities;

    (iv) Amount of rainfall and yield of crops.


    Negative correlation

    Correlation between two variables is said to be negative or inverse if the variables deviate in opposite directions, that is, if an increase (or decrease) in the values of one variable results, on average, in a corresponding decrease (or increase) in the values of the other variable.


    Examples

    Some examples of series of negative

    correlation are:

    (i) Volume and pressure of a perfect gas;

    (ii) Current and resistance (keeping the voltage constant);

    (iii) Price and demand of goods.


    The relationship between two variables is said to be non-linear if, corresponding to a unit change in one variable, the other variable does not change at a constant rate but changes at a fluctuating rate. In such cases, if the data is plotted on a graph sheet, we will not get a straight line.

    For example, one may have a relation of the form y = a + bx + cx² or a more general polynomial.


    The Coefficient of Correlation

    One of the most widely used statistics is the coefficient of correlation r, which measures the degree of association between the values of two related variables given in the data set. It takes values from -1 to +1. If two sets of data have r = +1, they are said to be perfectly positively correlated; if r = -1, they are said to be perfectly negatively correlated; and if r = 0, they are uncorrelated.
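
A minimal sketch of the coefficient of correlation, assuming the usual Pearson product-moment formula (the slides do not name a specific formula). The helper `pearson_r` and the data are illustrative only. The last case shows that a purely quadratic relation, symmetric about zero, gives r = 0 even though the variables are clearly related, because r measures only linear association.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical x values, made up for illustration.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]

r_positive = pearson_r(x, [2 * v + 1 for v in x])    # y rises with x on a line
r_negative = pearson_r(x, [-3 * v + 5 for v in x])   # y falls with x on a line
r_quadratic = pearson_r(x, [v ** 2 for v in x])      # non-linear, symmetric

print(r_positive, r_negative, r_quadratic)
```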


    variance


    In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. In this context, value a is considered more "extreme" than b if a is less likely to occur under the null. One often "rejects the null hypothesis" when the p-value is less than the significance level α (Greek alpha), which is often 0.05 or 0.01. When the null hypothesis is rejected, the result is said to be statistically significant.
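
As an illustration of this definition, the sketch below computes a p-value for an observed z statistic. It assumes the test statistic is standard normal under the null hypothesis and that "extreme" means far from zero in either direction (a two-sided test). Both are assumptions, as is the observed value of 2.5.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05        # significance level
z_observed = 2.5    # hypothetical observed test statistic

p_value = two_sided_p_value(z_observed)
significant = p_value < alpha   # reject the null hypothesis when True

print(f"p-value={p_value:.4f}, significant at alpha={alpha}: {significant}")
```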


    Null hypothesis H0: μ ≤ 24

    Alternative hypothesis Ha: μ > 24

    Other combinations of hypotheses are formed similarly.


    Errors

    Type I error: rejecting H0 when H0 is true

    Type II error: accepting H0 when Ha is true


    Level of significance

    Level of significance is the probability of

    making a Type-I error.

    Denoted by the symbol α.

    The person doing the hypothesis testing specifies the value of α.
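
The link between the level of significance and the Type I error can be checked with a small simulation. This is a sketch under stated assumptions: an upper-tail z test with known σ, samples drawn from a normal population in which H0 is actually true, and made-up values for the mean, σ, sample size, and number of trials. The fraction of samples that reject H0 should come out close to α.

```python
import math
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

alpha = 0.05                       # chosen level of significance
mu0, sigma, n = 24.0, 4.0, 25      # hypothetical; H0 is true in this simulation
trials = 4000

z_critical = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value
standard_error = sigma / math.sqrt(n)

rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu0) / standard_error
    if z > z_critical:
        rejections += 1            # H0 rejected although it is true: Type I error

type_i_rate = rejections / trials
print(f"observed Type I error rate: {type_i_rate:.3f} (alpha = {alpha})")
```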


    Value of alpha

    If the cost of making a Type I error is high, lower values of α are preferred.

    If the cost is low, higher values are preferred.

    For example, in the case of critical components in automobiles, the value of α should be low, to ensure that errors are not tolerated.


    Standard error

    Standard error is the standard deviation of the sampling distribution of the sample mean.

    S.E. = σ / √n

    SE is the standard deviation of the mean (x̄); it is called the standard error to signify how much the mean varies from sample to sample.
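
A brief sketch of the standard error of the mean, σ / √n, showing how it shrinks as the sample size grows. The population standard deviation of 10 and the sample sizes are hypothetical values chosen for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for population SD sigma and sample size n."""
    return sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation

# Quadrupling the sample size halves the standard error.
for n in (4, 16, 64, 100):
    print(f"n={n:>3}  SE={standard_error(sigma, n):.2f}")
```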


    One-tailed test

    Lower-tail test: left

    Upper-tail test: right

    Z = standard normal variate / test statistic

    Z = (x̄ − μ) / σx̄, where σx̄ is the standard error of the mean

    Z = −1 means the value of x̄ is one standard error below the mean.

    Z = −2 means the value of x̄ is two standard errors below the mean.
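
A minimal sketch of the z test statistic, assuming a known population standard deviation. The function name `z_statistic` and the numbers are hypothetical, chosen so that the sample mean lies one standard error below the hypothesized mean and Z comes out at about −1.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Number of standard errors by which the sample mean differs from mu0."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

# Hypothetical values: standard error = 4 / sqrt(25) = 0.8.
z = z_statistic(sample_mean=23.2, mu0=24.0, sigma=4.0, n=25)
print(f"z = {z:.2f}")
```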


    Criterion

    p-value: a probability, computed using z, that provides the criterion for accepting or rejecting H0.


    Two approaches

    1) p-value approach

    2) critical value approach
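
The two approaches lead to the same decision, which the sketch below illustrates for an upper-tail z test. The observed statistic of 1.9 and α = 0.05 are made-up values, and a standard normal test statistic is assumed.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05
z_observed = 1.9   # hypothetical test statistic for an upper-tail test

# 1) p-value approach: reject H0 if the upper-tail probability is below alpha.
p_value = 1 - std_normal.cdf(z_observed)
reject_by_p_value = p_value < alpha

# 2) critical value approach: reject H0 if z exceeds the critical value.
z_critical = std_normal.inv_cdf(1 - alpha)
reject_by_critical_value = z_observed > z_critical

print(f"p-value={p_value:.4f}, z_critical={z_critical:.3f}")
print(reject_by_p_value, reject_by_critical_value)
```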


    Criteria

    Reject H0 if p-value < α (level of significance).

    The p-value is called the observed level of significance.

    The p-value calculation depends upon the type of test.


    Lower tail test

    For a lower-tail test, the p-value is the probability of obtaining a value of the test statistic at least as small as that provided by the sample.
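
For a lower-tail z test this probability is the standard normal cumulative distribution evaluated at the observed statistic. The sketch below assumes a standard normal test statistic and uses a hypothetical observed value of −2.

```python
from statistics import NormalDist

def lower_tail_p_value(z):
    """P(Z <= z) for a standard normal Z: the lower-tail p-value."""
    return NormalDist().cdf(z)

z_observed = -2.0   # hypothetical: sample mean two standard errors below the mean
p_value = lower_tail_p_value(z_observed)

# The decision depends on the chosen level of significance.
reject_at_05 = p_value < 0.05
reject_at_01 = p_value < 0.01

print(f"p-value={p_value:.4f}, reject at 0.05: {reject_at_05}, reject at 0.01: {reject_at_01}")
```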