Stat 1 Report F 202


  • 8/3/2019 Stat 1 Report F 202


    TERM PAPER

    On

    Dispersion: A statistical tool

    Department of Finance

    University of Dhaka

    Submitted To

    Dr. M. Khairul Hossain
    Professor
    Department of Finance
    University of Dhaka

    Submitted By

    GROUP-2
    Apel Mahmood Rifat 15-007
    Sumaiya Amena 15-051
    Shakira Mahzabeen 15-085
    Khairul Bashar 15-153

    Date of Submission: July 24, 2010

    Letter of Transmittal

    July 24, 2010

    Dr. M. Khairul Hossain

    Professor

    Department of Finance

    University of Dhaka.

    Subject: Submission of report named Dispersion: A statistical tool

    Dear Sir,

    We are pleased to submit the report that you assigned to us as a partial requirement for the course Business Statistics I (F-202). The report is prepared on Dispersion: A statistical tool.

    We sincerely hope that you will enjoy going through this report, as we took great pleasure in preparing it. If any further information is required for clarification, we will be pleased to provide it.

    We thank you heartily. We have tried our best to make this report as good as possible, and we think it can serve as a tool for solving business decision problems.

    Finally, we would like to thank you for giving us the opportunity to work on such an interesting report; we enjoyed preparing it and learned a lot in the process.

    Sincerely,

    Apel Mahmood Rifat

    On behalf of group 2

    15th batch, Section-A

    Department of Finance

    University of Dhaka

    Acknowledgement

    We cannot claim all the credit for the completion of this study. Many people helped us by providing valuable information, advice and guidance so that the report could be completed on schedule.

    A course report is an essential part of the BBA program, as one can gather practical knowledge within a short period of time by observing and working on the chosen topic. In this regard our report has been prepared on Dispersion.

    First, we offer our thanks to almighty Allah for helping us to complete all the work.

    We would like to express our gratitude to our supervising course teacher, Prof. Dr. M. Khairul Hossain, who instructed us in the right way and gave us proper guidelines for preparing this report.

    Finally, we must mention the wonderful working environment and group commitment that enabled us to accomplish and observe a great deal during our time.

    Table of Content

    01. Executive Summary
    02. Introduction
    03. Macroenvironmental Forces
    04. Pharmaceuticals Industry
    05. Square Pharmaceuticals Ltd.
    06. Impact of Macroenvironment in Square Pharma
    07. Beximco Pharmaceuticals Ltd.
    08. Macro environmental factors affecting Beximco Pharma to launch a new product
    09. Findings
    10. Conclusion
    11. Reference
    12. Bibliography


    Introduction

    Origin of the report

    This report was prepared under the academic supervision of our course teacher, Prof. Dr. M. Khairul Hossain, Department of Finance, University of Dhaka, as a requirement of the Business Statistics course. The topic is Dispersion: A statistical tool.

    Methodology

    The methodology of the report is inductive. The report is based on secondary information.

    Secondary Information: The secondary sources of data are various reference books, websites, etc.

    Key Parts of the report

    The main aim of the report is to discuss dispersion as a statistical tool. Different measures of dispersion and their uses are discussed in this report.

    Objectives of the report

    Broad Objectives: The main objective of the study is to evaluate the impact of macro-environment forces on the decision to launch a new product in the pharmaceutical industry.

    Specific Objectives:

    To become acquainted with the pharmaceutical industry

    To gain a clear understanding of macro-environment forces

    To learn about the new product launching process of the pharmaceutical industry

    To gain practical knowledge of marketing theory

    Scope

    In this report, we first cover the preliminary concept of dispersion. We then go on to the classification of dispersion factors on launching a new product of a pharmaceuticals company.

    Limitations

    We faced certain limitations in preparing the report.

    Unavoidable conditions:

    Some unavoidable conditions had a deterring effect on the preparation of the report.

    Restrictions that we faced:

    Lack of information, lack of technology, etc. restricted our work.

    Absence of some information regarding data compilation:

    We faced problems while collecting data. Some essential information was hard to collect.


    Introduction

    While measures of central tendency are used to estimate "normal" values of a dataset,

    measures of dispersion are important for describing the spread of the data, or its variation

    around a central value. Two distinct samples may have the same mean or median, but

    completely different levels of variability, or vice versa. A proper description of a set of data

    should include both of these characteristics. There are various methods that can be used to

    measure the dispersion of a dataset, each with its own set of advantages and disadvantages.

    In statistics, statistical dispersion (also called statistical variability or variation) is variability or spread in a variable or a probability distribution. Common examples of measures of statistical dispersion are the variance, standard deviation and interquartile range.

    Measures of dispersion express quantitatively the degree of variation or dispersion of values in a population or in a sample. Along with measures of central tendency, measures of dispersion are widely used in practice as descriptive statistics. Some measures of dispersion are the standard deviation, the average deviation, the range and the interquartile range.

    For example, the dispersion in the sample of 5 values (98, 99, 100, 101, 102) is smaller than the dispersion in the sample (80, 90, 100, 110, 120), although both samples have the same central location, 100, as measured by, say, the mean or the median. Most measures of dispersion would be 10 times greater for the second sample than for the first one (although the values themselves may be different for different measures of dispersion).
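    The comparison of the two samples can be checked numerically. The following is a minimal sketch in Python using only the standard library; the helper names `value_range` and `mean_abs_deviation` are our own and are introduced purely for illustration.

```python
from statistics import mean, pstdev

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

def mean_abs_deviation(values):
    """Average absolute deviation of the values from their mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

first = [98, 99, 100, 101, 102]
second = [80, 90, 100, 110, 120]

measures = [
    ("range", value_range),
    ("mean deviation", mean_abs_deviation),
    ("standard deviation", pstdev),
]

# Both samples share the same mean, but every measure of spread
# is ten times larger for the second sample.
for name, measure in measures:
    a, b = measure(first), measure(second)
    print(f"{name}: first={a:.3f}, second={b:.3f}, ratio={b / a:.1f}")
```

    The variance is the exception to the "10 times" statement: because it is measured in squared units, it is 100 times greater for the second sample.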

    Dispersion is contrasted with location or central tendency, and together they are the most

    used properties of distributions.

    A measure of statistical dispersion is a real number that is zero if all the data are identical,

    and increases as the data becomes more diverse. It cannot be less than zero.

    Most measures of dispersion have the same scale as the quantity being measured. In other

    words, if the measurements have units, such as meters or seconds, the measure of dispersion

    has the same units.


    Importance

    A study of dispersion gives us additional information about the composition of the data. The mean alone will not provide this vital information.

    Central tendency only gives information on the location of the data. Dispersion describes the spread of the data. In addition, shape should also be part of the description of the data. So location, spread and shape together are the best measures to describe data.

    Two different sets of data can have different means but the same variability. On the other hand, two sets of data can have the same mean but different variability.

    [Figure: Curves A and B have the same mean but different variability.]

    [Figure: Curves A and B have different means but the same variability.]


    Variability or variation is connected with human life, and its study is very important for mankind. The total area of the earth may not be very important to a research-minded person, but the area under different crops, the area covered by forests, and the area covered by residential and commercial buildings are figures of great importance because they keep changing from time to time and from place to place. A very large number of experts are engaged in the study of changing phenomena. Experts working in different countries of the world keep watch on the forces responsible for bringing changes in the fields of human interest. Agricultural, industrial and mineral production and its transportation from one part of the world to another are matters of great interest to economists, statisticians and other experts. Changes in human population, in the standard of living, in the literacy rate and in prices lead experts to make detailed studies of them and then relate these changes to human life.

    The study of dispersion is very important in statistical data. It is used to:

    Test the reliability of an average

    Control the variability

    Compare two or more sets of data with respect to their variability

    Facilitate the use of other statistical techniques

    If the wages of workers in a certain factory are consistent, the workers will be satisfied. But if some workers have high wages and some have low wages, there will be unrest among the low-paid workers, and they might go on strike and arrange demonstrations. If in a certain country some people are very poor and some are very rich, we say there is economic disparity. It means that dispersion is large.

    The idea of dispersion is important in the study of wages of workers, prices of commodities, standard of living of different people, distribution of wealth, distribution of land among farmers and various other fields of life.

    Measures of dispersion are known as averages of the second order because they indicate the

    average deviation of individual observations from the mean.

    Measures of dispersion can be described in two forms:

    1. Absolute form

    2. Relative form

    The measures in each form are classified as follows:

    Measures of Dispersion

    Absolute Form: Range, Quartile Deviation, Mean Deviation, Standard Deviation

    Relative Form: Coefficient of Range, Coefficient of Quartile Deviation, Coefficient of Mean Deviation, Coefficient of Variation

    Range: Of the several measures of dispersion, the range is the first measure of the absolute form. It is the difference between the largest and the smallest values in a data set, and it is known as the simplest measure of dispersion. However, the range only provides information about the maximum and minimum values and says nothing about the values in between. In the form of an equation, after arranging the data in order, it is:

    $$\text{Range} = \text{Largest value} - \text{Smallest value}$$

    The range is widely used in statistical process control (SPC) applications because it is very easy to calculate and understand.

    Quartile Deviation: The quartile deviation is half the difference between the upper and lower quartiles in a distribution. It is a measure of the spread through the middle half of a distribution. It can be useful because it is not influenced by extremely high or extremely low scores. Quartile deviation is an ordinal statistic and is most often used in conjunction with the median. The formula to calculate quartile deviation is:

    $$QD = \frac{Q_3 - Q_1}{2}$$

    Where, $QD$ = Quartile Deviation

    $Q_3$ = Third Quartile

    $Q_1$ = First Quartile
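    The range and the quartile deviation can be sketched in a few lines of Python. The data are the six share prices used in the report's coefficient-of-range example. The quartile convention is an assumption: `statistics.quantiles` defaults to the "exclusive" method, and other textbooks or packages may place the quartiles slightly differently. The function names are illustrative.

```python
from statistics import quantiles

def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

def quartile_deviation(values):
    """Half the distance between the third and first quartiles.

    Assumption: quartiles follow the default 'exclusive' method of
    statistics.quantiles; other conventions give slightly different results.
    """
    q1, _median, q3 = quantiles(values, n=4)
    return (q3 - q1) / 2

prices = [200, 210, 208, 160, 220, 250]  # Monday to Saturday share prices

print("Range:", value_range(prices))
print("Quartile deviation:", quartile_deviation(prices))
```

    Note that the two extreme prices (160 and 250) determine the range completely, while the quartile deviation depends only on the middle half of the ordered data.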


    Mean Deviation: A defect of the range is that it is based on only two values, the highest and the

    lowest. It does not take into consideration all of the values. The mean deviation does. It measures

    the mean amount by which the values in a population or sample vary from their mean. In terms of a

    definition, mean deviation is the arithmetic mean of the absolute values of the deviations from the

    arithmetic mean. The formula is:-

    $$MD = \frac{\sum |X - \bar{X}|}{n}$$

    Where, $X$ is the value of each observation

    $\bar{X}$ is the arithmetic mean of the values

    $n$ is the number of observations in the sample

    $|\;|$ indicates the absolute value
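    As a small illustration of the mean deviation formula, the sketch below applies it to the sample (80, 90, 100, 110, 120) from the Introduction. The function name `mean_deviation` is our own choice.

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    x_bar = sum(values) / n
    return sum(abs(x - x_bar) for x in values) / n

sample = [80, 90, 100, 110, 120]

# Deviations from the mean (100) are 20, 10, 0, 10, 20, so MD = 60 / 5.
print(mean_deviation(sample))
```

    Unlike the range, every observation contributes to the result.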

    Standard Deviation: The variance and the standard deviation are also based on the deviations from the mean. However, instead of using the absolute value of the deviations, the variance and the standard deviation square the deviations.

    Features of standard deviation are as follows:

    The standard deviation is the square root of the sample variance. It is defined so that it can be used to make inferences about the population variance.

    It is calculated using the formula:

    $$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

    The values computed in the squared term, $x_i - \bar{x}$, are anomalies, which are discussed in another section.

    It is not restricted to large sample data sets, compared to the root mean square anomaly.

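    Variance: The arithmetic mean of the squared deviations from the mean is known as the variance. The variance is nonnegative and is zero only if all the observations are the same. The formula is:

    $$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

    Where, $\mu$ is the arithmetic mean and $N$ is the number of observations.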
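    The difference between dividing by the number of observations and dividing by one less than that number is easy to see in code. The sketch below uses Python's standard `statistics` module, where `pvariance` and `pstdev` divide by N (population form) and `variance` and `stdev` divide by n - 1 (sample form). The data are the sample (80, 90, 100, 110, 120) from the Introduction.

```python
from statistics import pstdev, pvariance, stdev, variance

sample = [80, 90, 100, 110, 120]

# Squared deviations from the mean (100): 400, 100, 0, 100, 400 (sum 1000).
print("Population variance (divide by N):  ", pvariance(sample))
print("Sample variance (divide by n - 1):  ", variance(sample))
print("Population standard deviation:      ", round(pstdev(sample), 3))
print("Sample standard deviation:          ", round(stdev(sample), 3))
```

    For small data sets the two forms differ noticeably; as the number of observations grows, the difference becomes negligible.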


    Measures of Relative Dispersion

    A measure of relative variation is the ratio of a measure of absolute variation to an average. It is sometimes called a coefficient of variation, because a coefficient is a pure number that is independent of the unit of measurement. It should be remembered that, when computing the relative variation, the average used as the base should be the same one from which the absolute variations were measured.

    The relative variations are:

    Coefficient of range

    Coefficient of mean deviation

    Coefficient of quartile deviation

    Coefficient of variation

    Coefficient of range

    The relative measure corresponding to the range, called the coefficient of range, is obtained by applying the following formula:

    $$\text{Coefficient of range} = \frac{L - S}{L + S}$$

    where $L$ is the largest value and $S$ is the smallest value.

    In a frequency distribution, the range is calculated by taking the difference between the upper limit of the highest class and the lower limit of the lowest class.

    Example:

    The following are the prices of shares of a company from Monday to Saturday:

    | Day       | Price |
    |-----------|-------|
    | Monday    | 200   |
    | Tuesday   | 210   |
    | Wednesday | 208   |
    | Thursday  | 160   |
    | Friday    | 220   |
    | Saturday  | 250   |

    Solution:

    Here, largest value $L = 250$ and smallest value $S = 160$.

    $$\text{Range} = L - S = 250 - 160 = 90$$

    $$\text{Coefficient of range} = \frac{L - S}{L + S} = \frac{250 - 160}{250 + 160} = \frac{90}{410} \approx 0.2195$$
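    The share price example can be reproduced with a short Python sketch. The function name `coefficient_of_range` is illustrative; the data are the six prices given in the example.

```python
def coefficient_of_range(values):
    """(L - S) / (L + S), where L is the largest and S the smallest value."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

prices = {
    "Monday": 200, "Tuesday": 210, "Wednesday": 208,
    "Thursday": 160, "Friday": 220, "Saturday": 250,
}

coefficient = coefficient_of_range(prices.values())
print(f"Coefficient of range = {coefficient:.4f}")
```

    Because the numerator and denominator carry the same unit, the result is a pure number and can be compared across series measured in different units.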

    Coefficient of quartile deviation

    The relative measure corresponding to the quartile deviation, called the coefficient of quartile deviation, is calculated as follows:

    $$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

    Coefficient of quartile deviation can be used to compare the degree of variation in

    different distributions.

    Coefficient of mean deviation

    The relative measure corresponding to the mean deviation, called the coefficient of mean deviation, is calculated as follows:

    $$\text{Coefficient of mean deviation} = \frac{\text{Mean deviation}}{\text{Mean or median}}$$

    If the mean has been used in calculating the mean deviation, the coefficient of mean deviation is obtained by dividing the mean deviation by the mean.

    Coefficient of variation

    The relative measure corresponding to the standard deviation is called the coefficient of variation. This measure, developed by Karl Pearson, is the most commonly used measure of relative variation. It is used in problems where we want to compare the variability of two or more series. The coefficient of variation, denoted by C.V., is obtained as follows:

    $$C.V. = \frac{\sigma}{\bar{X}} \times 100$$

    where $\sigma$ is the standard deviation and $\bar{X}$ is the arithmetic mean.
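    A minimal Python sketch of the coefficient of variation follows. It assumes the population form of the standard deviation (`statistics.pstdev`), and the function name is our own. The two series are the samples from the Introduction.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean.

    Assumption: the population standard deviation (divide by N) is used.
    """
    return pstdev(values) / mean(values) * 100

series_a = [98, 99, 100, 101, 102]
series_b = [80, 90, 100, 110, 120]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

print(f"C.V. of series A = {cv_a:.2f}%")
print(f"C.V. of series B = {cv_b:.2f}%")
print("More consistent series:", "A" if cv_a < cv_b else "B")
```

    The series with the smaller C.V. is the more consistent one, whatever units the two series are measured in.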

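    Percentile: When the data are arranged in ascending order, the percentiles are the 99 values that divide the data into one hundred equal parts.

    $$P_i = \text{value of the } \frac{i(n+1)}{100}\text{th observation}; \quad i = 1, 2, 3, \ldots, 99$$

    If $\frac{i(n+1)}{100}$ is a fraction, with integer part $k$ and fractional part $d$, then

    $$P_i = k\text{th value} + d \times \left((k+1)\text{th value} - k\text{th value}\right)$$

    For a frequency distribution,

    $$P_i = L + \frac{\frac{iN}{100} - F}{f} \times c$$

    where $L$ is the lower limit of the percentile class, $N$ is the total frequency, $F$ is the cumulative frequency before the percentile class, $f$ is the frequency of the percentile class and $c$ is the class width.

    Example: Find the 50th percentile of 2, 4, 6, 8, 10, 12, 14, 16, 18.

    Solution: Here, n = 9

    $$P_{50} = \frac{50(9+1)}{100}\text{th value} = 5\text{th value} = 10$$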
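    The position rule used in the example can be sketched in Python as follows. This is an illustration of the i(n + 1)/100 position rule with linear interpolation for fractional positions; clamping positions that fall outside the data to the first or last value is our own assumption, and statistical packages may use other conventions.

```python
import math

def percentile(values, i):
    """i-th percentile using the position i * (n + 1) / 100.

    A fractional position is resolved by linear interpolation between the
    two neighbouring ordered values. Positions before the first or after
    the last observation are clamped (an assumption of this sketch).
    """
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / 100
    k = math.floor(position)   # integer part of the position
    d = position - k           # fractional part of the position
    if k < 1:
        return ordered[0]
    if k >= n:
        return ordered[-1]
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])

data = [2, 4, 6, 8, 10, 12, 14, 16, 18]

print(percentile(data, 50))  # position 5, the 5th value
print(percentile(data, 25))  # position 2.5, halfway between 4 and 6
```

    The same function gives the deciles (i = 10, 20, ..., 90) and the quartiles (i = 25, 50, 75).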

    Decile: In descriptive statistics, a decile is any of the nine values that divide the sorted data

    into ten equal parts, so that each part represents 1/10 of the sample or population. Thus:

    The 1st decile cuts off the lowest 10% of data, i.e., the 10th percentile.

    The 5th decile cuts off the lowest 50% of data, i.e., the 50th percentile, 2nd quartile, or median.

    The 9th decile cuts off the lowest 90% of data, i.e., the 90th percentile.


    Empirical Rule:

    The empirical rule provides significant information about the distribution of data around the mean when the data are approximately normal.

    1. The mean ± one standard deviation contains approximately 68.26% of the measurements in the series.

    2. The mean ± two standard deviations contains approximately 95.5% of the measurements in the series.

    3. The mean ± three standard deviations contains approximately 99.7% of the measurements in the series.

    Climatologists often use standard deviations to help classify abnormal climatic

    conditions. The chart below describes the abnormality of a data value by how many

    standard deviations it is located away from the mean. The probabilities in the third

    column assume the data is normally distributed.

    | Standard Deviations Away From Mean | Abnormality            | Probability of Occurrence |
    |------------------------------------|------------------------|---------------------------|
    | beyond -3 sd                       | extremely subnormal    | 0.15%                     |
    | -3 to -2 sd                        | greatly subnormal      | 2.35%                     |
    | -2 to -1 sd                        | subnormal              | 13.5%                     |
    | -1 to +1 sd                        | normal                 | 68.0%                     |
    | +1 to +2 sd                        | above normal           | 13.5%                     |
    | +2 to +3 sd                        | greatly above normal   | 2.35%                     |
    | beyond +3 sd                       | extremely above normal | 0.15%                     |

    Source: Oliver, John E. Climatology: Selected Applications. p 45.
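    The probabilities in the chart are rounded values for a normal distribution. As a rough check, the sketch below computes the corresponding probabilities from the standard normal distribution using `math.erf` from the Python standard library. The function name `prob_within` is illustrative.

```python
import math

def prob_within(k):
    """P(|Z| < k) for a standard normal variable Z."""
    return math.erf(k / math.sqrt(2))

bands = {
    "-1 to +1 sd": prob_within(1),
    "+1 to +2 sd": (prob_within(2) - prob_within(1)) / 2,
    "+2 to +3 sd": (prob_within(3) - prob_within(2)) / 2,
    "beyond +3 sd": (1 - prob_within(3)) / 2,
}

# By symmetry, each band on the negative side has the same probability.
for label, probability in bands.items():
    print(f"{label}: {probability * 100:.2f}%")
```

    The computed values are close to, but not identical with, the rounded figures in the chart, which follow the 68 / 95 / 99.7 approximation.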

    Chebyshev's Theorem: A large standard deviation reveals that the observations are widely scattered about the mean. The Russian mathematician P. L. Chebyshev (1821-1894) developed a theorem that allows us to determine the minimum proportion of the values that lie within a specified number of standard deviations of the mean. For example, according to Chebyshev's Theorem, at least three of four values, or 75 percent, must lie between the mean plus two standard deviations and the mean minus two standard deviations. This relationship applies regardless of the shape of the distribution. Further, at least eight of nine values, or 88.9 percent, will lie between plus three standard deviations and minus three standard deviations of the mean. At least 24 of 25 values, or 96 percent, will lie between plus and minus five standard deviations of the mean.

    For any set of observations, the proportion of the values that lie within $k$ standard deviations of the mean is at least $1 - 1/k^2$, where $k$ is any constant greater than 1.
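    The three figures quoted above follow directly from the bound. A minimal Python sketch, with an illustrative function name, is:

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is informative only for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3, 5):
    proportion = chebyshev_min_proportion(k)
    print(f"k = {k}: at least {proportion * 100:.1f}% of the values")
```

    These are lower bounds that hold for any distribution. For bell-shaped data the empirical rule gives much higher proportions for the same number of standard deviations.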


    Which Measure of Variation to Use

    The choice of a suitable measure of dispersion depends on the following factors:

    1. The type of data available: If observations are few in number, avoid the standard deviation. If they are generally skewed, avoid the mean deviation as well. If they have gaps around the quartiles, the quartile deviation should be avoided. If there are open-end classes, the quartile measure of variation should be preferred.

    2. The purpose of investigation: In an elementary treatment of a statistical series in which a measure of variability is desired only for itself, any of the three measures, namely range, mean deviation and quartile deviation, would be acceptable. Probably the mean deviation would be superior. In usual practice, however, the measure of variability is employed in further statistical analysis. For such a purpose, the standard deviation is by far the most widely used. It is free from the defects from which the other measures suffer. It lends itself to the analysis of variability in terms of the normal curve of error. Practically all advanced statistical methods deal with variability and centre around the standard deviation. Hence, unless the circumstances warrant the use of another measure, we should use the standard deviation for measuring variability.

    The measures of dispersion are compared in the following table:

    | Characteristics                  | Range        | Quartile Deviation | Mean Deviation | Standard Deviation |
    |----------------------------------|--------------|--------------------|----------------|--------------------|
    | Clear Definition                 | Yes          | Yes                | Yes            | Yes                |
    | Easily Understandable            | Yes          | Yes                | No             | Yes                |
    | Determination Procedures         | Easy         | Average            | Average        | Not that easy      |
    | For further Algebraic Process    | Not Eligible | Not Eligible       | Not Eligible   | Eligible           |
    | Usage of all items in a data set | No           | No                 | Yes            | Yes                |
    | Effect of extreme values         | Yes          | Not much           | Yes            | Not much           |
    | Effect of sample fluctuations    | No           | Not much           | Not much       | Not much           |

    From the above discussion, it is seen that the standard deviation satisfies almost all the characteristics of an ideal measure of dispersion. Therefore, we can say that the standard deviation is the ideal measure of dispersion.


    Some practical applications of Measures of Dispersion:

    The following data show the lifetime of laptops of two different brands.

    | Life time (Years) | No. of laptops (Dell) | No. of laptops (HP) |
    |-------------------|-----------------------|---------------------|
    | 2-4               | 20                    | 15                  |
    | 4-6               | 15                    | 20                  |
    | 6-8               | 25                    | 20                  |
    | 8-10              | 30                    | 25                  |
    | 10-12             | 35                    | 15                  |

    i. Which of the brands shows a greater lifetime?

    ii. Which of the brands would you prefer if the prices were the same? Why?

    Solution:

    The brand that has the greater mean has the greater lifetime. If the prices were the same, the brand with less variability is to be preferred. The brand with the smaller coefficient of variation has less variability.

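    At first let us take Dell,

    | Life time (Years) | Dell (f) | X  | fX  | fX²  |
    |-------------------|----------|----|-----|------|
    | 2-4               | 20       | 3  | 60  | 180  |
    | 4-6               | 15       | 5  | 75  | 375  |
    | 6-8               | 25       | 7  | 175 | 1225 |
    | 8-10              | 30       | 9  | 270 | 2430 |
    | 10-12             | 35       | 11 | 385 | 4235 |
    | Total             | 125      |    | 965 | 8445 |

    $$\bar{X} = \frac{\sum fX}{N} = \frac{965}{125} = 7.72$$

    $$\sigma = \sqrt{\frac{\sum fX^2}{N} - \bar{X}^2} = \sqrt{\frac{8445}{125} - (7.72)^2} \approx 2.82$$

    $$C.V. = \frac{\sigma}{\bar{X}} \times 100 = \frac{2.82}{7.72} \times 100 \approx 36.53\%$$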

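    Next let us take HP,

    | Life time (Years) | HP (f) | X  | fX  | fX²  |
    |-------------------|--------|----|-----|------|
    | 2-4               | 15     | 3  | 45  | 135  |
    | 4-6               | 20     | 5  | 100 | 500  |
    | 6-8               | 20     | 7  | 140 | 980  |
    | 8-10              | 25     | 9  | 225 | 2025 |
    | 10-12             | 15     | 11 | 165 | 1815 |
    | Total             | 95     |    | 675 | 5455 |

    $$\bar{X} = \frac{\sum fX}{N} = \frac{675}{95} \approx 7.11$$

    $$\sigma = \sqrt{\frac{\sum fX^2}{N} - \bar{X}^2} = \sqrt{\frac{5455}{95} - \left(\frac{675}{95}\right)^2} \approx 2.63$$

    $$C.V. = \frac{\sigma}{\bar{X}} \times 100 = \frac{2.63}{7.11} \times 100 \approx 36.99\%$$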


    Dell has a mean of 7.72 years and HP has a mean of 7.11 years. As Dell has the greater mean, Dell has the greater lifetime.

    The coefficient of variation of Dell is 36.53% and that of HP is 36.99%. If the prices were the same, Dell would be preferable, as it has less variability, which indicates better quality and higher consistency.
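    The laptop comparison can be reproduced with a short Python sketch for grouped data. It assumes, as in the tables above, that each class is represented by its midpoint and that the standard deviation is computed in the population form. The function name `grouped_stats` is illustrative. Because the code does not round intermediate values, its coefficients of variation differ slightly in the second decimal place from the hand-rounded figures, but the conclusion is the same.

```python
import math

def grouped_stats(midpoints, frequencies):
    """Mean, standard deviation and C.V. (%) for a frequency distribution.

    Assumptions: each class is represented by its midpoint, and the
    standard deviation uses the population form sqrt(sum(fX^2)/N - mean^2).
    """
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    mean_of_squares = sum(f * x * x for f, x in zip(frequencies, midpoints)) / n
    sd = math.sqrt(mean_of_squares - mean ** 2)
    return mean, sd, sd / mean * 100

midpoints = [3, 5, 7, 9, 11]   # class midpoints of 2-4, 4-6, ..., 10-12 years
dell = [20, 15, 25, 30, 35]
hp = [15, 20, 20, 25, 15]

dell_mean, dell_sd, dell_cv = grouped_stats(midpoints, dell)
hp_mean, hp_sd, hp_cv = grouped_stats(midpoints, hp)

print(f"Dell: mean = {dell_mean:.2f}, sd = {dell_sd:.2f}, C.V. = {dell_cv:.2f}%")
print(f"HP:   mean = {hp_mean:.2f}, sd = {hp_sd:.2f}, C.V. = {hp_cv:.2f}%")
print("Longer average lifetime:", "Dell" if dell_mean > hp_mean else "HP")
print("Less relative variability:", "Dell" if dell_cv < hp_cv else "HP")
```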

    Application of Empirical Rule:

    A sample of the rental rates at University Park Apartments approximates a symmetrical, bell-shaped distribution. The sample mean is 500 taka and the standard deviation is 20 taka. Using the empirical rule, we have to determine:

    1. About 68 percent of the monthly rental rates are between what two amounts?

    2. About 95 percent of the monthly rental rates are between what two amounts?

    3. Almost all of the monthly rental rates are between what two amounts?

    Solution:

    1. About 68 percent of the monthly rental rates are between

    $$\bar{X} \pm 1s = 500 \pm 1(20)$$

    That is, 480 and 520 taka.

    2. About 95 percent of the monthly rental rates are between

    $$\bar{X} \pm 2s = 500 \pm 2(20)$$

    That is, 460 and 540 taka.

    3. Almost all (99.7 percent) are between

    $$\bar{X} \pm 3s = 500 \pm 3(20)$$

    That is, 440 and 560 taka.
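    The three intervals can also be generated programmatically. The following is a minimal Python sketch; the function name `empirical_interval` is our own, and the coverage percentages are the approximate empirical-rule values for bell-shaped data.

```python
def empirical_interval(mean, sd, k):
    """Interval from mean - k*sd to mean + k*sd."""
    return mean - k * sd, mean + k * sd

mean_rent, sd_rent = 500, 20   # taka, from the example above
coverage = {1: "about 68%", 2: "about 95%", 3: "almost all (99.7%)"}

for k, share in coverage.items():
    low, high = empirical_interval(mean_rent, sd_rent, k)
    print(f"{share} of the values lie between {low} and {high} taka")
```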