
Statistics for Management

Sukhbir Kaur
Processing of Data
 Editing
– Field editing: checking the forms in the field. For example, fields left
blank, illegible writing, abbreviations used.
– Central editing: done after all the forms have been collected. For
example, an entry in the wrong place, or an entry recorded in months
when it should be in weeks.
 Coding: assigning numerals to answers
 Classification
 Tabulation
 Using percentages
Presentation of Data
 Classification of data
– Geographical, i.e. region-wise. For example, population by region.
– Chronological, i.e. time-wise. For example, sales by year.
– Qualitative, i.e. according to attributes. For example, the attribute
education: primary, middle, higher secondary, university, etc.; or
population on the basis of employment: employed and unemployed.
– Quantitative, i.e. according to magnitudes. For example, classifying
employees by salary.
Presentation of Data
 Quantitative data can be further classified as discrete or
continuous.
– Discrete data: the variable takes only specific, countable values. For
example, no. of employees, no. of machines.
– Continuous data: the variable can take any value within a range. For
example, weight, distance, volume.
Discrete frequency distribution
No. of employees    No. of companies
110                 25
120                 35
130                 70
140                 100
150                 18
160                 12
Continuous frequency distribution
Age (years)         No. of workers
20-25               15
25-30               22
30-35               38
35-40               47
40-45               18
45-50               10
Construction of a discrete frequency
distribution
 A sample of 50 families was surveyed to find the number of
children per family. Make a discrete frequency distribution
table for the data below:

3 2 2 1 3 4 2 1 3 4 5 0 2
1 2 3 3 2 1 1 2 3 0 3 2 1
4 3 5 5 4 3 6 5 4 3 1 0 6
5 4 3 1 2 0 1 2 3 4 5
Construction of a discrete frequency
distribution
No. of children    Frequency (no. of families, from counts/tally marks)
0                  4
1                  9
2                  10
3                  12
4                  7
5                  6
6                  2
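The tallying step can be sketched in a few lines of Python. This is only an illustration of the counting procedure, not part of the original slides; the variable names `children` and `frequency` are chosen here for the example.

```python
from collections import Counter

# Number of children in each of the 50 surveyed families (data from the slide).
children = [
    3, 2, 2, 1, 3, 4, 2, 1, 3, 4, 5, 0, 2,
    1, 2, 3, 3, 2, 1, 1, 2, 3, 0, 3, 2, 1,
    4, 3, 5, 5, 4, 3, 6, 5, 4, 3, 1, 0, 6,
    5, 4, 3, 1, 2, 0, 1, 2, 3, 4, 5,
]

# Counter tallies how many times each distinct value occurs.
frequency = Counter(children)

print("No. of children  No. of families")
for value in sorted(frequency):
    print(f"{value:<16} {frequency[value]}")
print("Total families:", sum(frequency.values()))
```

The counts reproduce the frequency column of the table, and the frequencies add up to the sample size of 50.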
Construction of a continuous
frequency distribution
 Class limits: the lowest and highest values that can be
included in the class. For example, in 60-69, 60 is the
lower limit and 69 the upper limit.
 Class interval: width of a class = upper limit - lower
limit
 Class frequency: no. of observations falling within
a particular class.
 Class mid point: value lying midway between the
upper and lower class limits = (lower limit + upper limit)/2.
Construction of a continuous
frequency distribution
 Type of class intervals
 Exclusive: the upper limit of each class is excluded from that class.
Sales       No. of companies
20-25k      20
25-30k      28
30-35k      35
A firm with sales of 25k will be included in the class 25-30k.

 Inclusive: both the lower and upper limits are included in
the class itself.
Sales            No. of companies
20,000-24,999    20
25,000-29,999    28
Construction of a continuous
frequency distribution
 Type of class intervals
 Open end: in an open-end distribution, the lower
limit of the first class and the upper limit of the last
class are not given. For example, income:
Less than 5000
5000-10000
10000-15000
15000-20000
More than 20000
Construction of a continuous
frequency distribution
 Cumulative frequency (c.f.): running total of the class frequencies.
Example: salary of employees per month (Rs thousands) and no. of employees.
Salary    No.    c.f.    r.f. (%)
10-12     5      5       2.5
12-14     14     19      7
14-16     23     42      11.5
16-18     50     92      25
18-20     52     144     26
20-22     25     169     12.5
22-24     22     191     11
24-26     7      198     3.5
26-28     2      200     1

 Relative frequency (r.f.): the percentage of observations in each class,
e.g. (5/200)*100 = 2.5 for the first class.
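As a minimal sketch, the c.f. and r.f. columns can be derived from the class frequencies alone. The list names below are illustrative and not from the slides.

```python
from itertools import accumulate

# Salary classes (Rs thousands per month) and number of employees, from the table.
classes = ["10-12", "12-14", "14-16", "16-18", "18-20",
           "20-22", "22-24", "24-26", "26-28"]
freqs = [5, 14, 23, 50, 52, 25, 22, 7, 2]

total = sum(freqs)                           # N = 200
cum_freqs = list(accumulate(freqs))          # running total = cumulative frequency
rel_freqs = [100 * f / total for f in freqs] # each class as a percentage of N

print("Salary   No.   c.f.   r.f.(%)")
for label, f, cf, rf in zip(classes, freqs, cum_freqs, rel_freqs):
    print(f"{label:<8} {f:<5} {cf:<6} {rf:g}")
```

The last cumulative frequency equals N, and the relative frequencies add up to 100.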
Measures of Central Tendency
 A single value that gives an idea of the entire data set. For
example, average income.
 Useful for comparing two or more sets of data. For
example, average sales of Aug and Sep.
 Properties of measure of central tendency:
– Easy to understand
– Simple to compute
– Based on all observations
– Uniquely defined
– Not unduly affected by extreme values
– Capable of further mathematical treatment.
Measures of Central Tendency
 Following common measures:
– Arithmetic mean
– Weighted arithmetic mean= ∑ (wx)/ ∑ w, w=weights
– Median
– Mode
– Quartiles
 Arithmetic mean $\bar{X} = \frac{\sum X}{N}$ for ungrouped data
 and $\bar{X} = \frac{\sum fX}{N}$ for grouped data,
 where $N = \sum f$, X is the mid point of each class and f is the
frequency of the corresponding class.
Exercise on arithmetic mean
 Find the arithmetic mean of monthly sales (Rs in lakhs):
Monthly sales    No. of firms
300-350          5
350-400          14
400-450          23
450-500          50
500-550          52
550-600          25
600-650          22
650-700          7
700-750          2
Exercise on arithmetic mean
 To simplify the calculation, use
 $\bar{X} = A + \frac{\sum fd}{N} \times i$
 where A is an arbitrary point (assumed mean), $d = (X - A)/i$ and i is the
size of the class interval, so that $X = A + id$.
Let A = 525, i = 50.
Sales      Mid point (X)    f     d=(X-525)/50    fd
300-350    325              5     -4              -20
350-400    375              14    -3              -42
400-450    425              23    -2              -46
450-500    475              50    -1              -50
500-550    525              52    0               0
550-600    575              25    1               25
600-650    625              22    2               44
650-700    675              7     3               21
700-750    725              2     4               8
$\sum f = 200$, $\sum fd = -60$
$\bar{X} = 525 + (-60/200) \times 50 = 510$
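The shortcut can be checked against the direct grouped-data formula with a short sketch. The variable names are illustrative; the data and the assumed mean A = 525 are those of the exercise.

```python
# Monthly sales classes (Rs lakhs) from the exercise: lower limits and frequencies.
lower_limits = [300, 350, 400, 450, 500, 550, 600, 650, 700]
freqs = [5, 14, 23, 50, 52, 25, 22, 7, 2]
width = 50           # i, size of each class interval
assumed_mean = 525   # A, the arbitrary origin used in the slide

midpoints = [lo + width / 2 for lo in lower_limits]
n = sum(freqs)

# Direct method: mean = sum(f * X) / N
direct_mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Step-deviation method: mean = A + (sum(f * d) / N) * i, with d = (X - A) / i
deviations = [(x - assumed_mean) / width for x in midpoints]
sum_fd = sum(f * d for f, d in zip(freqs, deviations))
shortcut_mean = assumed_mean + (sum_fd / n) * width

print("sum fd        =", sum_fd)
print("direct mean   =", direct_mean)
print("shortcut mean =", shortcut_mean)
```

Both methods give the same mean of 510, since the step deviation only shifts and rescales the mid points before averaging.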
Median
 Median is that value which divides distribution into two
equal parts. 50% of observations are above and 50% below
the value of median.
 For example, income: 1100, 1200, 1350, 1500, 1550, 1600,
1800.
 Median income is 1500: arrange the values in ascending order
and pick the ((n+1)/2)th value when n is odd (here the 4th of 7).
 Suppose an eighth person with an income of Rs 1850 joins.
 Median = arithmetic mean of the incomes of the 4th and 5th
persons = (1500+1550)/2 = 1525
 That is, when n is even, find the (n/2)th and ((n/2)+1)th values
and take their mean.
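The odd and even cases can be sketched as a small function. The function name `median` is chosen for this example; the standard-library `statistics.median` is used only as a cross-check.

```python
import statistics


def median(values):
    """Median of ungrouped data: middle value, or mean of the two middle values."""
    ordered = sorted(values)          # arrange in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # the ((n+1)/2)th value, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2   # mean of (n/2)th and (n/2+1)th


incomes = [1100, 1200, 1350, 1500, 1550, 1600, 1800]
print(median(incomes))                # seven incomes: the 4th value
print(median(incomes + [1850]))       # eight incomes: mean of the 4th and 5th

# Cross-check against the standard library.
print(statistics.median(incomes), statistics.median(incomes + [1850]))
```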
Median
 For grouped data,
 Median = L + {(N/2 – pcf)/f} * i
 Where L=lower limit of median class
 pcf=preceding cumulative frequency to the median class
 f=frequency of median class
 i=size of median class
 Median is not affected by extreme values.
 Find the median age from the age distribution of workers:
Age (years)    No. of workers
< 25           120
25-30          125
30-35          180
35-40          160
40-45          150
45-50          140
50-55          100
> 55           25
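One possible working of this exercise with the grouped-data formula is sketched below. It assumes exclusive class intervals of width 5 and relies on the median class being a closed class, so the open-ended classes "< 25" and "> 55" do not affect the result.

```latex
\begin{align*}
N &= 120 + 125 + 180 + 160 + 150 + 140 + 100 + 25 = 1000, \qquad N/2 = 500 \\
\text{c.f.} &: 120,\ 245,\ 425,\ 585,\ \dots \quad\Rightarrow\quad \text{median class is } 35\text{-}40 \\
L &= 35, \quad \mathrm{pcf} = 425, \quad f = 160, \quad i = 5 \\
\text{Median} &= L + \frac{N/2 - \mathrm{pcf}}{f} \times i
             = 35 + \frac{500 - 425}{160} \times 5 \approx 37.34 \text{ years}
\end{align*}
```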
Quartiles
 Quartiles, deciles and percentiles are related positional measures,
useful for describing non-central locations in a distribution.
 Quartiles: divide data into four equal parts. Three points
divide the distribution into 4 equal parts.
 Q1 = first quartile, 25% level
 Q2 = median, 50% level
 Q3 = third quartile, 75% level
 Qj = L + {(jN/4 – pcf)/f} * i, for j = 1, 2, 3
 Where L=lower limit of quartile class
 pcf=preceding cumulative frequency to the quartile class
 f=frequency of quartile class
 i=size of quartile class
Deciles
 Deciles: divide data into ten equal parts
 Dk = L + {(kN/10 – pcf)/f} * i, for k = 1, 2, 3, ..., 9
 Where L=lower limit of decile class
 pcf=preceding cumulative frequency to the decile class
 f=frequency of decile class
 i=size of decile class

 Percentiles: divide data into hundred equal parts
 Pl = L + {(lN/100 – pcf)/f} * i, for l = 1, 2, 3, ..., 99
 Where L=lower limit of percentile class
 pcf=preceding cumulative frequency to the percentile class
 f=frequency of percentile class
 i=size of percentile class
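The quartile, decile and percentile formulas share one pattern: locate the class containing the (jN/parts)th observation, then interpolate inside it. The sketch below illustrates that pattern; the function name `grouped_quantile` and its parameters are hypothetical, and the data are the monthly sales classes from the arithmetic mean exercise, assumed to have equal width.

```python
from itertools import accumulate


def grouped_quantile(lower_limits, freqs, width, j, parts):
    """j-th of `parts` quantile: L + ((j*N/parts - pcf) / f) * i."""
    n = sum(freqs)
    target = j * n / parts                  # position jN/4, kN/10 or lN/100
    cum_freqs = list(accumulate(freqs))
    for k, cf in enumerate(cum_freqs):
        if cf >= target:                    # first class whose c.f. reaches the target
            pcf = cum_freqs[k - 1] if k > 0 else 0
            return lower_limits[k] + (target - pcf) / freqs[k] * width
    raise ValueError("j must not exceed parts")


# Monthly sales (Rs lakhs): lower class limits, frequencies, class width.
lower_limits = [300, 350, 400, 450, 500, 550, 600, 650, 700]
freqs = [5, 14, 23, 50, 52, 25, 22, 7, 2]
width = 50

q1 = grouped_quantile(lower_limits, freqs, width, 1, 4)
q2 = grouped_quantile(lower_limits, freqs, width, 2, 4)     # the median
q3 = grouped_quantile(lower_limits, freqs, width, 3, 4)
d8 = grouped_quantile(lower_limits, freqs, width, 8, 10)
p99 = grouped_quantile(lower_limits, freqs, width, 99, 100)
print(q1, round(q2, 2), q3, d8, p99)
```

For these data the sketch gives Q1 = 458, a median of about 507.69, Q3 = 562, D8 = 582 and P99 = 700. Passing `j` and `parts` as integers avoids rounding the target position before the class is located.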
Quartiles
 Cumulative frequency distribution

[Figure: cumulative frequency curve. Vertical axis: cumulative frequency (0 to 100); horizontal axis: Profits (Rs lakhs). Marked on the curve: Q1, median, Q3, 8th decile and 99th percentile.]


Mode
 Mode is the value which occurs most often or with the
greatest frequency.
 For grouped data, Mode = L + {d1/(d1+d2)} * i
 L = lower limit of modal class, i = size of modal class
 d1 = difference between frequency of modal class and
frequency of preceding class
 d2 = difference between frequency of modal class and
frequency of succeeding class.
 Mode is the most typical value of the distribution. For
example, we talk about the modal size of a shoe or garment.
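A minimal sketch of the grouped-data mode formula follows. The function name `grouped_mode` is hypothetical, the data are the monthly sales classes from the arithmetic mean exercise, and treating a missing neighbouring class as having frequency 0 is an assumption made here, not something stated in the slides.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode = L + d1 / (d1 + d2) * i for equal-width classes."""
    k = max(range(len(freqs)), key=lambda idx: freqs[idx])   # modal class index
    # Assumption: a missing neighbour (first or last class) counts as frequency 0.
    preceding = freqs[k - 1] if k > 0 else 0
    succeeding = freqs[k + 1] if k < len(freqs) - 1 else 0
    d1 = freqs[k] - preceding
    d2 = freqs[k] - succeeding
    return lower_limits[k] + d1 / (d1 + d2) * width


# Monthly sales (Rs lakhs): the modal class is 500-550 with frequency 52.
lower_limits = [300, 350, 400, 450, 500, 550, 600, 650, 700]
freqs = [5, 14, 23, 50, 52, 25, 22, 7, 2]

mode = grouped_mode(lower_limits, freqs, 50)
print(round(mode, 2))
```

Here d1 = 52 - 50 = 2 and d2 = 52 - 25 = 27, so the mode lies close to the lower limit of the modal class, at about 503.45.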
Mode
 In symmetrical distribution,
 Mean=Median=Mode
 In a moderately skewed distribution (empirical relation),
 Mean - Median ≈ (1/3)(Mean - Mode)
 or Mode ≈ 3 Median - 2 Mean
[Figure: two skewed distribution curves with the positions of mode, median and mean marked.]
Measures of Variation and Skewness
 A measure of variation describes the spread of individual
values around central value. For example, readings of
temperature:
Furnace 1 Furnace 2 Furnace 3
500 deg C 450 deg C 200 deg C
500 deg C 500 deg C 300 deg C
500 deg C 550 deg C 400 deg C
Mean =500 deg C Mean = 500 deg C Mean = 500 deg C
 These data sets have the same measure of central tendency but
different variation.
 Four measures of variation:
– Range
– Average or mean absolute deviation
– Quartile deviation
– Standard deviation
Measures of Variation and Skewness
 Range = H - L, where H is the highest and L the lowest value
 In above example, R1=0, R2=100, R3=700.
 For grouped data, the range may be approximated as the
difference between the upper limit of the highest class and the
lower limit of the lowest class.
 Coefficient of Range = (H-L)/(H+L)
 Mean Absolute Deviation
 $\mathrm{MAD} = \frac{\sum |x - \bar{x}|}{N}$ for ungrouped data
 $\mathrm{MAD} = \frac{\sum f\,|x - \bar{x}|}{N}$ for grouped data
 Coefficient of MAD = MAD / Mean
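The range and mean absolute deviation for ungrouped data can be sketched as below. The function names are illustrative; the readings are those of Furnace 1 and Furnace 2, with Furnace 2 taken as 450, 500 and 550 deg C, which agrees with its stated mean of 500 and range of 100.

```python
def value_range(values):
    """Range = H - L."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """(H - L) / (H + L)."""
    high, low = max(values), min(values)
    return (high - low) / (high + low)


def mean_absolute_deviation(values):
    """MAD = sum(|x - mean|) / N for ungrouped data."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


furnace_1 = [500, 500, 500]   # temperature readings in deg C
furnace_2 = [450, 500, 550]

for name, readings in (("Furnace 1", furnace_1), ("Furnace 2", furnace_2)):
    mean = sum(readings) / len(readings)
    mad = mean_absolute_deviation(readings)
    print(name,
          "range:", value_range(readings),
          "coeff. of range:", coefficient_of_range(readings),
          "MAD:", round(mad, 2),
          "coeff. of MAD:", round(mad / mean, 4))
```

Both furnaces have a mean of 500 deg C, but only Furnace 2 shows any spread: a range of 100 and a MAD of about 33.33.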
Measures of Variation and Skewness
 Quartile Deviation = (Q3-Q1)/2
 Coefficient of Q.D=(Q3-Q1)/(Q3+Q1)
 This is superior to the range, as it is based on the middle 50% of
observations and ignores the extreme values. It is the only one of
these measures that can be computed for an open-ended distribution.
 <Exercise> A survey of domestic consumption of electricity gave the
following distribution of units consumed. Compute the quartile deviation
and its coefficient.
No. of units     No. of consumers
Below 200        9
200-400          18
400-600          27
600-800          32
800-1000         45
1000-1200        38
1200-1400        20
1400 & above     11
Measures of Variation and Skewness
 Standard deviation

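 $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$

 Variance $= \sigma^2$

 For grouped data, $\sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{N}}$

 For convenience in calculation,

 $\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i$

 where $d = (x - A)/i$, A being the assumed mean and i the class interval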
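As a sketch, the definitional and shortcut forms of the grouped-data standard deviation can be compared numerically. The example reuses the monthly sales distribution from the arithmetic mean exercise with A = 525 and i = 50; the variable names are illustrative.

```python
from math import sqrt

# Monthly sales (Rs lakhs): lower class limits and frequencies.
lower_limits = [300, 350, 400, 450, 500, 550, 600, 650, 700]
freqs = [5, 14, 23, 50, 52, 25, 22, 7, 2]
width = 50           # i
assumed_mean = 525   # A

midpoints = [lo + width / 2 for lo in lower_limits]
n = sum(freqs)

# Definitional form: sigma = sqrt(sum(f * (x - mean)^2) / N)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
sigma_direct = sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n)

# Shortcut form: sigma = sqrt(sum(f d^2)/N - (sum(f d)/N)^2) * i, d = (x - A)/i
d = [(x - assumed_mean) / width for x in midpoints]
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))
sigma_shortcut = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width

print("sum fd =", sum_fd, " sum fd^2 =", sum_fd2)
print("sigma (direct)   =", round(sigma_direct, 3))
print("sigma (shortcut) =", round(sigma_shortcut, 3))
```

Both forms give a standard deviation of about 82.01, so the variance is about 6725. Note that the square root covers the whole difference of the two terms before multiplying by i.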


Measures of Variation and Skewness

 Coefficient of variation $= (\sigma / \bar{x}) \times 100$

 The lower the coefficient of variation, the more consistent
the data.
 <Exercise> A factory produces two types of electric lamps,
A and B. In an experiment relating to their life, the following
results were obtained:
Life (hours)    Lamp A (no. of lamps)    Lamp B (no. of lamps)
500-700         5                        4
700-900         11                       30
900-1100        26                       8
1100-1300       10                       8
1300-1500       8                        6

 Which lamp will you prefer?
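One way to approach the comparison is to compute the coefficient of variation of each lamp from its grouped data. The sketch below uses class mid points and the frequencies as printed in the table above; the function names are illustrative.

```python
from math import sqrt


def grouped_mean_sd(midpoints, freqs):
    """Mean and standard deviation of grouped data, using class mid points."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
    return mean, sqrt(variance)


def coefficient_of_variation(midpoints, freqs):
    """CV = (sigma / mean) * 100."""
    mean, sd = grouped_mean_sd(midpoints, freqs)
    return sd / mean * 100


# Mid points of the life classes 500-700, ..., 1300-1500 (hours).
midpoints = [600, 800, 1000, 1200, 1400]
lamp_a = [5, 11, 26, 10, 8]   # frequencies as printed in the table
lamp_b = [4, 30, 8, 8, 6]

cv_a = coefficient_of_variation(midpoints, lamp_a)
cv_b = coefficient_of_variation(midpoints, lamp_b)
print("Lamp A: mean, sd =", grouped_mean_sd(midpoints, lamp_a), "CV =", round(cv_a, 2))
print("Lamp B: mean, sd =", grouped_mean_sd(midpoints, lamp_b), "CV =", round(cv_b, 2))
print("More consistent lamp:", "A" if cv_a < cv_b else "B")
```

With these figures Lamp A has both the higher mean life and the lower coefficient of variation (roughly 21.6% against 24.3%), so it is the more consistent of the two.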


Measures of Variation and Skewness
 Skewness: lack of symmetry. Two distributions may have the
same mean and standard deviation but differ in the shape of
the distribution.
 Relative skewness S.K = (Mean - Mode)/S.D
 or S.K = 3(Mean - Median)/S.D
 If S.K = 0, the distribution is symmetrical
 If S.K is positive, the distribution is positively skewed
 If S.K is negative, the distribution is negatively skewed
 S.K = coefficient of skewness given by Karl Pearson.
Measures of Variation and Skewness
 For an open-ended distribution, use the coefficient of
skewness given by Bowley
 = (Q3 + Q1 - 2 Median)/(Q3 - Q1)
 Positively skewed distribution: Mean > Median > Mode
 Negatively skewed distribution (longer tail towards the lower
values): Mean < Median < Mode
 Symmetrical distribution: Mean = Median = Mode
1. Distribution of travelling allowance (T.A.) to salesmen. Compute the
coefficient of skewness and comment on its value:
T.A. (Rs)     No. of salesmen
1000-1200     14
1200-1400     16
1400-1600     20
1600-1800     18
1800-2000     15
2000-2200     7
2200-2400     6
2400-2600     4

2. Compute the coefficient of skewness and comment on its value.

Monthly wage     No. of persons
Below 6000       10
6000-7000        25
7000-8000        45
8000-9000        20
9000-10000       15
10000 & above    5
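Because exercise 2 has open-ended classes, Bowley's quartile-based coefficient is the natural choice. The sketch below is one possible working; the function name `quartile` and the use of `None` to mark an open limit are conventions chosen for this example.

```python
# Monthly wage classes as (lower limit, upper limit, no. of persons).
# None marks the missing limit of an open-ended class.
classes = [
    (None, 6000, 10),
    (6000, 7000, 25),
    (7000, 8000, 45),
    (8000, 9000, 20),
    (9000, 10000, 15),
    (10000, None, 5),
]


def quartile(classes, j):
    """Qj = L + ((j*N/4 - pcf) / f) * i, for j = 1, 2, 3."""
    n = sum(f for _, _, f in classes)
    target = j * n / 4
    pcf = 0                                   # cumulative frequency before the class
    for lower, upper, f in classes:
        if pcf + f >= target:
            if lower is None or upper is None:
                raise ValueError("quartile falls in an open-ended class")
            return lower + (target - pcf) / f * (upper - lower)
        pcf += f
    raise ValueError("j must be 1, 2 or 3")


q1, median, q3 = (quartile(classes, j) for j in (1, 2, 3))
bowley = (q3 + q1 - 2 * median) / (q3 - q1)

print("Q1 =", q1, " Median =", round(median, 2), " Q3 =", q3)
print("Bowley's coefficient of skewness =", round(bowley, 4))
```

The quartiles come out as Q1 = 6800, median of about 7555.56 and Q3 = 8500, giving a coefficient of about 0.11, which indicates a slight positive skew. The open-ended classes never enter the interpolation, which is why this measure remains usable here.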
