An Introduction to Statistics

Statistics:

• The subject concerned with scientific methods for collecting, summarising,
presenting, and analysing data, as well as drawing conclusions or making
predictions on the basis of such analysis.

Descriptive statistics:

• The branch of statistics that seeks only to describe and analyse data
is called descriptive statistics.

Inferential statistics:

• The branch of statistics dealing with drawing conclusions about the
population with the help of the analysis of a sample drawn from it is
known as inferential statistics.

Classification and tabulation:

• Classification is the first step in tabulation. Classification implies bringing
together the items which are similar in some respect(s).

• Example: students of a class may be grouped together with respect to
the marks they obtained in an examination, their age, their area of
specialisation, etc.

• After classification, tabulation is done to condense the data into a compact
form which can be easily comprehended.

Diagrammatic / Graphical presentation:

• There are several diagrams/graphs used for presentation of data:

   - Bar chart
   - Pareto chart
   - Pie chart
   - Histogram
   - Ogive
   - Line graph
   - Lorenz curve

(i) Bar chart:

• It comprises a series of bars of equal width, the base of each bar
being equal to the width of the class interval of the grouped data.
The bars stand on a common base line, and the height of each bar
is proportional to the frequency of the interval.

• The following data give the distribution of 215 MBA students at a
management institute according to educational qualifications.

Educational Qualification   No. of students
B.Tech                      55
B.Com                       70
B.Sc                        25
B.A                         45
C.A                         20

(a) Subdivided bar chart:

• A subdivided bar chart is a bar chart in which each bar is
divided into further components.

• Suppose that, in the above example, information about the type of city
from which the students graduated is also available, as given below.
Each bar can then be subdivided by city type.

Educational Qualification   Metro   Large   Medium   No. of students
B.Tech                      15      25      15       55
B.Com                       35      20      15       70
B.Sc                        10      10      5        25
B.A                         15      10      20       45
C.A                         10      5       5        20

(b) Percentage bar chart:

• A percentage bar chart is one in which each bar is divided into
components which are expressed as a percentage of the total bar.

Automaker       Average Sales    Average Net Profit   Percentage of profit to sales
                Estimates (i)    Estimates (ii)       (iii) = (ii)/(i) * 100
Tata Motors     6848.8           466                  7.2
Hero Honda      2196.5           224.2                10
Bajaj Auto      2444.7           345.4                14
TVS Motor       1032.9           35.1                 3.4
Bharat Forge    461.6            63.4                 14
Ashok Leyland   1635.8           94.7                 5.8
M&M             2365.5           200.6                8.5
Maruti Udyog    3426.5           315.7                9.2

(c) Multiple bar chart:

• A multiple bar chart is one in which two or more bars are
placed together for each entity.

• The bars are placed together to give a comparative assessment
of the values of some parameter over two periods of time, two
different locations, etc.

Pain Killer   2005   2006
Voveran       16.5   23.2
Calpol        13.2   18.2
Nise          15.2   18.6
Combiflam     9.4    14.1
Dolonex       6.8    10.3
Sumo          5.1    7.4
Volini        6.9    9.6
Moov          3.8    4.9
Nimulid       3.5    4.9

Another example…

Name                          Net worth in $ Billion   Net worth in $ Billion
                              March 06                 March 07
Lakshmi Mittal                20                       32
Mukesh Ambani                 7                        20.1
Anil Ambani                   5.5                      18.2
Azim Premji                   11                       17.1
Kushal Pal Singh              5                        10
Sunil Mittal & Family         4.9                      9.5
Kumar Mangalam Birla          4.4                      8
Shashi & Ravi Ruia            2.7                      8
Pallonji Mistry               3.3                      5.6
Adi Godrej & Family           2.3                      4.1
Shiv Nadar                    3                        4
Anil Agarwal                  2.1                      3.8
Dilip Shanghvi                2                        3.1
Tulsi Tanti                   2.4                      3.7
Malvinder & Shivinder Singh   2                        1.55
Venugopal Dhoot               1.6                      1.6
Naresh Goyal                  1.3                      1.9
Rahul Bajaj                   1.1                      1.5

(ii) Pareto chart:

• This specialised bar chart, named after the Italian economist
Vilfredo Pareto, arranges the groups or intervals of a variable
from largest to smallest frequency.

• It facilitates identification of the most frequent occurrences or
causes of an event or phenomenon. The data can be sorted by any
criterion, such as geographical region or type of organisation
(management institutes, banks, countries, cities, etc.).

Academic Background      Frequency
Commerce                 18
Economics                6
Engineering              17
Information Technology   7
Science                  8
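
The ordering step behind a Pareto chart can be sketched as follows. This is a minimal illustration using the table above; the function name `pareto_table` and the added cumulative-percentage column are assumptions for illustration, not part of the original notes.

```python
# Frequencies from the academic-background table above
frequency = {
    "Commerce": 18,
    "Economics": 6,
    "Engineering": 17,
    "Information Technology": 7,
    "Science": 8,
}

def pareto_table(freq):
    """Return (category, frequency, cumulative %) rows, largest frequency first."""
    total = sum(freq.values())
    rows, running = [], 0
    ordered = sorted(freq.items(), key=lambda item: item[1], reverse=True)
    for name, count in ordered:
        running += count
        rows.append((name, count, round(100 * running / total, 1)))
    return rows

for row in pareto_table(frequency):
    print(row)
```

The bars of the chart would be drawn in the order of the printed rows.
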

(iii) Pie chart:

• It is one of the most popular charts for presenting a whole divided into
its parts. It is a circular chart divided into sectors representing the
relative magnitude of the various components.

• A pie chart is obtained by dividing a circle into sectors such that
the sectors have areas, or central angles, proportional to the
different components given in the data.

Sources of Funds      Percentage of Total   Uses of Funds                         Percentage of Total
Excise                17                    Central Plan                          20
Customs               12                    Non-plan Assistance and Expenditure   23
Corporate Tax         21                    Defence                               12
Income Tax            13                    Interest Payments                     20
Service Tax           7                     States' Share                         18
Borrowings & Others   30                    Subsidies                             7
Total                 100                   Total                                 100

Sources of Funds      Percentage of Total   Size of Segment (Degrees)
Excise                17                    61.2
Customs               12                    43.2
Corporate Tax         21                    75.6
Income Tax            13                    46.8
Service Tax           7                     25.2
Borrowings & Others   30                    108
Total                 100                   360

Uses of Funds                         Percentage of Total   Size of Segment (Degrees)
Central Plan                          20                    72
Non-plan Assistance and Expenditure   23                    82.8
Defence                               12                    43.2
Interest Payments                     20                    72
States' Share                         18                    64.8
Subsidies                             7                     25.2
Total                                 100                   360
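
The segment sizes in the tables above follow from giving each component a share of the 360 degrees equal to its share of the total. A minimal sketch, where the function name `sector_angles` is an illustrative assumption:

```python
def sector_angles(percentages):
    """Central angle (in degrees) of each pie sector, proportional to its value."""
    total = sum(percentages.values())
    return {name: 360 * value / total for name, value in percentages.items()}

sources_of_funds = {
    "Excise": 17,
    "Customs": 12,
    "Corporate Tax": 21,
    "Income Tax": 13,
    "Service Tax": 7,
    "Borrowings & Others": 30,
}

for name, angle in sector_angles(sources_of_funds).items():
    print(f"{name}: {angle:.1f} degrees")
```
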

(iv) Histogram / Frequency polygon:

• A histogram comprises vertical rectangles whose base is
proportional to the class interval and whose height is proportional to
the frequency of the interval.

• The polygon formed by joining the midpoints of the tops of the
rectangles of the histogram is called the frequency polygon.

(v) Line graphs:

• A line graph is a visual presentation of a set of data values
joined by straight lines.

Bank                    Business Per Employee   Business Per Employee
                        2005-06                 2001-02
Allahabad Bank          336                     153
Andhra Bank             426.75                  195.96
Bank of Baroda          396                     222.76
Bank of India           381                     218.74
Bank of Maharashtra     306.18                  191.44
Canara Bank             441.57                  214.88
Central Bank of India   240.46                  148.77
Corporation Bank        527                     290.44
Dena Bank               364                     221
Indian Bank             295                     156
Indian Overseas Bank    354.73                  175.41

(vi) Lorenz curve:

• Indicates the extent of inequality in the distribution of a financial
parameter like income.

Descriptive statistics

• The branch of statistics that seeks only to describe and analyse data
is called descriptive statistics.

Measures of central tendency:

1) Arithmetic mean

2) Median

3) Mode

4) Geometric mean

5) Harmonic mean

Arithmetic mean:

An average is a single value within the range of the data that is used to
represent all of the values in the series.

“The arithmetic mean is the quotient of the sum of the given values and
the number of the given values”.

Arithmetic mean: Problems for Practice

1) Find the arithmetic mean of the marks obtained by 10 students of class X
in mathematics in a certain examination. The marks obtained are

25, 30, 21, 55, 47, 10, 15, 17, 45, 35

Ans=30.

2) Find the Arithmetic Mean from the following frequency table:

Marks:            52   58   60   65   68   70   75
No. of students:  7    5    4    6    3    3    2

Ans= 61.6
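
As a rough check of Problems 1 and 2, the sketch below computes the mean as the sum of (value × frequency) divided by the sum of the frequencies. The function name `frequency_mean` is illustrative and not part of the original notes; raw data are treated as values that each occur once.

```python
def frequency_mean(values, freqs=None):
    """Arithmetic mean of values, optionally weighted by their frequencies."""
    if freqs is None:
        freqs = [1] * len(values)  # raw data: each value occurs once
    total = sum(x * f for x, f in zip(values, freqs))
    return total / sum(freqs)

# Problem 1: marks of 10 students
print(frequency_mean([25, 30, 21, 55, 47, 10, 15, 17, 45, 35]))  # 30.0

# Problem 2: marks with the number of students scoring each mark
marks = [52, 58, 60, 65, 68, 70, 75]
students = [7, 5, 4, 6, 3, 3, 2]
print(frequency_mean(marks, students))  # 61.6
```
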

3) The following table gives the distribution of 100 accidents in New Delhi
during seven days of a week of a given month. During that month there
were 5 Mondays, 5 Tuesdays and 5 Wednesdays and only four each of
the other days. Calculate the average number of accidents per day.

Day:               Sunday   Monday   Tuesday   Wednesday   Thursday   Friday   Saturday
No. of accidents:  26       16       12        10          8          10       18

Ans= 14.13
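
The stated answer to Problem 3 can be reproduced as a weighted mean, under the assumption that each tabulated figure is weighted by the number of times that weekday occurred in the month. This reading of the problem is an assumption made here to match the given answer; the variable names are illustrative.

```python
accidents = {
    "Sunday": 26, "Monday": 16, "Tuesday": 12, "Wednesday": 10,
    "Thursday": 8, "Friday": 10, "Saturday": 18,
}
# Weights: how many times each weekday occurred in the month
occurrences = {
    "Sunday": 4, "Monday": 5, "Tuesday": 5, "Wednesday": 5,
    "Thursday": 4, "Friday": 4, "Saturday": 4,
}

weighted_total = sum(accidents[day] * occurrences[day] for day in accidents)
weighted_mean = weighted_total / sum(occurrences.values())
print(round(weighted_mean, 2))  # 14.13
```
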

4) The data on the number of patients attending a hospital in a month are given
below. Find the average number of patients attending the hospital in a
day.

Number of patients:   0-10   10-20   20-30   30-40   40-50   50-60
Number of days:       2      6       9       7       4       2

Ans=28.67
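
For grouped data such as Problem 4, each class is usually represented by its midpoint before the mean is taken. A minimal sketch under that assumption (the function name `grouped_mean` is illustrative):

```python
def grouped_mean(classes, freqs):
    """Mean of grouped data, representing each class by its midpoint."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    return sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)

patient_classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
days = [2, 6, 9, 7, 4, 2]
print(round(grouped_mean(patient_classes, days), 2))  # 28.67
```
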

5) Ten coins were tossed together and the number of tails resulting from them
was observed. The operation was performed 1050 times and the
frequencies thus obtained for different numbers of tails (x) are shown in the
following table. Calculate the arithmetic mean by the shortcut method.

X: 0 1 2 3 4 5 6 7 8 9 10
Y: 2 8 43 133 207 260 213 120 54 9 1

Ans=5.0114
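
The shortcut (assumed mean) method takes deviations d = x - A from a convenient value A and computes the mean as A + Σfd / Σf. The sketch below applies it to Problem 5 with A = 5; the choice of A and the function name `shortcut_mean` are assumptions for illustration.

```python
def shortcut_mean(values, freqs, assumed_mean):
    """Mean via deviations from an assumed mean A: A + sum(f * d) / sum(f)."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    return assumed_mean + sum_fd / n

tails = list(range(11))  # 0, 1, ..., 10 tails
freqs = [2, 8, 43, 133, 207, 260, 213, 120, 54, 9, 1]
print(round(shortcut_mean(tails, freqs, assumed_mean=5), 4))  # 5.0114
```

Any other assumed mean gives the same result; a value near the centre of the data simply keeps the deviations small.
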

6) For the following frequency table, find the mean.

Class:      100-120   120-140   140-160   160-180   180-200   200-220   220-240
Frequency:  10        8         4         4         3         1         2

Ans=145.625

7) In a study on patients, the following data were obtained. Find the
arithmetic mean.

Age (in years):  10-19   20-29   30-39   40-49   50-59   60-69   70-79   80-89
No. of cases:    1       0       1       10      17      38      9       3

Ans=60.7

8) Find the value of p for the following distribution, whose mean is 16.6.

F:  12   16   20   24   16   8    4
X:  8    12   15   p    20   25   30

Ans: p=18

9) The mean height of 25 male workers in a factory is 61 inches and the
mean height of 35 female workers in the same factory is 58 inches. Find
the combined mean height of the 60 workers in the factory.

Ans=59.25
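
The combined mean in Problem 9 weights each group mean by the group size. A minimal sketch, with the illustrative function name `combined_mean`:

```python
def combined_mean(groups):
    """Combined mean of several groups given as (size, mean) pairs."""
    total_size = sum(size for size, _ in groups)
    return sum(size * mean for size, mean in groups) / total_size

# 25 male workers with mean 61 inches, 35 female workers with mean 58 inches
print(combined_mean([(25, 61), (35, 58)]))  # 59.25
```
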

10) A firm of readymade garments makes both men’s and women’s
shirts. Its profit averages 6% of sales. Its profits on men’s shirts average
8% of sales, and women’s shirts comprise 60% of output. What is the
average profit per sales rupee on women’s shirts?

Ans= 4.67%

11) The average score of girls in the class X examination in a school is 67
and that of boys is 63. The average score for the whole class is 64.5. Find
the percentage of girls and boys in the class.

Ans: girls=37.5 and boys=62.5

12) There are 50 students in a class, of which 40 are boys and the rest girls.
The average weight of the class is 44 kg and the average weight of the
girls is 40 kg. Find the average weight of the boys.

Ans=45

13) The mean annual salary of all employees in a company is Rs.
25,000. The mean salaries of male and female employees are Rs. 27,000 and
Rs. 17,000 respectively. Find the percentage of males and females
employed by the company.

Ans: males=80 and females=20.

14) The mean marks of 100 students were found to be 40. Later on it
was discovered that a score of 53 was misread as 83. Find the correct mean
corresponding to the correct score.

Ans=39.7

15) The mean of 100 observations is found to be 40. At the time of
computation, two items were wrongly taken as 30 and 27 instead of 3 and 72.
Find the correct mean.

Ans=40.18
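
Problems 14 and 15 both rest on the same idea: recover the recorded total (mean × n), remove the wrongly recorded values, add the correct ones, and divide by n again. A minimal sketch, with the illustrative function name `corrected_mean`:

```python
def corrected_mean(recorded_mean, n, wrong, right):
    """Mean after replacing wrongly recorded values with the correct ones."""
    corrected_total = recorded_mean * n - sum(wrong) + sum(right)
    return corrected_total / n

print(corrected_mean(40, 100, wrong=[83], right=[53]))        # 39.7
print(corrected_mean(40, 100, wrong=[30, 27], right=[3, 72]))  # 40.18
```
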

Median

Problems for Practice

1) The numbers of runs scored by the 11 players of a school cricket team are

5, 19, 42, 11, 50, 30, 21, 0, 52, 36, 27. Find the median.

Ans=27 runs.

2) Find the median of the following items:

6 10 4 3 9 11 22 18

Ans=9.5

3) The following table represents the marks obtained by a batch of 12
students in certain class tests in Statistics and Physics. Find the median
marks in each subject.

Sr. no.:             1    2    3    4    5    6    7    8    9    10   11   12
Marks (Statistics):  53   54   32   30   60   46   28   25   48   72   33   65
Marks (Physics):     55   41   48   49   27   25   23   20   28   60   43   67

Ans: Statistics=47, Physics=42.

4) Calculate median for the following data:

No. of students:  6    4    16   7    8    2
Marks:            20   9    25   50   40   80

Ans= 25

5) Find the median of the following frequency distribution:

X: 5 7 9 12 14 17 19 21
Y: 6 5 3 6 5 3 2 4

Ans=12

6) The following table gives the weekly expenditure of 100 families. Find the
median weekly expenditure.

Weekly Expenditure:   0-10   10-20   20-30   30-40   40-50
Number of Families:   14     23      27      21      15

Ans=24.815
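
For grouped data the median is usually interpolated inside the median class as L + ((N/2 - cf) / f) × h, where L is the lower boundary of the median class, cf the cumulative frequency before it, f its frequency and h its width. The sketch below applies this to Problem 6, reading the fourth class as 30-40 (an assumption); the function name `grouped_median` is illustrative.

```python
def grouped_median(classes, freqs):
    """Interpolated median; classes are (lower, upper) boundaries in ascending order."""
    n = sum(freqs)
    cumulative = 0
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= n / 2:  # this is the median class
            return lower + (n / 2 - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty frequency table")

expenditure = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
families = [14, 23, 27, 21, 15]
print(round(grouped_median(expenditure, families), 3))  # 24.815
```

For classes with gaps between them, such as 410-419 and 420-429, the class boundaries (409.5-419.5, and so on) would be passed in rather than the stated limits.
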

7) Calculate the mean and median for the following data:

Height (in cm)   No. of boys   Height (in cm)   No. of boys
135-140          4             155-160          24
140-145          9             160-165          10
145-150          18            165-170          5
150-155          28            170-175          2

Ans: mean=153.45, median=153.39

8) Calculate the median from the following data.

Weight (gms)   No. of apples   Weight (gms)   No. of apples
410-419        14              450-459        45
420-429        20              460-469        18
430-439        42              470-479        7
440-449        54

Ans=443.94

9) Calculate the median:

Marks          No. of students   Marks          No. of students
Less than 5    29                Less than 30   644
Less than 10   224               Less than 35   650
Less than 15   465               Less than 40   653
Less than 20   582               Less than 45   655
Less than 25   634

Ans=12.15

Mode

Problems for practice

1) A shoe shop in Delhi sold 100 pairs of shoes of a particular brand on a
certain day, with the following distribution. Find the mode of the
distribution.

Size of shoes:  4    5    6    7    8    9    10
No. of pairs:   10   15   20   35   16   3    1

Ans=7

2) Find the mode for the following data:

Marks:            1-5   6-10   11-15   16-20   21-25
No. of students:  7     10     16      32      24

Ans=18.83 (using the class boundaries 15.5-20.5 for the modal class)

3) Calculate the median and mode for the following distribution:

Production per day (in tons):  21-22   23-24   25-26   27-28   29-30
No. of days:                   7       13      22      10      8

Ans: median=25.41, mode=25.36

4) Calculate AM, median and mode from the following frequency distribution.

Variable   Frequency   Variable   Frequency
10-13      8           25-28      54
13-16      15          28-31      36
16-19      27          31-34      18
19-22      51          34-37      9
22-25      75          37-40      7

(Mean=24.19, median=23.96, mode=23.6)
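
For grouped data the mode is commonly interpolated inside the modal class as L + ((f1 - f0) / (2·f1 - f0 - f2)) × h, where f1 is the frequency of the modal class and f0, f2 those of the classes before and after it. The sketch below applies this to Problem 4, assuming equal class widths and a single modal class; the function name `grouped_mode` is illustrative.

```python
def grouped_mode(classes, freqs):
    """Interpolated mode; assumes equal class widths and one modal class."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # index of modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                   # class before
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0      # class after
    lower, upper = classes[i]
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)

variable = [(10, 13), (13, 16), (16, 19), (19, 22), (22, 25),
            (25, 28), (28, 31), (31, 34), (34, 37), (37, 40)]
frequency = [8, 15, 27, 51, 75, 54, 36, 18, 9, 7]
print(round(grouped_mode(variable, frequency), 2))  # 23.6
```
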

Measures of dispersion

The degree to which numerical data tend to spread about an average value is
called the variation or dispersion of the data.

Significance of measuring variation:

• To determine the reliability of an average.

• To serve as a basis for the control of the variability.

• To compare two or more series with regard to their variability.

• To facilitate the use of other statistical measures.

Methods of studying variation:

• The range

• The quartile deviation

• The mean deviation

• The standard deviation.

Range

1) The following are the prices of shares of AB Co Ltd from Monday to
Saturday. Calculate the range and its coefficient.

Day         Price   Day        Price
Monday      200     Thursday   160
Tuesday     210     Friday     220
Wednesday   208     Saturday   250

Ans: range=90 and coefficient of range=0.22

2) Calculate the coefficient of range from the following:

Marks   No. of students   Marks   No. of students
10-20   8                 40-50   8
20-30   10                50-60   4
30-40   12

Ans=0.714

The quartile deviation

1) Find out the value of quartile deviation and its coefficient from the
following data:

Marks:            10   20   30   40   50   60
No. of students:  4    7    15   8    7    2

Ans: QD=10 and coeff=0.333
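
For a discrete frequency table like the one in Problem 1, one common convention takes Q1 as the ((N+1)/4)-th item and Q3 as the (3(N+1)/4)-th item of the ordered data, with QD = (Q3 - Q1)/2 and coefficient (Q3 - Q1)/(Q3 + Q1). The sketch below follows that convention, which is an assumption; other quartile conventions exist, and the helper names are illustrative.

```python
def item_at(sorted_values, position):
    """Value of the item at a 1-based position, interpolating between neighbours."""
    whole = int(position)
    fraction = position - whole
    value = sorted_values[whole - 1]
    if fraction and whole < len(sorted_values):
        value += fraction * (sorted_values[whole] - value)
    return value

def quartile_deviation(values, freqs):
    """Return (QD, coefficient of QD) for a discrete frequency table."""
    expanded = sorted(x for x, f in zip(values, freqs) for _ in range(f))
    n = len(expanded)
    q1 = item_at(expanded, (n + 1) / 4)
    q3 = item_at(expanded, 3 * (n + 1) / 4)
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)

qd, coeff = quartile_deviation([10, 20, 30, 40, 50, 60], [4, 7, 15, 8, 7, 2])
print(qd, round(coeff, 3))  # 10.0 0.333
```
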

2) Calculate the quartile deviation and its coefficient from the following
data:

Wages in Rs per week:  Less than 35   35-37   38-40   41-43   Over 43
No. of wage earners:   14             62      99      18      7

Ans: QD=1.75 and coeff=0.046 (using class boundaries 34.5-37.5, 37.5-40.5, etc.)

Mean deviation

1) Calculate the mean deviation and its coefficient of the two income
groups of five and seven members.

1st group:  4000   4200   4400   4600   4800
2nd group:  3000   4000   4200   4400   4600   4800   5800

Ans: 1st: MD=240, coeff=0.054 & 2nd: MD=571.43, coeff=0.130

2) Calculate the mean deviation:

X 10 11 12 13 14
F 3 12 18 12 3
Ans=0.75
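
The mean deviation in Problem 2 is the frequency-weighted average of the absolute deviations from the mean. A minimal sketch, taking deviations about the arithmetic mean (the function name `mean_deviation` is illustrative):

```python
def mean_deviation(values, freqs):
    """Mean absolute deviation about the arithmetic mean of a frequency table."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n

print(mean_deviation([10, 11, 12, 13, 14], [3, 12, 18, 12, 3]))  # 0.75
```
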

3) Calculate the mean deviation (about the median) and its coefficient.

Class   Frequency   Class   Frequency
0-10    5           40-50   20
10-20   8           50-60   14
20-30   12          60-70   12
30-40   15          70-80   6

Ans: MD=15.37 & coeff=0.357

Standard deviation

1) Blood serum cholesterol levels of 10 persons are as under:

240, 260, 290, 245, 255, 288, 272, 263, 277, 251.

Calculate the standard deviation.

2) The annual salaries of a group of employees are given in the following
table.

Salaries (in Rs 000):  45   50   55   60   65   70   75   80
Number of persons:     3    5    8    7    9    7    4    7

Calculate the SD of the salaries.

Ans=10.35
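
The answer to Problem 2 can be checked with the population form of the standard deviation, the square root of Σf(x - mean)² / Σf. Using the divisor Σf rather than Σf - 1 is an assumption that matches the stated answer; the function name `frequency_sd` is illustrative.

```python
from math import sqrt

def frequency_sd(values, freqs):
    """Population standard deviation of a frequency table (divisor N)."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return sqrt(variance)

salaries = [45, 50, 55, 60, 65, 70, 75, 80]  # in Rs 000
persons = [3, 5, 8, 7, 9, 7, 4, 7]
print(round(frequency_sd(salaries, persons), 2))  # 10.35
```
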

3) Calculate mean and SD of the following frequency distribution of marks:

Marks   No. of students   Marks   No. of students
0-10    5                 40-50   50
10-20   12                50-60   37
20-30   30                60-70   21
30-40   45

Ans: mean=40.9 & SD=14.839

Coefficient of variation

1) From the prices of shares of X and Y below, find out which is more stable in value:

X:  35    54    52    53    56    58    52    50    51    49
Y:  108   107   105   105   106   107   104   103   104   101

Ans: CV of X=11.6 & CV of Y=1.905
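
The coefficient of variation expresses the standard deviation as a percentage of the mean, CV = 100 × SD / mean, so series with different price levels can be compared. A minimal sketch using the population standard deviation from Python's `statistics` module (the function name `coefficient_of_variation` is illustrative):

```python
from statistics import mean, pstdev

def coefficient_of_variation(data):
    """CV in percent: population standard deviation relative to the mean."""
    return 100 * pstdev(data) / mean(data)

x = [35, 54, 52, 53, 56, 58, 52, 50, 51, 49]
y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]
print(round(coefficient_of_variation(x), 1))  # 11.6
print(round(coefficient_of_variation(y), 3))  # 1.905
```

The series with the smaller CV, here Y, varies less relative to its own mean.
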

2) Two brands of tyres are tested with the following results:

Life (in ‘000 miles)   No. of tyres
                       Brand X   Brand Y
20-25                  1         0
25-30                  22        24
30-35                  64        76
35-40                  10        0
40-45                  3         0

a) Which brand of tyres has the greater average life?

b) Compare the variability and state which brand of tyres you would use on
your fleet of trucks.

*********************************************************************************
Probability

1. A can solve 80% of the problems, while B can solve 90% of problems in a
Statistics book. A problem is selected at random. What is the probability
that at least one of them will solve it?

2. In a box, there are 2 white and 4 black balls. What is the probability that
both of the two balls drawn, one after the other, are white?

3. In families with two children, what is the probability that a family will have

i. One boy and one girl?

ii. Two girls?

iii. Two boys?

In the absence of any other information, it is assumed that the probability
of a child being a boy or a girl is ½.

4. A speaks the truth in 60% and B in 75% of the cases. In what percentage
of the cases are they likely to contradict each other in stating the same
fact?

5. An investment consultant predicts that the odds against the price of a
certain stock going up are 2:1, and the odds in favour of the price remaining
the same are 1:3. What is the probability that the stock will go down?

6. The probability that A can solve a problem in Statistics is ½, that B can
solve it is 1/3, and that C can solve it is 1/5. If all of them try
independently, find the probability that the problem will be solved.

7. A salesman is known to sell a product in 3 out of 5 attempts, while another
salesman sells it in 2 out of 3 attempts. Find the probability that

i. No sale will take place when they both try to sell the product.

ii. Either of them will succeed in selling the product.

8. An investment analyst presents the following table giving probabilities of
next year’s economic conditions being normal, good or very good in the
country, and probabilities of the movement increase or decline.

9. A class consists of 100 students; 25 of them are girls and 75 boys; 80 of
them are rich and 20 are poor; 40 of them have brown eyes and 60 have
black eyes. What is the probability of selecting a brown-eyed rich girl?

10. A candidate is selected for interviews for 3 posts. For the first post there
are 3 candidates, for the second there are 4, and for the third there are 2.
What is the probability that the candidate is selected for at
least one post?

11. Three machines producing 40%, 35% and 25% of the total output are
known to produce defective items in the proportions 0.04, 0.06 and
0.03, respectively. On a particular day, a unit of output is selected at
random and is found to be defective. What is the probability that it was
produced by the second machine?
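
Problem 11 is a standard application of Bayes' theorem: the probability that a defective unit came from a given machine is that machine's share of all defective output. The sketch below is one way to set it up, reading the first defective proportion as 0.04; the variable names are illustrative.

```python
output_share = [0.40, 0.35, 0.25]  # P(machine i)
defect_rate = [0.04, 0.06, 0.03]   # P(defective | machine i)

# Joint probability that a unit comes from machine i and is defective
joint = [share * rate for share, rate in zip(output_share, defect_rate)]
p_defective = sum(joint)           # total probability of a defective unit

# Bayes' theorem: P(machine i | defective)
posterior = [j / p_defective for j in joint]
print(round(p_defective, 4))   # 0.0445
print(round(posterior[1], 4))  # 0.4719 for the second machine
```
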

12.In a basin area where oil is likely to be found underneath the surface,
there are three locations with three different types of earth composition,
say C1, C2 and C3.
