
Business Analytics Introduction to Analytics and Data

Pristine

Pristine www.edupristine.com

2.c. Case: Summarizing Data


Romanov, an analytics consultant, works with Credit One Bank. His manager gave him data on the number of credit cards issued to a set of customers and the credit limits of those cards, and asked him to summarize the data in a presentable form and prepare a report. Romanov has just started his professional career and has never worked with this kind of data, so he is unfamiliar with the different summarizing techniques. Suppose he approaches you for help in preparing the report. Help Romanov summarize the data and prepare the report.


2.c. Comments: Summarizing Data


There are various ways to summarize data. Some of them are:
1. Frequency distribution
2. Grouped frequency distribution
3. Cumulative frequency distribution
4. Stem-leaf diagram
5. Line plots


2.c. Summarizing Data - Frequency distribution


A technique to summarize discrete data

A simple process which involves counting the occurrences of each distinct discrete value

The representation can be either tabular or graphical


Example: Number of credit cards owned in a sample of 3000 individuals

Tabular representation

Number of Credit Cards    # Customers
 1                        150
 2                        300
 3                        450
 4                        660
 5                        540
 6                        300
 7                        240
 8                        150
 9                        120
10                         90

Graphical representation - Bar Chart
Freq Distribution- #Cards vs. # Customers (vertical axis: # Customers; horizontal axis: # Cards)
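The counting step can be sketched in Python as follows. This is a minimal illustration, not the method used on the slides (which use MS Excel): the raw list of 3000 observations is rebuilt from the example's frequencies because the original raw data is not given, and the function name `frequency_distribution` is a hypothetical helper.

```python
from collections import Counter

# Frequencies from the example (number of cards -> number of customers).
table = {1: 150, 2: 300, 3: 450, 4: 660, 5: 540,
         6: 300, 7: 240, 8: 150, 9: 120, 10: 90}

# Rebuild one observation per customer; this stands in for the raw data,
# which is not part of the source (assumption for illustration only).
observations = [cards for cards, count in table.items() for _ in range(count)]

def frequency_distribution(values):
    """Count how often each distinct discrete value occurs, sorted by value."""
    return dict(sorted(Counter(values).items()))

freq = frequency_distribution(observations)

# Tabular representation with a text bar chart (one '#' per 30 customers).
for cards, count in freq.items():
    print(f"{cards:>2} {count:>5} {'#' * (count // 30)}")
```

Counting the rebuilt observations returns the same frequencies as the example table, and the counts add up to the sample size of 3000.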

2.c. Summarizing Data - Frequency distribution (Using MS Excel)


Number of Credit Cards: 3 2 4 5 1 7 9 10 6 8

4. Press Ctrl+Shift+Enter

Chart: # Customers (vertical axis) by number of credit cards, 1 to 10 (horizontal axis)

2.c. Summarizing Data - Grouped Frequency distribution


A technique to summarize continuous data, or discrete data having a large number of observations and an extended range

A simple process which involves counting of values falling under the different intervals (grouped)
Example and illustration 2.2: Number of customers falling under different Salary groups
Graphical representation - Bar Chart
Freq Distribution- Salary Band vs. # Customers (vertical axis: # Customers; horizontal axis: Salary Band)
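A grouped count can be sketched in Python as below. The salary values are made up for illustration, since the source does not list the underlying data. The band width of 25,000 and the first band of 0-75000 are assumptions inferred from the chart's axis labels, and `salary_band` and `grouped_frequency` are hypothetical helper names.

```python
from collections import Counter

WIDTH = 25_000        # assumed band width, inferred from labels such as 100001-125000
FIRST_UPPER = 75_000  # assumed upper limit of the first band (0-75000)

def salary_band(salary):
    """Return the (lower, upper) limits of the band that contains the salary."""
    if salary <= FIRST_UPPER:
        return (0, FIRST_UPPER)
    lower = ((salary - 1) // WIDTH) * WIDTH + 1
    return (lower, lower + WIDTH - 1)

def grouped_frequency(salaries):
    """Count the observations falling in each band, ordered by band."""
    counts = Counter(salary_band(s) for s in salaries)
    return {f"{lo}-{hi}": n for (lo, hi), n in sorted(counts.items())}

# Hypothetical salaries, chosen to include values on the band boundaries.
salaries = [42_000, 75_000, 75_001, 99_500, 100_000,
            100_001, 118_000, 125_000, 240_000]

grouped = grouped_frequency(salaries)
for band, count in grouped.items():
    print(f"{band:>15} {count}")
```

Bands with no observations are simply absent from the result. A chart that needs every band on the horizontal axis would have to fill those in with zero counts.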

2.c. Summarizing Data - Grouped Frequency distribution (Using MS Excel)


1. Press Ctrl+Shift+Enter
4. From Edit, select the salary bands as the horizontal axis
5. Observe the difference between the horizontal axes of the two charts

Chart: # Customers (vertical axis) by Salary Band (horizontal axis); the band labels run from 0-75000 to 950001-975000

2.c. Summarizing Data - Cumulative Frequency distribution


Cumulative frequencies are obtained by accumulating the frequencies to give the total number of observations up to and including the value or group in question.

Example and illustration 2.3: Cumulative number of customers by number of credit cards in the sample of 3000 individuals
Tabular representation

Number of Credit Cards (Up to)    Cumulative # Customers
 1                                 150
 2                                 450
 3                                 900
 4                                1560
 5                                2100
 6                                2400
 7                                2640
 8                                2790
 9                                2910
10                                3000

Graphical representation
Cumulative # Customers (vertical axis) vs. # Cards (horizontal axis)
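The accumulation can be sketched in Python with a running total over the frequencies from the credit card example. This is only an illustration of the arithmetic; the variable names are chosen for this sketch.

```python
from itertools import accumulate

# Frequencies from the credit card example (number of cards -> customers).
freq = {1: 150, 2: 300, 3: 450, 4: 660, 5: 540,
        6: 300, 7: 240, 8: 150, 9: 120, 10: 90}

# Running total of the frequencies, taken in increasing order of card count.
cards = sorted(freq)
cumulative = dict(zip(cards, accumulate(freq[c] for c in cards)))

for c in cards:
    print(f"up to {c:>2} cards: {cumulative[c]:>4} customers")
```

The final cumulative value equals the total number of observations, 3000, which is a quick check that no group was missed.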

2.c. Summarizing Data - Cumulative Frequency distribution (Using MS Excel)


3. Observe the last entry. It is equal to the total number of observations.

Chart: Cumulative # Customers (vertical axis) by number of cards (horizontal axis)

2.c. Summarizing Data - Stem-leaf diagram


Stem-leaf diagram
Not suitable for large data. Hence, not extensively used in industry.
Illustration: Given the ages of 20 individuals in years, represent them using a stem-leaf diagram.

Sl #    Age    Age (Sorted)
 1      23     21
 2      33     23
 3      23     24
 4      33     27
 5      34     30
 6      21     31
 7      54     33
 8      52     34
 9      34     35
10      36     36
11      52     39
12      51     40
13      48     43
14      35     48
15      40     49
16      43     51
17      49     52
18      54     53
19      27     54
20      39     57

Stem    Leaf
20      1 3 4 7
30      0 1 3 4 5 6 9
40      0 3 8 9
50      1 2 3 4 7
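The stem-leaf grouping can be sketched in Python as below. The sketch uses the values listed under "Age (Sorted)" in the illustration and writes the stems as 20, 30, 40, 50 to match it. The function name `stem_leaf` is a hypothetical helper.

```python
from collections import defaultdict

# Values from the "Age (Sorted)" column of the illustration.
ages = [21, 23, 24, 27, 30, 31, 33, 34, 35, 36,
        39, 40, 43, 48, 49, 51, 52, 53, 54, 57]

def stem_leaf(values):
    """Group two-digit values by their tens (stem) and list the units (leaves)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10 * 10].append(v % 10)
    return dict(groups)

diagram = stem_leaf(ages)
for stem, leaves in diagram.items():
    print(f"{stem:>3} | {' '.join(map(str, leaves))}")
```

Each observation contributes exactly one leaf, so the number of leaves across all stems should equal the number of observations (20 here).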


2.c. Summarizing Data - Line Plots


Line plot diagram
Not suitable for large data. Hence, not extensively used in industry.
Illustration: Given the test scores of 20 students, represent them using a line plot diagram.
Sl #    Score    Score (Sorted)
 1      50       20
 2      20       20
 3      50       20
 4      50       30
 5      50       30
 6      30       30
 7      30       30
 8      40       30
 9      30       40
10      40       40
11      30       40
12      20       40
13      50       40
14      40       50
15      20       50
16      30       50
17      40       50
18      40       50
19      50       50
20      50       50
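A text version of the line plot can be sketched in Python as below. It assumes that "line plot" here means a number line with one mark stacked above a value for each observation. The function name `line_plot` is a hypothetical helper.

```python
from collections import Counter

# Test scores from the illustration, in their original order.
scores = [50, 20, 50, 50, 50, 30, 30, 40, 30, 40,
          30, 20, 50, 40, 20, 30, 40, 40, 50, 50]

def line_plot(values):
    """Return a text line plot: one 'x' stacked above each value per observation."""
    counts = Counter(values)
    ticks = sorted(counts)
    rows = []
    # Build the plot from the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        cells = []
        for t in ticks:
            mark = "x" if counts[t] >= level else ""
            cells.append(f"{mark:>4}")
        rows.append("".join(cells).rstrip())
    # Axis row with the distinct values.
    rows.append("".join(f"{t:>4}" for t in ticks))
    return "\n".join(rows)

counts = Counter(scores)
plot = line_plot(scores)
print(plot)
```

The height of each stack is the frequency of that score, so the plot carries the same information as a frequency distribution of the 20 scores.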

Thank you!

Pristine 702, Raaj Chambers, Old Nagardas Road, Andheri (E), Mumbai-400 069. INDIA

www.edupristine.com
Ph. +91 22 3215 6191