STEP 1: Examine the data to decide how many class intervals you need and what the
class boundaries should be. (In an assignment you may be told what class boundaries
to use and so you can skip this step.)
To illustrate the process, the Excel file House Sales in Kaleen 2003 will be used. A
portion of the file is shown below; the data were obtained from www.allhomes.com.au.
The histogram is to be drawn on the variable Price. The file is conveniently ordered
from smallest to largest and so it can easily be seen that there are 134 values (there are
136 lines of data but the first two are headings) with the minimum value $142,500 and
the maximum value $650,000. If your file is not ordered see page 7 for instructions on
what to do.
How many class intervals should there be?
A rough estimate is the square root of the number of observations. The square root of 134 is
approximately 11.6, and so somewhere around 10-12 class intervals would be suitable.
(Remember that a histogram is trying to capture the shape of the distribution. With too few
classes the shape is lost; with too many classes the shape is obscured by the random
fluctuations of the data.) Look at the Excel worksheet Different Numbers of Class
Intervals in the Excel file House Sales in Kaleen 2003 for examples of histograms
with different numbers of class intervals.
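The square-root rule of thumb can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `suggested_class_count` is made up for illustration, and the rule gives a starting point rather than a fixed answer.

```python
import math

def suggested_class_count(n_observations):
    """Rough number of class intervals: the square root of the sample size."""
    return math.sqrt(n_observations)

# 134 house prices, as in the Kaleen example
estimate = suggested_class_count(134)
print(round(estimate, 1))  # 11.6
```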
What should the class boundaries be?
Where possible, the classes should be of equal width and the class boundaries should be natural
values such as round numbers. If we choose the stated lower boundary to be $100,000 and the
class width to be $50,000, then there are 12 classes. This seems a fairly natural choice, but others are possible.
Bin value
99,999.5
149,999.5
199,999.5
249,999.5
299,999.5
349,999.5
399,999.5
449,999.5
499,999.5
549,999.5
599,999.5
649,999.5
699,999.5
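The bin values listed above follow a simple pattern: each is a stated class boundary less 0.5, spaced one class width apart. A minimal Python sketch, assuming the lower boundary of $100,000, the class width of $50,000 and the 12 classes chosen earlier (the function name `bin_values` is hypothetical):

```python
def bin_values(lower_boundary, class_width, n_classes):
    """Return the n_classes + 1 bin values, each 0.5 below a stated class boundary."""
    return [lower_boundary - 0.5 + i * class_width for i in range(n_classes + 1)]

bins = bin_values(100_000, 50_000, 12)
for b in bins:
    print(f"{b:,.1f}")  # 99,999.5, then 149,999.5, ... up to 699,999.5
```

Changing the class width or the lower boundary and re-running gives the bin column for a different choice of classes.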
Having clicked OK, the following dialogue box should appear. I have filled it in for
the house price example using the worksheet labeled Histogram. The Input Range and
Bin Range can be filled in by using the cursor to highlight the required cells on the
Excel spreadsheet. Notice that I have ticked the box marked Labels because I have
included the labels in the data items. I prefer to place the histogram on the same
worksheet as the data and so I have selected Output Range and specified where I want
the output to go. Some people prefer to have the histogram on a new worksheet; this
is the default. Don't forget to tick Chart Output, otherwise Excel will not draw the
histogram.
Once you have filled in the dialogue box and clicked OK you should get output
similar to that shown over the page.
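To see what the frequency table in that output represents, the counting can be sketched in Python. The sketch assumes that a value is counted in the first bin whose value it does not exceed, and that anything above the last bin falls into a final "More" row. The prices are made up for illustration and are not taken from the Kaleen file; the function name `bin_frequencies` is likewise hypothetical.

```python
from bisect import bisect_left

def bin_frequencies(values, bins):
    """Count values per bin: a value goes in the first bin it does not exceed.

    The extra final count holds values above the last bin ("More").
    """
    counts = [0] * (len(bins) + 1)
    for v in values:
        counts[bisect_left(bins, v)] += 1
    return counts

# Made-up prices for illustration only (not the Kaleen data)
prices = [142_500, 180_000, 199_000, 215_000, 230_000, 310_000, 650_000]
bins = [99_999.5 + i * 50_000 for i in range(13)]

for upper, count in zip(bins + ["More"], bin_frequencies(prices, bins)):
    print(upper, count)
```

Because the bin values end in .5 and the prices are whole dollars, no price can fall exactly on a bin value, so each price belongs to exactly one class.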
The histogram produced unfortunately leaves a lot to be desired and requires quite a
bit of editing before it is acceptable. This is detailed in Step 4.
Remove the box on the right-hand side that says Frequency.
Click on the box and then press the Delete key.
Click OK.
Include a source.
The primary source of the data should be shown in the bottom left-hand corner of
the histogram. Click in the histogram box to make the black squares appear. Type
what you want for the source. Nothing will appear until you press Enter, at which point a text
box containing what you typed appears in the centre of your histogram. Use the
cursor to move it to the bottom left-hand corner. Change the font size to 8 point.
You may be asked to put your ID number in the top right-hand corner of the
histogram. This can also be done using a text box, which should be placed inside
Excel's chart area.
For the dialogue box in the middle, select the variable you want sorted. When I
selected the data I included the row with the headings. (Notice that Excel has
shown that the header row is selected.) If you don't select the header row, the
variable names will not appear, only the Excel column labels. Click OK and the data will
be sorted.
OR
Use the descriptive statistics tool to find the number of observations and the
minimum and maximum values.
Select Tools > Data Analysis > Descriptive Statistics and the following dialogue
box should appear. Highlight the column that you want descriptive statistics for
and the input range will be filled in. Tick Summary statistics and decide where
you want the output to go. The descriptive statistics are in a worksheet of the
same name in the Excel file House Sales in Kaleen 2003.
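Either approach yields the three numbers needed for Step 1: the number of observations, the minimum and the maximum. As a rough Python sketch, with made-up rows standing in for the worksheet (the Address column and all the values are assumptions for illustration; none come from the Kaleen file):

```python
# Header row followed by made-up data rows (illustration only)
rows = [
    ["Address", "Price"],
    ["B Street", 310_000],
    ["A Street", 142_500],
    ["C Street", 215_000],
]

header, data = rows[0], rows[1:]
price_col = header.index("Price")

# Option 1: sort by Price, keeping the header row on top
sorted_rows = [header] + sorted(data, key=lambda r: r[price_col])

# Option 2: compute the count, minimum and maximum directly
prices = [r[price_col] for r in data]
summary = {"count": len(prices), "min": min(prices), "max": max(prices)}
print(summary)
```

Keeping the header row out of the sort mirrors selecting the header row in Excel, so the variable names stay attached to their columns.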