
DE Biology 101 Lab Investigation:

Working With Data


Objectives

- Recognize questions that can, and can't, be addressed with science.
- Become familiar with the main stages of the Scientific Method.
- Become familiar with the Metric System of Measurement.
- Develop a sampling protocol and use it to collect raw data.
- Understand the purpose of Descriptive Statistics, and learn how to calculate three basic
kinds: the Sample Size, a Measure of Central Tendency, and a Measure of Variability.
- Become familiar with Graphics Software, and learn how to summarize Descriptive
Statistics with a Figure.
- Use a Figure to determine whether groups are meaningfully different from one another.

Exercise 1: Science Requires Physical Evidence


Science is a LOGICAL PROCESS for investigating reality. You use science every day.
However, much of the process is unconscious. Scientists have been trained to become
aware of the scientific process that goes on in their minds. One function of this course is
to help you become more aware of the science that you use every day.

Science uses PHYSICAL EVIDENCE to answer questions about reality. If you can't
gather physical evidence, you can't do science. For example, questions of opinion can't
be examined with science. They rely on INTERNAL convictions, not EXTERNAL
evidence. You can recognize opinions by certain key words, such as "good", "bad", "right",
"wrong", "better", and "worse". Questions that contain these words can't be examined with science.
For example, is the color blue BETTER than the color red?

Some questions can't be examined with science, because we don't know what physical
evidence to look for. For example, have aliens visited Earth? Since we have no idea what
aliens might look like, how they would act, or what kinds of technology they might have,
there's no way to know what sorts of evidence they would leave behind.

Questions of faith also fall into this category. For example, is there a divine force who
answers our prayers? What would the physical evidence look like? Certainly, some
prayers seem to be answered, because they happen. But, there is no physical evidence
that can tell us whether a DIVINE FORCE answered them, or whether they happened for
some OTHER reason. And, what about prayers that don't come true? Faith can handle
such questions by asserting that a divine force works in mysterious ways. Science
requires objective, physical evidence.

It's also important to understand the perspective of science when dealing with physical
evidence that has NO current explanation. Physical evidence that has no explanation
doesn't support ANY explanation of reality. It's simply unexplained, and will remain in
limbo until an explanation is found.

For example, if you saw strange lights in the sky that didn't match any aircraft, you might
be tempted to say "space aliens!" A scientist would say "I don't know what it is yet!"
Similarly, if a patient near death from cancer recovers completely, many people might
decide this is a miracle produced by a divine force. A scientist would conclude, "I don't
know why the person recovered yet."

Finally, it is important to understand that science is NOT attempting to prove there is no
divine force. Some scientists try this, but it's a misuse of the scientific process. Science is
simply attempting to understand how reality functions, based on physical evidence.

Science and religion can clash, and this usually happens when physical evidence
suggests ONE interpretation of reality, while literal interpretations of religious texts
suggest a DIFFERENT interpretation. In such cases, scientists generally feel that strong
physical evidence provides a more accurate description of reality than literal
interpretations of religious texts. What decision should YOU make in such cases? That's
up to you.

Practice Questions:
Examine the following questions, and determine whether science can be used to
investigate them or not. You aren't being asked whether or not the questions ARE TRUE,
only whether science can be used to investigate them. Check your answers below.

1) Can dogs see the color red?
2) Did space aliens help the Egyptians build the pyramids?
3) Does DNA play a role in intelligence?
4) Was Nevada once covered by oceans?
5) Is country music worse than hip-hop music?

Answers to Practice Questions:
1) Yes
2) No: what sort of evidence would we look for?
3) Yes, so long as the concept of "intelligence" is rigorously defined
4) Yes
5) No: Whether something is "worse" is an opinion

Workbook Questions:
Examine the following questions, and determine whether science can be used to
investigate them or not. You aren't being asked whether or not the questions ARE TRUE,
only whether science can be used to investigate them. Answer YES or NO in your
Workbook. You don't need to explain your answers.

1) Does life exist on Mars?
2) Is Classical music better than Rock?
3) Do angels exist?
4) Is affirmative action a good idea?
5) Does affirmative action result in a greater percentage of minority graduates?

Exercise 2: Stages of the Scientific Method


There is no single scientific process. However, there are some basic actions that are
usually conducted during scientific explorations. These actions are generally summarized
in a series of broad stages called the SCIENTIFIC METHOD. Throughout this semester,
you will investigate these stages in more detail. For now, how about an introduction?

Table 1. Stages of the scientific method

Stage            Actions Conducted

1) Observe       Observe an interesting aspect of reality. Raw data may be collected as
                 part of your observations.

2) Hypothesize   Develop one or more hypotheses (potential explanations) to explain your
                 observations.

3) Test          Develop tests that evaluate your hypotheses to determine which, if any,
                 of them seem to explain your observations. During testing, new physical
                 evidence is collected and analyzed.

4) Conclude      There are two possible conclusions for a hypothesis test:

                 (1) If the physical evidence supports a particular hypothesis, then
                     conclude that the hypothesis is the best explanation for your
                     observations of reality, at least for now.

                 (2) If the physical evidence does NOT support a particular hypothesis,
                     then conclude that the hypothesis does NOT explain your
                     observations of reality.

For example, suppose you wake up one morning and smell something terrible. You think
that perhaps you forgot to take the trash out the night before. You check the trash bin,
and sure enough, it's still full. You take out the trash, and the smell goes away. Problem
solved! Though you probably aren't aware of it, you have just conducted the scientific
method. Let's take a closer look at what you just did.

Practice Questions:
The trash example has been broken into four discrete stages below. Each one represents
part of the scientific method. Label each stage with its appropriate name (Observe,
Hypothesize, Test, or Conclude). Note that the stages aren't in order. That would be too
easy. Check your answers below.

1) You discover the full trash bin, and get rid of it.
2) Perhaps you forgot to take out the trash.
3) Because the smell went away, this probably means that the trash was the problem, as
you thought.
4) You smell something terrible.

Answers to Practice Questions:
1) Test
2) Hypothesize
3) Conclude
4) Observe

Workbook Questions:
Suppose you need a flashlight. You find one, but when you turn it on, it doesn't work. You
think that the batteries might be dead. You put new batteries in, and the flashlight works.

This example has been broken into four discrete stages below. Each one represents part
of the scientific method. Label each stage with its appropriate name (Observe,
Hypothesize, Test, or Conclude). Record your answers in your Workbook. Note that the
stages aren't in order. That would be too easy.

1) Because the flashlight turned on with new batteries in it, this probably means that the
original batteries were, in fact, dead, as you thought.
2) You switch on the flashlight, and discover it doesn't work.
3) You insert new batteries, and switch it on. The flashlight works.
4) Perhaps the batteries are dead.

Exercise 3: The Metric System: Base Units


Imagine you are planning a cookout. You know the guests include 5, 7, and 4. The
cookout will begin on 2453898.125. At the grocery store, you buy 9.07 hamburger, and
pay 5000.

Confused? The numbers don't make sense, do they? The numbers are confusing
because there are no UNITS OF MEASUREMENT. Numerical data don't make any sense
unless they have units. Read the paragraph again, and see if it makes more sense with
the units of measurement added:

Imagine that you are planning a cookout. You know the guests include 5 married couples,
7 single friends, and 4 families with kids. The cookout will begin on the Julian date of
2453898.125. At the grocery store, you buy 9.07 kilograms of hamburger, and pay 5000
pennies for it. Now this makes more sense. (If you've never heard of the Julian date, you
can find out more at http://aa.usno.navy.mil/data/docs/JulianDate.html).

To understand numerical data, you always need to know the units of measurement. The
METRIC SYSTEM is a group of units that is the standard one used by scientists around
the world. The metric system is the only one we'll use in this course, so let's become
familiar with it.

The metric system has a BASE UNIT for each different KIND of numerical data, such as
length, volume, or duration. The common base units are shown in Table 2. Note that each
base unit has an abbreviation that is NEVER capitalized.

Table 2. Common metric system base units

Base Unit   Abbreviation   Kind of Data Measured    Example
gram        g              Mass (essentially the    One gram is about the weight of a
                           same as Weight)          paperclip
meter       m              Length                   One meter is a little bit longer than a
                                                    yard
liter       l              Volume                   Soft drinks are often sold in two-liter
                                                    bottles
second      s              Duration                 Olympic ski events are measured in
                                                    seconds
byte        b              Computer Storage         One byte corresponds to one letter or
                           Capacity                 number stored in a computer

Practice Questions:
Use Table 2 to answer the following questions. Check your answers below.

1) If I had a liter of cornflakes, am I measuring its volume or its mass?
2) If I had a gram of cornflakes, am I measuring its volume or its mass?
3) You see a measurement of 320 s. What kind of data is this?

Answers to Practice Questions:
1) volume
2) mass
3) duration

Workbook Questions:
Use Table 2 to answer the following questions. Record your answers in your Workbook.

1) What kind of data does a gram measure: length, mass, or duration?
2) Which makes more sense: a liter of water, or a meter of water?
3) If you wanted to measure computer storage capacity, would you use seconds, liters, or
bytes?

Exercise 4: The Metric System: Prefixes
To indicate more, or less, than the base unit, special prefixes are added to it. These
prefixes always indicate some FACTOR OF TEN. The common prefixes are shown in
Table 3. Note that SOME are capitalized, while others are not.

Table 3. Common metric prefixes and their associated numerical values. The prefix
for micro is the letter mu (µ) from the Greek alphabet.

Prefix   Prefix Name                    Numerical Value Associated With Prefix
G        giga                           one billion of the base unit       1,000,000,000
M        mega                           one million of the base unit       1,000,000
k        kilo                           one thousand of the base unit      1,000
h        hecto                          one hundred of the base unit       100
da       deka                           ten of the base unit               10
(none)   Base Unit without a prefix     one of the base unit               1
         (gram, meter, liter, second,
         or byte)
d        deci                           one tenth of the base unit         0.1
c        centi                          one hundredth of the base unit     0.01
m        milli                          one thousandth of the base unit    0.001
µ        micro                          one millionth of the base unit     0.000001
n        nano                           one billionth of the base unit     0.000000001
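
The factor-of-ten idea in Table 3 can also be sketched in a few lines of Python. This is only an illustration: the names PREFIX_EXPONENTS and to_base_units are made up for this sketch, and the prefixes are looked up by name rather than by abbreviation.

```python
# Power of ten associated with each prefix name in Table 3.
# (PREFIX_EXPONENTS and to_base_units are hypothetical names for this sketch.)
PREFIX_EXPONENTS = {
    "giga": 9, "mega": 6, "kilo": 3, "hecto": 2, "deka": 1,
    "": 0,  # base unit with no prefix
    "deci": -1, "centi": -2, "milli": -3, "micro": -6, "nano": -9,
}

def to_base_units(value, prefix_name):
    """Convert a prefixed quantity into a number of base units."""
    return value * 10 ** PREFIX_EXPONENTS[prefix_name]

# Three kilometers expressed in meters.
print(to_base_units(3, "kilo"))

# A prefix above the base unit always gives more base units than one below it.
print(to_base_units(7, "hecto") > to_base_units(7, "deci"))
```

Because every prefix is just a power of ten, changing from one prefix to another only ever moves the decimal point.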

Practice Questions:
Use Tables 2 and 3 to answer the following questions. Check your answers below.

1) How should you abbreviate two megagrams?
2) How many seconds are in one hectosecond?
3) What is the metric abbreviation for five thousand meters?
4) Which is a smaller volume: 200 hl or 200 dl?

Answers to Practice Questions:
1) 2 Mg
2) one hundred
3) 5 km
4) 200 dl

Workbook Questions:
Use Tables 2 and 3 to answer the following questions. Record your answers in your
Workbook.

1) How should you abbreviate two milligrams: 2 Mg or 2 mg?
2) What is the metric abbreviation for five one-hundredths of a meter: 5 cm or 5 mm?
3) One computer file is 3 kb in size, the other is 3 Gb. Which is larger?
4) Which would be worth more money: 1 g of gold, or 1 dag of gold?
5) Does nanotechnology deal with items that are very large, or very small?

Exercise 5: Rounding
Hopefully, you remember from grade school that each position in a number has a
particular "place name". For example, in the number 123.45, the number 1 is in the
hundreds place, 2 is in the tens place, 4 is in the tenths place, and 5 is in the hundredths
place. The number 3 is in the ones place (the first whole-number place).

Often, numbers need to be ROUNDED to specific place values. For example, suppose
you weigh an item on a scale, and the scale reports that it weighs 52.041 grams. Should
you record the weight as 52.041, or should you ignore some of the numbers, and record
the weight as just 52, or 52.04? Or, suppose you use a calculator to determine that 5
divided by 3 is 1.666666, repeating to infinity. Should you record the answer as
1.66666666, or should you ignore some of the repeating sixes? If so, how many sixes
should you ignore?

Ignoring certain numbers in measurements and calculations is called ROUNDING.
Determining what numbers to ignore can be rather complicated. In the labs for this
course, you will generally be TOLD what place value to round your answers to. If you
aren't told, then always round fractional numbers to TENTHS. If numbers aren't fractional,
no rounding is necessary. Fractional numbers are those that have something other than
zero in the tenths, hundredths, thousandths, etc., places.

In grade school, you should have learned one important RULE OF ROUNDING:

- Look at the number to the right of the number you are rounding to. If that number is 5 or
greater, round your number UP one. If the number to the right is 4 or less, do not change
your number.
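
If you check your rounding with software, be aware that some tools handle an exact 5 differently. Python's built-in round(), for example, rounds an exact half to the nearest even digit instead of always rounding up. The sketch below uses Python's decimal module to follow the rule stated above; the function name round_half_up is made up for this illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, place):
    """Round `value` to `place` using the "5 or greater rounds up" rule.

    Both arguments are strings: `value` is the number to round, and
    `place` is "1" for whole numbers, "0.1" for tenths, "0.01" for
    hundredths, and so on.
    """
    return Decimal(value).quantize(Decimal(place), rounding=ROUND_HALF_UP)

print(round_half_up("52.041", "0.1"))    # the scale reading from the text, to tenths
print(round_half_up("1.666666", "0.1"))  # 5 divided by 3, to tenths
print(round_half_up("2.25", "0.1"))      # an exact 5 to the right rounds UP
```

The numbers are passed as text so that they are stored exactly as written, which avoids surprises from the way computers store decimal fractions.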

Practice Questions:
Answer the following questions. Check your answers below.

1) Round 34.123 to tenths.
2) Round 34.153 to tenths.
3) Round 34.153 to the nearest whole number.
4) Round 34.153 to tens.
5) Round 5.99 to tenths.
6) In one of the labs for this course, you make a calculation, and the result is 86.754. The
instructions don't indicate what number to round to. How should you report your answer?
7) In one of the labs for this course, you make a calculation, and the result is 205. The
instructions don't indicate what number to round to. How should you report your answer?

Answers to Practice Questions:
1) 34.1
2) 34.2
3) 34
4) 30
5) 6.0
6) 86.8. As stated earlier in this Exercise, if your instructions don't indicate what to round
to, you should round fractional numbers to TENTHS.
7) 205. As stated earlier in this Exercise, if your instructions don't indicate what to round
to, you don't round non-fractional numbers.

Workbook Questions:
Answer the following questions. Record your answers in your Workbook.

1) Round 65.152 to tenths.
2) Round 65.152 to tens.
3) Round 65.152 to hundreds.
4) Round 65.152 to hundredths.
5) In one of the labs for this course, you make a calculation, and the result is 1.98. The
instructions don't indicate what number to round to. How should you report your answer?

Exercise 6: Sampling Protocols


Science is a transparent process. This means that scientists create detailed lists that
explain EXACTLY how they take their measurements, and these lists are available for
others to see.

A detailed list that explains exactly how measurements are taken is called a SAMPLING
PROTOCOL. A sampling protocol is essentially a recipe, and serves the same purpose. It
enables other scientists to duplicate someone's work exactly, without any confusion or
uncertainty.

Why is this important? Imagine a scientist claims to find a link between diabetes and
saturated fat intake. In order to be sure that the link is real, many other scientists will need
to duplicate the study. But, if they don't know exactly how the study was conducted, how
can they repeat it? Further, if a scientist isn't precise and detailed when they write out
their sampling protocol, how can anyone be certain that they didn't make any mistakes?

Sampling protocols enable scientists to check their work, and the work of others, for
mistakes. Sampling protocols also enable scientists to duplicate the work of others, to see
if the results are accurate and consistent.

How detailed does a sampling protocol have to be? More detail is always better than less.

For example, suppose you wanted to measure the height of you and several friends.
Simple, right? You would just... well... DO IT, wouldn't you? Not so fast! If you want to do
it scientifically, it's more complicated. Would you use a tape measure, or meter stick?
How should the person stand while being measured, and how can you describe it in
words, so that someone else could do it EXACTLY the same way? What about shoes,
hats, and big hair? How would you find the exact top of the person's head: with a flat
book, or something else? How would you hold a book on their head? How would you
mark off the exact height, and what should you round the number to? What if some of
your friends can't stand?

These are just a few of the questions you would have to consider, and describe in detail,
in your sampling protocol for measuring heights. Table 5 lists essential components for
every sampling protocol.

Table 5: Essential components of a sampling protocol.

1) List all EQUIPMENT that will be used, with enough DETAIL so that someone else could use
the same equipment, with no guessing.

2) Indicate how ACCURATE your measurements will be
(for example, if taking length measurements, will you measure to the nearest meter, cm, or
mm?)

3) Indicate how you will handle ROUNDING numbers
(the standard method is to round down for numbers below half of a whole number, and to
round up for numbers half or more. For example, 1.3 would round to 1.0, while 1.5 would
round to 2.0)

4) Write out a series of DETAILED STEPS, indicating every aspect of the sampling process.
Provide enough detail so that a stranger, with no knowledge of your study at all, could read
your protocol and recreate it EXACTLY, with no guessing.

For most of your labs, the sampling protocols will be clearly defined and described for
you. But not yet. Here, you need to develop, describe, and follow your own sampling
protocol for measuring hair lengths. The data you gather will be necessary for the rest of
this lab.

Measuring hair lengths isn't particularly exciting, but it's an easy way to become familiar
with sampling protocols, metric measurements, and methods of data analysis. You will
need these skills for the rest of this course.

Procedure:
1) You will be measuring the lengths of hairs to the NEAREST MILLIMETER, and
comparing these lengths, from two different sampling locations. Choose one of the
following comparisons to make. You'll probably want to choose one for which it's easy to
gather samples.

- arm hair versus leg hair
- chest hair versus leg hair
- eyebrow hair versus arm hair
- hair (fur) growing from the back of a pet (dog, cat, horse, etc.) versus hair growing from
the belly

2) For each of the two sampling locations that you'll be comparing, you will need to
measure the lengths of TEN different hairs to the nearest mm. So, if you're comparing the
length of eyebrow hair to arm hair, you'll need to measure the lengths of ten eyebrow
hairs, and ten arm hairs.

To make your measurements, you MUST use a metric ruler, and measure to the
NEAREST MILLIMETER. If you don't own a metric ruler, you need to borrow or buy one.
Take a close look at your metric ruler. The numbered lines should be a little less than
one-half inch apart. These are centimeters.

Each centimeter should contain ten smaller, unlabeled lines, with the fifth line being
slightly longer than the others. These unlabeled lines, each about the thickness of a dime,
are millimeters.

You need to be familiar with centimeters and millimeters and you must use a metric ruler.
You CAN'T use inches. Figure 1 compares the relative size of an inch to centimeters and
millimeters.

Figure 1. Centimeters and millimeters compared to an inch. This image is not to scale with
actual inches and centimeters.

3) Using Table 5 as a guide, write out a detailed, accurate, and comprehensive sampling
protocol in your Workbook. You may want to practice different techniques before you
decide on one that works best. Be sure to include everything in Table 5. Be sure to
include exactly how you gathered your hairs, as well as how you measured them. Your
protocol can be in list form. College-level grammar and logic are essential for full points.

4) Using your protocol, measure the lengths of ten hairs from each of your two sample
locations. Record your measurements in your Workbook, including your UNITS, and the
IDENTITY of each sampled location for full points.

5) KEEP THE HAIRS that you measured. Your Instructor may require you to send in the
hairs to verify that you followed your protocol correctly.

Exercise 7: Descriptive Statistics: Sample Size
Take a look at the hair lengths that you just recorded. These numbers are called RAW
DATA. Can you see any kinds of trends or patterns in your data? Probably not.

People are not very good at looking at a bunch of numbers, and figuring out what they
mean. This is a problem. Some scientific studies create thousands of pieces of raw data.
Imagine looking at TEN THOUSAND NUMBERS, and trying to figure out what they mean.

We need to DO SOMETHING to the raw data to help us notice any interesting trends and
patterns. Essentially, we need to SIMPLIFY the data, so our brains can handle them. We
do this with DESCRIPTIVE STATISTICS, which are numerical ways to summarize data.

There are many different kinds of descriptive statistics that can be useful. Commonly,
three DIFFERENT KINDS of statistics are reported in scientific studies: Sample Size, a
Measure of Central Tendency, and a Measure of Variability.

The simplest descriptive statistic is the SAMPLE SIZE, which is often abbreviated n (an
italicized letter n). The sample size is a count of the total number of items from which you
took measurements. The Sample Size comes from the ITEM THAT YOU MEASURE, not
the MEASUREMENTS that you take. This can cause confusion, so be careful!

Look over the examples below. Make sure you understand them before continuing.

                                    What measurements    What item was    What is the
                                    were taken?          measured?        Sample Size?

a) A team of four students          lengths              fingers          40 fingers
   measures the length of all ten
   fingers on each student.

b) A dog breeder measures the       weights,             dogs             100 dogs
   weights of 100 Dalmatian dogs,   number of spots,
   the number of spots on each      kinds of food
   dog, and the kinds of food each
   dog eats during a week.

c) 200 mice were examined for       presence or          mice             200 mice
   skin tumors. 151 of the mice     absence of
   had tumors, and there were a     tumors, and
   total of 175 tumors.             number of
                                    tumors

Notice Example c in particular. The Sample Size is 200 mice. The Sample Size is NOT
151 accounts of tumors, and it is NOT 175 tumors. The MEASUREMENTS that were
taken were 151 accounts of tumors, and 175 tumors. The ITEMS that were measured
were the original 200 mice.
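
One way to keep items and measurements apart is to picture the raw data as a list with one entry per item measured. The Python sketch below does this for Example c. The split of tumors among individual mice is invented so that the totals match the example (the per-mouse data are not given), and the variable names are hypothetical.

```python
# Hypothetical raw data for Example c: one entry per mouse, holding that
# mouse's tumor count. The split (49 mice with none, 127 with one,
# 24 with two) is invented so the totals match the example.
tumors_per_mouse = [0] * 49 + [1] * 127 + [2] * 24

# The Sample Size counts the ITEMS measured: one entry per mouse.
sample_size = len(tumors_per_mouse)

# These two numbers are summaries of the MEASUREMENTS, not sample sizes.
mice_with_tumors = sum(1 for count in tumors_per_mouse if count > 0)
total_tumors = sum(tumors_per_mouse)

print("n =", sample_size, "mice")
print(mice_with_tumors, "mice had tumors;", total_tumors, "tumors in total")
```

However the tumors are distributed, the list still has one entry per mouse, so the Sample Size stays at 200 mice.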

Practice Questions:
Determine the Sample Size for each of the following. Be sure to include the UNITS in
your answer. Check your answers below.

HINT: First determine what measurements were taken, then determine what item was
measured. Realize that the MEASUREMENTS taken, and the ITEM that's measured,
can't be the same thing.

1) A student measures the thickness of 335 hairs on her head.
2) A group of 67 students each measures the thickness of five hairs on their heads, for a
total of 335 hairs measured.
3) In a cancer study, 342 tumors were examined in mice. All tumors were between 2 and
5 mm in width, and 165 were malignant.
4) Earth is 4 billion years old. Its circumference is 40,234 km, and it's moving around the
sun at 107,826 km per hour.
5) The kinds of boats, and the number of boats, were surveyed on 10 different lakes.
There were 32 sailboats, 50 canoes, and 1 paddle boat.

Answers to Practice Questions:
1) The measurements were "thicknesses". The items measured were "hairs". The sample
size is "335 hairs".
2) The measurements were "thicknesses". The items measured were "hairs". The sample
size is "335 hairs". The other information is irrelevant.
3) The measurements were "widths", and "malignancy status". The items measured were
"tumors". The sample size is "342 tumors".
4) The measurements were "age", "circumference", and "speed". The item measured
was "Earth". The sample size is "1 Earth".
5) The measurements taken were "kinds of boats", and "number of boats". The items
measured were "lakes". The sample size is "10 lakes". The other information is irrelevant.

Workbook Questions:
There are no Workbook Questions for this Exercise. However, if you don't understand the
Practice Questions, you may have difficulty on later portions of this lab.

Exercise 8: Descriptive Statistics: Measures of Central Tendency


Raw data often tend to cluster around some middle value, with only a few values that
are much larger or much smaller. For example, if all students at the College of Southern
Nevada lined up by height, most students would probably be close to some middle
height, with just a few students that were very tall, or very short.

Since many kinds of raw data behave this way, it is useful to have some way to identify
the middle value. MEASURES OF CENTRAL TENDENCY are descriptive statistics that
find the middle value in a set of raw data. There are many different measures of central
tendency, although the best one is usually the MEAN. The Mean is also called the
AVERAGE.

The Mean is easy to calculate. Simply add up all of the raw data, then divide by the
Sample Size.
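
If you choose to use software, the same two steps (add up the raw data, then divide by the Sample Size) look like this in Python. The hair lengths below are made-up example values, not data from this lab, and the variable names are hypothetical.

```python
# Made-up hair lengths in millimeters (hypothetical example data).
hair_lengths_mm = [12, 15, 9, 14, 11, 13, 10, 16, 12, 14]

sample_size = len(hair_lengths_mm)   # n: the number of hairs measured
total = sum(hair_lengths_mm)         # add up all of the raw data
mean = total / sample_size           # divide by the Sample Size

print("n =", sample_size, "hairs")
print("Mean =", round(mean, 1), "mm")
```

Note that the answer carries the same units as the raw data: a Mean of lengths measured in mm is itself in mm.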

Practice Questions:
Answer the following questions. Round answers to tenths. Be sure to include UNITS in
your answers. Check your answers below. You can calculate Means by hand, with a
calculator, with a spreadsheet program such as Excel, or with an online calculator such
as www.easycalculation.com.

1) Eight students took a 100-point exam. Their scores were 56, 75, 99, 43, 55, 65, 12,
and 50 points. What was the Mean score?
2) Three students each measure the lengths of 2 hairs on their heads. The lengths were:
10, 11, 30, 36, 16, and 19 cm. What is the average hair length?
3) Mercury is about 58 Gigameters from the sun. Pluto is about 5,913 Gigameters from
the sun. What is the Mean distance from the sun?
4) The Earth is about 150 Gigameters from the sun. Is it closer to the sun than the Mean
distance (from Question 3), or farther from the sun than the Mean distance?

Answers to Practice Questions:
1) 56.9 points
2) 20.3 cm
3) 2,985.5 Gm
4) Closer

Workbook Questions:
There are no Workbook Questions for this Exercise. However, if you don't understand the
Practice Questions, you may have difficulty on later portions of this lab.

Exercise 9: Descriptive Statistics: Measures of Variability


Look at these two sets of raw data:

- Data Set A: 1, 2, 50, 98, 99: Mean = 50
- Data Set B: 48, 49, 50, 51, 52: Mean = 50

The Mean of Data Set A is 50. Notice that the raw data are quite variable. They go all the
way down to 1, and all the way up to 99. The Mean of Data Set B is also 50, but the raw
data are NOT very variable. They only go down to 48, and only up to 52.

In other words, the VARIABILITY of Set A is greater than the variability of Set B.

MEASURES OF VARIABILITY indicate how variable raw data are. They indicate whether
all of the raw data values are SIMILAR to each other, or are all DIFFERENT from each
other.

How can we measure variability? The easiest measurement to calculate is the RANGE.
It's simply the highest number in the data set minus the lowest number. So, the Range for
Data Set A is 99-1, which is 98. The Range for Data Set B is 52-48, which is 4. Since
Data Set A has a larger Range, we can say that it is more VARIABLE than Data Set B.
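
The Range calculation for both data sets can be sketched in Python as follows. The function name data_range is made up for this illustration.

```python
data_set_a = [1, 2, 50, 98, 99]
data_set_b = [48, 49, 50, 51, 52]

def data_range(values):
    """Return the highest number in the data set minus the lowest number."""
    return max(values) - min(values)

print("Range of Data Set A:", data_range(data_set_a))  # 99 - 1 = 98
print("Range of Data Set B:", data_range(data_set_b))  # 52 - 48 = 4
```

Notice that only two numbers from each data set are used. Every value between the highest and the lowest is ignored.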

The Range is easy to calculate, but for a variety of mathematical reasons, it's not a very
good measure of variability. A better measure of variability is something called the
Average Deviation. The Average Deviation is the AVERAGE amount that points in a data
set DEVIATE from the MEAN of that data set. For example, let's look at Data Set B again:

Data Set B: 48, 49, 50, 51, 52: Mean = 50

How far does each point in the set deviate from the mean? Let's take a look.

Data Point   Mean   How far Away?
48           50     2 (in other words, it's two points away from the mean)
49           50     1
50           50     0
51           50     1
52           50     2

We have now figured out how much each data point DEVIATES from the mean. These
new numbers are our DEVIATIONS. Now, we just need to take the AVERAGE of these
deviations, which is:

(2+1+0+1+2) divided by 5 = 1.2 points.

1.2 is the Average Deviation from the mean. Our mean for this example was 50. So, on
average, our data points deviate from 50 by 1.2 points.

In other words, on average, our data points go DOWN from the mean to about 48.8 (i.e.
50-1.2), and UP from the mean to 51.2 (i.e. 50+1.2). Or, to say it another way, both 48.8
and 51.2 represent data points which deviate from the mean by an average amount in this
set of data.

So, 50 is the best measurement of a middle point for this set of data, while 48.8 and 51.2
represent good measurements of how much the other data points DEVIATE from this
mean. Since 48.8 and 51.2 are pretty close to 50, we can see that our data points don't
deviate very far from the mean. In other words, this data set isn't very variable.
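
The Average Deviation steps for Data Set B can be sketched in Python as follows. The function names are made up for this illustration.

```python
data_set_b = [48, 49, 50, 51, 52]

def mean(values):
    """Add up all of the values, then divide by how many there are."""
    return sum(values) / len(values)

def average_deviation(values):
    """Average distance of the data points from their mean."""
    center = mean(values)
    # How far away is each data point from the mean? (Distances are never negative.)
    deviations = [abs(value - center) for value in values]
    return mean(deviations)

center = mean(data_set_b)
avg_dev = average_deviation(data_set_b)

print("Mean:", center)                               # 50.0
print("Average Deviation:", round(avg_dev, 1))       # 1.2
print("Down to about:", round(center - avg_dev, 1))  # 48.8
print("Up to about:", round(center + avg_dev, 1))    # 51.2
```

Running the same function on Data Set A (1, 2, 50, 98, 99) gives a much larger Average Deviation, which matches the earlier observation that Set A is more variable.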

Now you understand the statistic called the AVERAGE DEVIATION FROM THE MEAN,
and the logic behind it. The AVERAGE DEVIATION from the mean is a good measure of
variability.

But wait! There are some mathematical reasons why the Average Deviation isn't the
BEST way to measure variability. Instead, there is something called the STANDARD
DEVIATION. It is ALMOST the same as the Average Deviation. The logic behind it is the
same. But, there's a bit of mathematical trickery that goes on when calculating the
Standard Deviation, and it's beyond the scope of this course.

Just take it for granted that the Standard Deviation is better at capturing the concept of
variability than the Average Deviation, even though it's VERY similar. The Standard
Deviation is too complicated to calculate by hand. Luckily, many calculators can do it
for you. All you need to remember is the LOGIC behind what the Standard Deviation
means.
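
If you have access to Python, its built-in statistics module is one more tool that can do the calculation for you. The sketch below is an illustration, not a required method. It uses statistics.stdev, which computes the sample Standard Deviation (it divides by n - 1); it is an assumption that this matches the function on your calculator or spreadsheet, so check which version your tool uses. The function name summarize is made up.

```python
import statistics

data_set_a = [1, 2, 50, 98, 99]
data_set_b = [48, 49, 50, 51, 52]

def summarize(values):
    """Return the Sample Size, Mean, and Standard Deviation, rounded to tenths."""
    n = len(values)
    mean = round(statistics.mean(values), 1)
    # statistics.stdev is the SAMPLE standard deviation (divides by n - 1).
    sd = round(statistics.stdev(values), 1)
    return n, mean, sd

for name, values in (("A", data_set_a), ("B", data_set_b)):
    n, mean, sd = summarize(values)
    print(f"Data Set {name}: n = {n}, Mean = {mean}, Standard Deviation = {sd}")
```

For Data Set B the Standard Deviation comes out at about 1.6, close to the Average Deviation of 1.2 calculated by hand. Both statistics tell the same story: Set B is not very variable, while Set A is.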

Practice Questions:
Answer the following questions. Round answers to tenths. Be sure to include units in your
answers. You can't calculate the Standard Deviation by hand. You need to use a
calculator that has a Standard Deviation function, a spreadsheet program such as Excel,
or an online calculator such as www.easycalculation.com. Check your answers below.

Ten students in a college Biology class are the following ages: 17, 20, 20, 21, 18, 18, 18,
18, 20, and 17 years.

1) What is the Sample Size?
2) What is the Mean age for these students?
3) What is the Standard Deviation in age for these students?
4) On average, ages in this class go down from the mean to about _______ years.
5) On average, ages in this class go up from the mean to about ______ years.
6) Considering that college students can be any age from early teens to late in life, would
you say that the ages of these students in this particular class ARE very variable, or
are NOT very variable?

A professor asked each one of his students to record the time they spent studying for the
final exam. He summarized the Raw Data with the following statistics: n = 35 students,
Mean time spent studying = 5 hours. On average, study time in this class went up to
about 9.5 hours, and down to about 0.5 hours.

7) What is the Sample Size?


8) What is the Standard Deviation?
9) Would you say that the study times of these students ARE very variable, or are NOT
very variable?

Answers to Practice Questions:


1) ten students

2) 18.7 years
3) 1.4 years
4) 17.3 years
5) 20.1 years
6) The ages in this class are not very variable.
7) 35 students
8) 4.5 hours. How can you figure this out? If the average study time is 5 hours, and 5 plus
one Standard Deviation is 9.5 hours, then the Standard Deviation must be 4.5 hours.
Additionally, if 5 minus one Standard Deviation is 0.5 hours, then this also indicates
that the Standard Deviation must be 4.5 hours.
9) The study time is very variable. Although students studied on average 5 hours, some
studied for almost twice as long, while others studied for only half an hour.
Incidentally, this example shows that time is not always measured using the Metric
system, even by scientists. Scientists tend to use the Metric system for measurements
less than a second (such as milliseconds), but not for measurements greater than a
second. Scientists still use terms like minutes and hours. It's very rare that you'll hear
anyone talk about kiloseconds or megaseconds.
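If you would like to check the answers above with software, the following is a minimal sketch in Python. It assumes the "sample" version of the Standard Deviation (Python's statistics.stdev), which is the version that reproduces the answer of 1.4 years. The variable names are our own.

```python
import statistics

# Ages from Practice Questions 1-6, in years.
ages = [17, 20, 20, 21, 18, 18, 18, 18, 20, 17]

n = len(ages)                           # Sample Size
mean_r = round(statistics.mean(ages), 1)   # Mean, rounded to tenths
sd_r = round(statistics.stdev(ages), 1)    # sample Standard Deviation, rounded to tenths

# One Standard Deviation down from, and up from, the Mean.
low = round(mean_r - sd_r, 1)
high = round(mean_r + sd_r, 1)

print(n, mean_r, sd_r, low, high)       # 10 18.7 1.4 17.3 20.1

# Question 8: the Standard Deviation is the distance from the Mean
# to either end of the "up to / down to" range.
study_mean, study_high, study_low = 5.0, 9.5, 0.5   # hours
sd_from_high = study_high - study_mean
sd_from_low = study_mean - study_low
print(sd_from_high, sd_from_low)        # 4.5 4.5
```

Note that some calculators offer a second, "population" version of the Standard Deviation, which divides by n instead of n - 1. For the ages above, that version rounds to 1.3 years rather than 1.4, so a small mismatch with the answer key may simply mean your calculator is set to the other version.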

Workbook Questions:
There are no Workbook Questions for this Exercise. However, if you don't understand the
Practice Questions, you may have difficulty on later portions of this lab.

Exercise 10: Summarizing Your Hair Length Data With Statistics

Now it's time to calculate Descriptive Statistics for your hair length data. Here is a
summary of the Statistics:

Statistic            Definition                              Further Interpretation

Sample Size (n)      The total number of items from which    No further interpretation.
                     you took measurements

Mean                 A measure of central tendency           Indicates the middle of a set of raw data.
                                                             Also called the average.

Standard Deviation   A measure of variability                Indicates how far, on average, the data
                                                             points deviate from the mean.

Procedure:
1) For each of your two Sample Locations separately, calculate the Sample Size, the
Mean, and the Standard Deviation. Record your answers in your Workbook. Make sure
you indicate the IDENTITY of each sampling location, and indicate your UNITS. Round
fractional numbers to tenths.

Look at your statistics, and answer the following questions. Be sure to include the UNITS
in your answers. Record your answers in your Workbook.

2) Which kind of hair (Sample 1 or Sample 2) has a greater mean length?


3) On average, the lengths of hairs from Sample Location 1 go UP from the mean to a
length of about _______ .
4) On average, the lengths of hairs from Sample Location 1 go DOWN from the mean to a
length of about _______ .
5) On average, the lengths of hairs from Sample Location 2 go UP from the mean to a
length of about _______ .
6) On average, the lengths of hairs from Sample Location 2 go DOWN from the mean to a
length of about _______ .

Exercise 11: Figures


Summarizing Raw Data with Descriptive Statistics helps us to UNDERSTAND the data.
Descriptive Statistics can take a large volume of unwieldy numbers, and reduce them
down into a much smaller set of numbers that describe important aspects of the data.

Still, this is not enough. Even a small set of numbers can be confusing. People simply
aren't good at interpreting numbers. On the other hand, people are excellent at
interpreting visual information. Have you ever heard the phrase "a picture is worth a
thousand words"? Well, a picture is also worth about a thousand numbers.

Because humans are so adept with pictures, VISUAL SUMMARIES of Raw Data are
usually used to complement Descriptive Statistics. These visual summaries are generally
called FIGURES. Often, Figures are actually summaries of the descriptive statistics
themselves.

Suppose we have the following statistics for a data set:

n = 34, mean = 15, standard deviation = 2.5

To summarize this with a FIGURE:


- We use a vertical numerical scale (called the y-axis).
- This scale ALWAYS starts at zero, and goes up a bit farther than our largest statistic
value.
- The Mean on this scale is represented with a DOT, and the Standard Deviation is
represented with LINES, which go up from the Mean one Standard Deviation, and down
from the Mean one Standard Deviation.
- The Sample Size (n) is indicated underneath the Figure.

So, if we have these statistics: n = 34, mean = 15, standard deviation = 2.5

We can summarize the statistics in this Figure:

Practice Questions:
Answer the following questions, which refer to the Statistics and Figure above. Check
your answers below.

1) Why does the y-axis for this Figure begin with the number zero, instead of some other
number?
2) Why does the y-axis for this Figure end at the number 20, instead of a smaller number,
such as 16?
3) Why does the y-axis for this Figure end at the number 20, instead of a large number,
such as 100?
4) How is the Mean represented in this Figure, and what number does it sit next to on the
y-axis?
5) How is the Standard Deviation represented in this Figure?
6) Look at the line coming out of the top of the dot. What number does it sit next to on the
y-axis, and why?
7) Look at the line coming out of the bottom of the dot. What number does it sit next to on
the y-axis, and why?

Answers to Practice Questions:


1) Y-axes always start at zero. It's a rule. Starting at a different number would create a
misleading Figure.
2) The standard deviation bar would go off the edge of the Figure.
3) This would squash the Figure into a small area, and make it difficult to read. A longer
y-axis is not needed.
4) A dot, 15
5) Lines coming out of the dot
6) 17.5, this is one standard deviation up from the mean
7) 12.5, this is one standard deviation down from the mean
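The arithmetic behind this Figure can be sketched in a few lines of Python. This only computes WHERE things go on the y-axis; you still draw the Figure yourself. The function name figure_layout and the tick spacing of 5 are our own assumptions, chosen so that the axis ends at the first tick mark above the top of the Standard Deviation line.

```python
import math

def figure_layout(mean, sd, tick=5):
    """Return the y-axis positions needed to draw one data set."""
    line_top = mean + sd        # one Standard Deviation up from the Mean
    line_bottom = mean - sd     # one Standard Deviation down from the Mean
    # Assumed rule: end the axis at the first tick mark strictly above the line top.
    axis_top = (math.floor(line_top / tick) + 1) * tick
    return {
        "axis_bottom": 0,       # the y-axis always starts at zero
        "axis_top": axis_top,
        "dot": mean,            # the Mean is drawn as a dot
        "line_top": line_top,
        "line_bottom": line_bottom,
    }

layout = figure_layout(mean=15, sd=2.5)
print(layout)
# {'axis_bottom': 0, 'axis_top': 20, 'dot': 15, 'line_top': 17.5, 'line_bottom': 12.5}
```

These values match the answers above: the dot sits at 15, the lines reach 17.5 and 12.5, and the y-axis runs from 0 to 20.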

We can put data from MORE than one data set in the same Figure. To do this:

- We add a horizontal X-AXIS, which identifies the separate data sets.


- In addition, Figures always have a Figure TITLE below the Figure itself.
- The Figure Title is composed of a NUMBER, along with a DESCRIPTION.
- The description is a sentence or two long, describes the Figure, and contains the
SAMPLE SIZES for the data sets.
- Finally, a short DESCRIPTION is added to the y-axis, so that we know what the
numbers on that axis represent.

Suppose we have the following statistics for two 100-point exams, taken by Biology 100
students:

Exam 1: n = 35 students, mean = 76, standard deviation = 9.8


Exam 2: n = 34 students, mean = 68, standard deviation = 10.1

Our Figure for these statistics would be the following. (It's called Figure 2, because there
was already a Figure 1 in this lab, back in Exercise 6.)

Figure 2. Means and standard deviations for Exam 1 (n = 35 students), and Exam 2 (n = 34 students)

Practice Questions:
Answer the following questions, which refer to the Statistics and Figure above. Check
your answers below.

1) The Standard Deviation lines for Exam 1 go up to _____ points and down to _____
points.
2) The Standard Deviation lines for Exam 2 go up to _____ points and down to _____
points.

3) If you could only see Figure 2, and NOT the original statistics used to create it, would
you be able to determine the EXACT values for the means and standard deviations?
Or, would your numbers simply be ESTIMATES?

Answers to Practice Questions:


1) 85.8, 66.2
2) 78.1, 57.9
3) estimates

Workbook Questions:
There are no Workbook Questions for this Exercise. However, if you don't understand the
Practice Questions, you may have difficulty on later portions of this lab.

Exercise 12: Choosing and Using Graphics Software


In order to create Figures for this course, you can print out the blank axes and draw your
graph by hand (then submit a photo or scan of your graph) or you can use a graphics
software program. If you use a graphics software program, you will copy and paste the
blank axes into the program. If you haven't used one before, you will need to SPEND
SOME TIME practicing with it, until you are proficient.

Most PC users already have a free program called Paint installed on their computer. You
should be able to find it by clicking on the Start Menu, then using the Search function.
The icon for the program is a cup with pencils and paint brushes in it. Once you've found
Paint, you might want to right-click on it and put a shortcut to it on your desktop, so you
can easily return to it in the future.

If your PC doesn't have Paint, or if you want to use more powerful software, you can
download a free program called Paint.NET. This is available at www.getpaint.net (see
instructions in the syllabus).

If you have access to other graphics programs and know how to use them, that's fine too.
However, you CANNOT use a graphing program for this course. A graphing program
automatically creates a Figure after you type in your data.

Once you've got a graphics program up and running, play around with it. Use whatever
help menus are available with the program to figure out its various tools and abilities.

Procedure:
1) In the same location as this Investigation online, there is a file called Blank Axes.
Open this file with your graphics software. You should see a y-axis and x-axis, similar to
Figure 2 in Exercise 11 above, but without any data.

2) Use the Blank Axes to recreate, as closely as possible, Figure 2 from Exercise 11.
Your software will have tools that allow you to draw straight lines, and circles. Your
software will also allow you to type text. Do not try to draw lines, circles, or text freehand.
Find, and learn, the correct tools.

3) Finally, make sure you can save your file in either JPEG format or GIF format. When you
click to save something, you should get a drop-down menu with these format choices.

If you have difficulty, use your graphics program's help menus. You can also ask your
classmates and/or Instructor for help.

Once you're able to recreate a Figure that resembles Figure 2 from Exercise 11 above,
then you can continue with the next Exercise.

Exercise 13: Creating a Figure For Your Own Data


In Exercise 10, you summarized your hair length data with Statistics. Now you know how
to summarize these Statistics further with a Figure.

Procedure:
1) Use the Blank Axes file to create a Figure for your hair length Statistics from Exercise
10. You must draw your own Figure. You can't use software that automatically draws
Figures.

An accurate, complete Figure must have all of the following:


- Indicate means with dots, standard deviations with lines.
- Add evenly-spaced numbers to the y-axis, and use as much of the y-axis as you can.
- Add a description to the y-axis, which explains what the numbers are.
- Label the data categories on the x-axis.
- Add a Figure Title below the Figure. The Title must include the word Figure, a
number, and a description. The description must be a sentence or two long, it must
describe the Figure, and it must include the sample size (with units!) for each
separate data category.

2) Save your Figure as a JPEG or GIF file. Then, insert the Figure into the appropriate
place in your Workbook. There should be an Insert Picture command somewhere in
your Word Processing Software's menus. Or, you can try copying and pasting your
Figure into your Workbook.

Exercise 14: Determining Whether Groups are Meaningfully Different

This Investigation has covered a large amount of information. You now understand the
basic set of tools that all scientists use. This set of tools includes the Scientific Method,
the Metric System, Sampling Protocols, Descriptive Statistics, and Figures.

There's one more essential component to the process of science. How does a scientist
know if their results are meaningful?

For example, let's revisit Figure 2 from Exercise 11:

Figure 2. Means and standard deviations for Exam 1 (n = 35 students), and Exam 2 (n = 34 students)

The results in Figure 2 seem to show that students, on average, did a bit better on Exam
1, and that the scores were a bit more variable for Exam 2. From this, you might be
tempted to guess that Exam 2 was more difficult, or that students didn't study as hard for
it.

But there's another possibility. Perhaps the differences between Exam 1 and Exam 2 are
simply the result of RANDOM VARIATION, and don't indicate that there is any
IMPORTANT difference between the Exams or the students' study habits. In other words,
perhaps the differences between Exam 1 and Exam 2 are actually MEANINGLESS.

How can we tell whether these results indicate that there ARE meaningful differences
between Exam 1 and Exam 2, or that there ARE NOT meaningful differences?

This is an enormous problem in most scientific work, and the answer is not
straightforward. In fact, the entire branch of mathematics known as Statistics is largely
devoted to answering this question.

There are many specialized statistical tests which will determine whether results indicate
meaningful differences between groups or not. These tests are beyond the scope of this
course.

However, we can use a simple, largely accurate set of rules to make decisions in this
course. Here are the rules:

- If the Standard Deviation lines on a Figure DO NOT OVERLAP, or if they overlap by
only a TINY AMOUNT, then the groups probably ARE meaningfully different.
- If the Standard Deviation lines on a Figure OVERLAP BY MORE THAN A TINY
AMOUNT, then the groups probably are NOT different in any meaningful way.

For example, let's look at Figure 2 again:

Notice that the Standard Deviation lines for Exam 1 and Exam 2 overlap one another
by a large amount (from about 66 points up to about 78 points). In other words, if you
drew the Statistics for Exam 1 and Exam 2 in the same space, rather than separating
them along the x-axis, most of each line would lie on top of the other:

This indicates that the scores on Exam 1 and Exam 2 are NOT MEANINGFULLY
DIFFERENT. As far as we can tell, students did just as well on Exam 2 as on Exam 1.
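The overlap rule can also be sketched as a short calculation in Python. This is only an illustration: the rules above do not say exactly how small a "TINY AMOUNT" is, so the cutoff used here (an overlap of no more than 10% of the shorter line) is our own assumption, as are the function names.

```python
def sd_range(mean, sd):
    """Bottom and top of the Standard Deviation lines for one group."""
    return (mean - sd, mean + sd)

def overlap(range_a, range_b):
    """Length of the stretch shared by two ranges (0.0 if they don't overlap)."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return max(0.0, high - low)

exam1 = sd_range(76, 9.8)    # about 66.2 to 85.8 points
exam2 = sd_range(68, 10.1)   # about 57.9 to 78.1 points

shared = overlap(exam1, exam2)
shorter = min(exam1[1] - exam1[0], exam2[1] - exam2[0])
fraction = shared / shorter

TINY = 0.10  # hypothetical cutoff for a "tiny" overlap; not defined by the lab
meaningfully_different = fraction <= TINY

print(round(shared, 1))          # 11.9
print(meaningfully_different)    # False
```

The lines share about 11.9 points, which is well over half the length of the shorter line. That is much more than a tiny amount, so the rule says the two Exams are probably not meaningfully different.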

Why do overlapping Standard Deviations indicate that groups aren't meaningfully
different? It's beyond the scope of this course to explain, but you can ask your instructor
for a discussion if you'd like.

Suppose that the scores on Exam 1 and 2 had looked more like this:

The Standard Deviations for THESE two Exams don't overlap at all. If you drew them in
the same space, the Statistics for the two Exams would be completely non-overlapping:

In this case, the differences between the two Exams WOULD BE MEANINGFUL. They
would indicate that students did considerably worse on Exam 2.

It's standard practice in scientific studies to use the term "significant" instead of
"meaningful." However, the use of this word indicates that special statistical tests have
been performed on the data to determine whether they are meaningful or not. Since we
can't actually do those tests here, we'll stick with the word "meaningful."

Practice Questions:
Examine the following Figures, which represent Means and Standard Deviations for two
different groups. For each one, indicate whether the groups are meaningfully different or
not. You DON'T need any further information, such as the actual scale on the y-axis, to
determine whether the groups are meaningfully different. Check your answers below.

[Figures for Questions 1 through 5: each shows the Means and Standard Deviation lines for two groups.]

Answers to Practice Questions:


1) Not meaningfully different
2) Yes, the groups are meaningfully different
3) Yes, the groups are meaningfully different, because the standard deviations only
overlap a tiny bit
4) Not meaningfully different
5) Not meaningfully different

Workbook Questions:
There are no Workbook Questions for this Exercise. However, if you don't understand the
Practice Questions, you may have difficulty on later portions of this lab.

Exercise 15: Are Your Groups Meaningfully Different?


It's time to examine your hair length data yet again.

Workbook Question:
Look at the Figure you created for your hair length Statistics. Answer the following
question.

1) You sampled hairs from two different locations. Do the hairs from those two locations
have meaningfully different lengths? Simply answer YES or NO in your Workbook.

Exercise 16: Quiz Data

For the quiz you will be asked to answer questions about the following data on
lengths of hairs taken from two different locations on a dog:

Hair Lengths:

        Sample Location 1:    Sample Location 2:
        Dog's Back            Dog's Belly

1       60 mm                 45 mm
2       63 mm                 48 mm
3       58 mm                 58 mm
4       69 mm                 42 mm
5       65 mm                 51 mm
6       72 mm                 52 mm
7       71 mm                 35 mm
8       64 mm                 54 mm
9       59 mm                 56 mm
10      65 mm                 41 mm

For this data, answer the following questions.

1. Summarize the Quiz Data with Statistics by completing the table in the
workbook.

2. Construct a graph of the Quiz Data.

3. Is there a meaningful difference between the hair length from Location 1 and the
hair length from Location 2?

Credits:

This lab was developed by Professor Brian Wainscott, and modified by Professor Dawn
Nelson.
