
Analyzing the Data

How to analyze your survey data like a survey scientist


Now that you've collected your survey results and have a data analysis plan,
it's time to dig in and analyze the data. Here's how our Survey Research
Scientists make sense of quantitative data (versus making sense of qualitative
data), from looking at the answers and focusing on their top research questions
and survey goals, to crunching the numbers and drawing conclusions.

Take a look at your top research questions


First, let's talk about how you analyze the results for your top research
questions. Remember that you outlined your top research questions when you set a goal for your survey.
For example, if you held an education conference and gave attendees a post-event feedback survey, one of your
top research questions may look like this:
How did the attendees rate the conference overall?
Now take a look at the answers you collected for a specific survey question that speaks to that top research
question:

Notice that in the responses, you've got some percentages (71%, 18%) and some raw numbers (852, 216).
The percentages are just that: the percent of people who gave a particular answer. Put another way, the
percentages represent the number of people who gave each answer as a proportion of the number of people who
answered the question.
So, 71% of your survey respondents (852 of the 1,200 surveyed) plan on coming back next year.
This table also shows you that 18% say they are not planning to return and 11% say they are not sure.
The raw numbers are the number of individual survey respondents who gave each answer. So 852 people said,
"Yes, I'm coming back next year!"
If you assume that most of the people who said yes (and maybe some of those who said they were not sure) are
coming next year, you can build a forecasting model to estimate the number of people* who will attend next
year's conference.
*You can determine this number with more confidence if you had a very high participation rate, meaning most
of the people who attended the conference and received your survey filled it out.
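
As a quick illustration, the percentages above can be reproduced from the raw counts. This is a minimal sketch: the answer labels are assumed (the original results table is not reproduced here), and the "Not sure" count of 132 is derived as 1,200 minus 852 minus 216 rather than quoted from the survey.

```python
# Raw counts per answer. The labels are assumed; 132 is derived
# (1200 - 852 - 216), not quoted from the survey results.
counts = {"Yes": 852, "No": 216, "Not sure": 132}

# Percentages are each count as a share of everyone who answered.
total = sum(counts.values())
percentages = {answer: round(100 * n / total) for answer, n in counts.items()}

for answer, pct in percentages.items():
    print(f"{answer}: {pct}% ({counts[answer]} of {total})")
```

Note that the denominator is the number of people who answered the question, not the number of people who received the survey.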

Cross-tabulating and filtering results


Recall that when you set a goal for your survey and developed your analysis plan, you thought about what
subgroups you were going to analyze and compare. Now is when that planning pays off.
For example, say you wanted to see how teachers, students, and administrators compared to one another in
answering the question about next year's conference. To figure this out, you want to create a cross-tabulation,
showing the results of the conference question by subgroup:

From this table you see that a large majority of the students (86%) and teachers (80%) plan to come back next
year. However, the administrators who attended your conference look different, with under half (46%) of them
intending to come back! Hopefully, some of your other questions will help you figure out why this is the case and
what you can do to improve the conference for administrators so more of them will return year after year.
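
The cross-tabulation described above can be sketched in a few lines of Python. The individual records here are hypothetical: 100 respondents per subgroup, chosen only so that the "Yes" shares match the percentages quoted above (86%, 80%, 46%). The answer labels are likewise assumed.

```python
from collections import Counter

# Hypothetical data: 100 respondents per subgroup, with "Yes" counts chosen
# to match the percentages quoted in the text.
yes_counts = {"Students": 86, "Teachers": 80, "Administrators": 46}
responses = []
for group, yes in yes_counts.items():
    responses += [(group, "Yes")] * yes + [(group, "Other")] * (100 - yes)

# Count each (subgroup, answer) cell and the size of each subgroup.
cell_counts = Counter(responses)
group_sizes = Counter(group for group, _ in responses)

# Express each cell as a percentage of its own subgroup (row percentages).
crosstab = {
    group: {
        answer: 100 * cell_counts[(group, answer)] / group_sizes[group]
        for answer in ("Yes", "Other")
    }
    for group in group_sizes
}

for group, row in crosstab.items():
    print(f"{group:15s} Yes: {row['Yes']:.0f}%  Other: {row['Other']:.0f}%")
```

Percentages are computed within each subgroup, which is what lets you compare groups of different sizes.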
Using a filter is another useful tool for analyzing data. Filtering means narrowing your focus to one particular
subgroup, and filtering out the others. So instead of comparing subgroups to one another, here we're just looking
at how one subgroup answered the question.
For instance, you could limit your focus to just women, or just men, then re-run the crosstab by type of attendee to
compare female administrators, female teachers, and female students. One thing to be wary of as you slice and
dice your results: every time you apply a filter or crosstab, the sample size in each group you examine decreases.
To make sure your results are statistically significant, it may be helpful to use a sample size calculator.
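
To see how quickly filtering shrinks the sample, here is a small sketch with entirely hypothetical respondents (the `Respondent` fields and all counts are invented for illustration):

```python
from collections import namedtuple

Respondent = namedtuple("Respondent", ["role", "gender", "returning"])

def make(role, gender, returning, n):
    """Create n identical hypothetical respondents."""
    return [Respondent(role, gender, returning)] * n

# Hypothetical survey of 100 people.
respondents = (
    make("teacher", "female", True, 30) + make("teacher", "female", False, 10)
    + make("teacher", "male", True, 25) + make("teacher", "male", False, 5)
    + make("administrator", "female", True, 6)
    + make("administrator", "female", False, 9)
    + make("administrator", "male", True, 8)
    + make("administrator", "male", False, 7)
)

# Each filter keeps only the rows matching one more condition.
women = [r for r in respondents if r.gender == "female"]
female_admins = [r for r in women if r.role == "administrator"]

for label, group in [("Everyone", respondents), ("Women", women),
                     ("Female administrators", female_admins)]:
    share = 100 * sum(r.returning for r in group) / len(group)
    print(f"{label}: n={len(group)}, {share:.0f}% plan to return")
```

In this made-up data, two filters reduce 100 respondents to 15, so any percentage reported for the smallest group rests on far fewer answers.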

Benchmarking, trending, and comparative data


Let's say on your conference feedback survey, one key question is, "Overall, how satisfied were you with the
conference?" Your results show that 75% of the attendees were satisfied with the conference. That sounds pretty
good. But wouldn't you like to have some context? Something to compare it against? Is that better or worse than
last year? How does it compare to other conferences?
Well, say you did ask this question in your conference feedback survey after last year's conference. You'd be able
to make a trend comparison. Professional pollsters make poor comedians, but one favorite line is "trend is your
friend."
If last year's satisfaction rate was 60%, you increased satisfaction by 15 percentage points! What caused this
increase in satisfaction? Hopefully the responses to other questions in your survey will provide some answers.
If you don't have data from prior years' conferences, make this the year you start collecting feedback after every
conference. This is called benchmarking. You establish a benchmark or baseline number and, moving forward,
you can see whether and how this has changed. You can benchmark not just attendees' satisfaction, but other
questions as well. You'll be able to track, year after year, what attendees think of the conference. This is called
longitudinal data analysis.
Learn more about how SurveyMonkey Benchmarks can help give your survey results context.

What is longitudinal analysis?


Longitudinal data analysis (often called trend analysis) is basically tracking how findings for specific questions
change over time. Once a benchmark is established, you can determine whether and how numbers shift. Suppose
the satisfaction rate for your conference was 50% three years ago, 55% two years ago, 65% last year, and 75%
this year. Congratulations are in order! Your longitudinal data analysis shows a solid, upward trend in satisfaction.
You can even track data for different subgroups. Say for example that satisfaction rates are increasing year over
year for students and teachers, but not for administrators. You might want to look at administrators' responses to
various questions to see if you can gain insight into why they are less satisfied than other attendees.
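
The trend in the satisfaction example above can be expressed as year-over-year changes in percentage points. A minimal sketch, using the four satisfaction rates quoted in this section (the dictionary keys are just labels):

```python
# Satisfaction rates (percent) from the example, oldest first.
satisfaction = {
    "three years ago": 50,
    "two years ago": 55,
    "last year": 65,
    "this year": 75,
}

values = list(satisfaction.values())

# Change between each pair of consecutive years, in percentage points.
changes = [later - earlier for earlier, later in zip(values, values[1:])]

# Change since the benchmark (the first year measured).
change_since_benchmark = values[-1] - values[0]

print("Year-over-year changes (points):", changes)
print("Change since benchmark (points):", change_since_benchmark)
```

Differences between percentages are reported in percentage points: moving from 50% to 75% is a gain of 25 points, not 25 percent.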

Crunching the numbers


You know how many people said they were coming back, but how do you know if your survey has yielded
answers that you can trust and answers that you can use with confidence to inform future decisions? It's
important to pay attention to the quality of your data and to understand the components of statistical significance.
In everyday conversation, the word "significant" means important or meaningful. In survey analysis and statistics,
"significant" means an assessment of accuracy. This is where the inevitable "plus or minus" comes into survey
work. In particular, it means that survey results are accurate within a certain confidence level and not due to
random chance. Drawing conclusions based on results that are inaccurate (i.e., not statistically significant) is
risky.
The first factor to consider in any assessment of statistical significance is the representativeness of your sample,
that is, to what extent the group of people who were included in your survey looks like the total population of
people about whom you want to draw conclusions.
You have a problem if 90% of conference attendees who completed the survey were men, but only 15% of all
your conference attendees were male. The more you know about the population you are interested in studying, the
more confident you can be when your survey lines up with those numbers. At least when it comes to gender,
you're feeling pretty good if men make up 15% of survey respondents in this example.
If your survey sample is a random selection from a known population, statistical significance can be calculated in
a straightforward manner. A primary factor here is sample size. Suppose 50 of the 1,000 people who attended
your conference replied to the survey. Fifty (50) is a small sample size and results in a broad margin of error. In
short, your results won't carry much weight.
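
To put a number on "broad margin of error," here is a rough sketch using the standard margin-of-error formula for a proportion with a finite population correction. The formula choice, the 95% confidence level (z = 1.96), and the worst-case proportion of 0.5 are assumptions made for illustration. They are not stated in the text above, and the formula only applies to a simple random sample.

```python
import math

def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    Assumes a 95% confidence level (z = 1.96) and the worst-case proportion
    of 0.5 by default, and applies a finite population correction.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    correction = math.sqrt(
        (population_size - sample_size) / (population_size - 1)
    )
    return z * standard_error * correction

# The example from the text: 50 responses out of 1,000 attendees.
moe_small = margin_of_error(50, 1000)
# A hypothetical larger response for comparison.
moe_large = margin_of_error(500, 1000)

print(f"n=50:  +/- {100 * moe_small:.1f} percentage points")
print(f"n=500: +/- {100 * moe_large:.1f} percentage points")
```

Under these assumptions, 50 responses out of 1,000 give a margin of error of roughly 13 to 14 percentage points, so a reported 71% could plausibly sit anywhere from the high 50s to the mid 80s.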

Say you asked your survey respondents how many of the 10 available sessions they attended over the course of
the conference. And your results look like this:

You might want to analyze the average. As you may recall, there are three different kinds of averages: mean,
median and mode.
In the table above, the average number of sessions attended is 6.3. The average reported here is the mean, the kind
of average that's probably most familiar to you. To determine the mean you add up the data and divide that by the
number of figures you added. In this example, you have 10 people saying they attended one session, 50 people for
four sessions, 100 people for five sessions, etc. So, you multiply all of these pairs together, sum them up, and
divide by the total number of people.
The median is another kind of average. The median is the middle value, the 50% mark. In the table above, we
would locate the number of sessions where 500 people were to the left of the number and 500 to the right. The
median is, in this case, 7 sessions. Using the median can help reduce the influence of outliers, which may
otherwise distort your results.
The last kind of average is the mode. The mode is the most frequent response. In this case the answer is six: 260
survey participants attended six sessions, more than attended any other number of sessions.
Means (and other types of averages) can also be used if your results were based on Likert scales.
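
The three averages described in this section can be computed from a frequency table as follows. The table in this sketch is hypothetical: it uses 20 respondents, with counts chosen so that the results happen to match the mean (6.3), median (7), and mode (6) reported above. It is not the original survey table.

```python
import statistics

# Hypothetical frequency table: sessions attended -> number of respondents.
# This is NOT the original survey table; the counts were chosen so the
# averages match the values reported in the text.
sessions_table = {1: 2, 2: 1, 3: 2, 6: 4, 7: 3, 8: 3, 9: 3, 10: 2}

total_people = sum(sessions_table.values())

# Mean: multiply each value by its count, sum the products, divide by n.
mean = sum(sessions * people
           for sessions, people in sessions_table.items()) / total_people

# Expand the table into one entry per respondent for median and mode.
all_answers = [sessions
               for sessions, people in sorted(sessions_table.items())
               for _ in range(people)]

median = statistics.median(all_answers)  # middle value of the sorted answers
mode = statistics.mode(all_answers)      # most frequent answer

print(f"mean={mean:.1f}, median={median}, mode={mode}")
```

The mean is pulled down by the few respondents who attended only one to three sessions, while the median stays at the middle respondent, which is why the two can differ.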

Drawing conclusions


When it comes to reporting on survey results, think about the story the data tells.
Say your conference overall got mediocre ratings. You dig deeper to find out what's going on. The data show
that attendees gave very high ratings to almost all the aspects of your conference (the sessions and classes, the
social events, and the hotel), but they really disliked the city chosen for the conference. (Maybe the conference
was held in Chicago in January and it was too cold for anyone to go outside!) That is part of the story right there:
great conference overall, lousy choice of location. Miami or San Diego might be a better choice for a winter
conference.
One aspect of data analysis and reporting you have to consider is causation vs. correlation.

What is the difference between correlation and causation?


Causation is when one factor causes another, while correlation is when two variables move together. Correlation
alone does not show that one variable influences or causes the other.
For example, drinking hot chocolate and wearing mittens are two variables that are correlated: they tend to go
up and down together. However, one does not cause the other. In fact, they are both caused by a third factor, cold
weather. Cold weather influences both hot chocolate consumption and the likelihood of wearing mittens.
Cold weather is the independent variable and hot chocolate consumption and the likelihood of wearing mittens are
the dependent variables.
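
The hot chocolate and mittens example can be simulated to show how a shared cause produces correlation. Everything in this sketch is invented for illustration: the temperatures, the formulas linking temperature to each variable, and the noise levels. The key point is that neither variable appears in the other's formula, yet the two still correlate strongly.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Hypothetical daily temperatures (degrees Celsius).
temperatures = list(range(-10, 31, 2))

# Both outcomes depend only on temperature plus random noise.
hot_chocolate = [100 - 2 * t + rng.gauss(0, 5) for t in temperatures]
mittens = [80 - 1.5 * t + rng.gauss(0, 5) for t in temperatures]

r = pearson(hot_chocolate, mittens)
print(f"Correlation between hot chocolate and mittens: {r:.2f}")
```

A strong correlation appears even though banning mittens would do nothing to hot chocolate sales. Only the temperature drives both.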
In the case of our conference feedback survey, cold weather likely influenced attendees' dissatisfaction with the
conference city and the conference overall.
Finally, to further examine the relationship between variables in your survey you might need to perform a
regression analysis.

What is regression analysis?


Regression analysis is an advanced method of data analysis that allows you to look at the relationship between
two or more variables. There are many types of regression analysis and the one(s) a survey scientist chooses will
depend on the variables they are examining. What all types of regression analysis have in common is that they
look at the influence of one or more independent variables on a dependent variable.
In analyzing our survey data we might be interested in knowing what factors most impact attendees' satisfaction
with the conference. Is it a matter of the number of sessions? The keynote speaker? The social events? The site?
Using regression analysis, a survey scientist can determine whether and to what extent satisfaction with these
different attributes of the conference contribute to overall satisfaction. This, in turn, provides insight into what
aspects of the conference you might want to alter next time around.
Say, for example, you paid a high honorarium to get a top-flight keynote speaker for your opening session.
Participants gave this speaker and the conference overall high marks. Based on these two facts you might think
that having a fabulous (and expensive) keynote speaker is the key to conference success.
Regression analysis can help you determine if this is indeed the case. You might find that the popularity of the
keynote speaker was a major driver of satisfaction with the conference. If so, next year you'll want to get a great
keynote speaker again. But say the regression shows that, while everyone liked the speaker, this did not contribute
much to attendees' satisfaction with the conference. If that is the case, the big bucks spent on the speaker might be
better spent elsewhere.
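
To make the keynote example concrete, here is a small sketch of a linear regression with two predictors, fitted by ordinary least squares in plain Python. In practice a survey scientist would more likely use a statistics package. The ratings below are hypothetical and noise-free: overall satisfaction was generated as 1 + 0.1 x keynote rating + 0.7 x sessions rating, so the fit should recover a small keynote effect and a large sessions effect.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0, *row] for row in rows]  # prepend an intercept column
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    coef = [0.0] * p
    for i in reversed(range(p)):
        known = sum(A[i][j] * coef[j] for j in range(i + 1, p))
        coef[i] = (A[i][p] - known) / A[i][i]
    return coef

# Hypothetical ratings on a 1-5 scale: (keynote, sessions) per respondent.
ratings = [(5, 2), (4, 4), (5, 3), (3, 5), (4, 1), (5, 5), (2, 3), (4, 2)]
# Hypothetical, noise-free overall satisfaction built from a known formula.
overall = [1 + 0.1 * keynote + 0.7 * sessions for keynote, sessions in ratings]

intercept, keynote_effect, sessions_effect = fit_ols(ratings, overall)
print(f"keynote effect:  {keynote_effect:.2f}")
print(f"sessions effect: {sessions_effect:.2f}")
```

In this made-up data, everyone can rate the keynote highly while the sessions rating still explains most of the variation in overall satisfaction, which is the situation described above. Real survey data would include noise, so the estimates would come with uncertainty.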
If you take the time to carefully analyze the soundness of your survey data, you'll be on your way to using the
answers to help you make informed decisions.
