KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY, KUMASI

(MSc. RENEWABLE ENERGY TECHNOLOGY, 2011/2012)

RET 560: Research Methods


[Credit: 3]

Prof. Abeeku BREW-HAMMOND

Publisher's Information

IDL, 2009
All rights reserved. No part of this book may be reproduced or utilized in any form or by any
means, electronic or mechanical, including photocopying, recording or by any information
storage and retrieval system, without permission from the copyright holders.
For any information contact:
Dean
Institute of Distance Learning
New Library Building
Kwame Nkrumah University of Science and Technology
Kumasi, Ghana

Phone:

+233-51-60013
+233-51-61287
+233-51-60023

Fax:

+233-51-60014

E-mail:

idldean@kvcit.org
idloffice@kvcit.org
kvcit@idl-knust.edu.gh
kvcitavu@yahoo.ca

Web:

www.idl-knust.edu.gh
www.kvcit.org

ISBN:
Editors:

Publisher's notes to the learners:

1. Icons: The following icons have been used to give readers quick access to where similar
information may be found in the text of this course material. Writers may use them as and when
necessary in their writing. Facilitators and learners should take note of them.

Icon #1

Icon #2

Icon #3

Icon #4

Icon #5

Review
Learning Objective

Learning Activity

Unit Assignments

Icon #6

Icon #7

Icon #8

Icon #9

Icon #10

Group Discussion

Read

Time For Activity

Self Assessment

New Terms

Icon #11

Icon #12

Icon #13

Icon #14

Icon #15

Pause

Interactive CD

Online

Answer Tips

Note/Learning Tip

Summary

2. Guidelines for making use of learning support (virtual classroom, etc.)

This course material is also available online at the virtual classroom (v-classroom) Learning
Management System. You may access it at www.kvcit.org

Course Writers
Abeeku BREW-HAMMOND
Associate Professor of Mechanical Engineering
Director of Energy Centre at KNUST
College of Engineering, KNUST

David Ato Quansah
Lecturer
Mechanical Engineering
College of Engineering, KNUST

Owusu Amponsah
Lecturer
Department of Planning
College of Architecture and Planning, KNUST

Wahib Faisal Adams
Lecturer
Mechanical Engineering
College of Engineering, KNUST

Acknowledgement
The authors are indebted to Dr Gabriel Takyi, Lecturer in the Department of Mechanical
Engineering, for managing the whole of the MSc RETS e-Learning programme, including the
course materials development process.
Thanks also go to Mr Ebenezer Nyarko Kumi for invaluable assistance to Prof Abeeku
Brew-Hammond in the writing of the second half of this document.

Course Introduction
This course forms part of the Master of Science Degree Programme in Renewable Energy
Technologies via E-Learning. It is a 3 credit-hour course with 2 hours of teaching and 2 hours
of tutorials per week. The programme is hosted by the Department of Mechanical Engineering
under the auspices of The Energy Centre, KNUST.

COURSE OVERVIEW
Research methods in engineering and the physical sciences: design of experiments,
instrumentation, data acquisition and analysis, error analysis, mathematical modelling and
computer simulation, statistical analysis, interpretation and presentation of experimental results
and simulations; research methodology in the social sciences: qualitative and quantitative
research, design of surveys and questionnaires, case study design, sampling and interview
techniques, analytical techniques (analysis of variance, analytic generalisation, etc.); preparation
of research proposals including thesis research design, reporting and publication of findings
(thesis writing, preparation of conference papers and journal articles, posters, etc.), critical
reviews of journal papers and other publications, oral presentations using PowerPoint; software
applications for data analysis (SPSS, STATA, etc.).

COURSE OBJECTIVES
By the end of the course the student should be able to do the following:
1. Develop Methodology for Research Projects/Thesis Research involving
Engineering/Physical Science and Social Science Research Methods;
2. Write His/Her Thesis Synopsis and Research Proposals/Concept Notes; and
3. Write Journal/Conference Papers.

COURSE OUTLINE
Unit 1: Introduction to Research Proposals and Thesis Synopsis
Unit 2: Engineering Research Design and Data Analysis
Unit 3: Social Science Research Design and Data Analysis
Unit 4: Statistical Analysis with STATA and SPSS
Unit 5: Introduction to writing of Journal Articles, Conference Papers and Theses

COURSE STUDY GUIDE


Week #
1
2
3
4
5
6
7
8
9
10
11
12
13
14

Unit/Session
General Introduction + Unit 1
Unit 2
Unit 3
Unit 3 Contd
Unit 4
Unit 4 Contd
Unit 5
Unit 6
Unit 6 Contd
Unit 7

FFFS/Practical/Exam/Quiz
Take-Home Quiz/Exercise No. 1 10%

Take-Home Quiz/Exercise No. 2 10%


Tutorial to review Units 1 - 4
Take-Home Exercise No. 3 10%
Take-Home Exercise No. 4 10%
Take-Home Exercise No. 5 10%
Tutorial to review Units 5 - 7
Final Written Examination on All Units 60%
Mini-Project Presentation 20%

GRADING
Continuous assessment: 30%
End of semester examination: 70%

RESOURCES
You will require a basic knowledge of engineering science and mathematics as well as access to
the internet and a computer for this course.

READING LIST
1. Journal Articles, Recommended Textbooks, etc.
Annabel, B.K. (2006). Using interviews as research instruments, Language Institute
Chulalongkorn University publications.
Beavon, J. R. (2009). The origins of experimental error. Retrieved August 5, 2010, from
http://home.clara.net/rod.beavon/err_orig.htm
Becker, H. S. and Pamela, R. (Eds) 1986. Writing for Social Scientists: How To Start And Finish
Your Thesis, Book, Or Article. London: University of Chicago Press Ltd.
Bell, J. (2004). Doing Your Research Project: A Guide for First-time Researchers in Education
and Social Science, 3rd edn. Berkshire, UK: Open University Press.
Bell, J. (2010). Doing Your Research Project: a Guide For First-time Researchers in Education
and Social Science. 5th edn. Maidenhead: Open University Press
Brian, Allison (Eds.) 1996, 1998, 2000. Research Skills for Students. London: Kogan Page
Limited.
Chapin, P. G. (2004). Research Projects and Research Proposals; A Guide to Scientists Seeking
Funding. UK: Cambridge University Press.
Colleen, H. (2009). Researcher as Goldilocks. Bournemouth University. International Journal of
Evidence Based Coaching and Mentoring, Special Issue 3:11-19.
Dawson C. (2009). Introduction To Research Methods; A Practical Guide For Anyone
Undertaking A Research Project, 4th Edition. UK: How to Contents.
Denscombe, M. (2010) The good research guide. 4th edn. Maidenhead: Open University Press
Duane, D. (2000). Introduction to Measurements & Error Analysis. Retrieved February 12,
2012, from The University of North Carolina at Chapel Hill, Department of Physics and
Astronomy : http://www.physics.unc.edu/~deardorf/uncertainty/UNCguide.html
Eade, Deborah (Ed.) 2003. Development Methods and Approaches: Critical Reflections. Oxford;
OXFAM GB.
Eric M. Uslaner, (December 1999). Brief Guide to STATA Commands

emathzone. (2012). Continuous Random Variable. Retrieved Feb 2012, from emathzone:
http://www.emathzone.com/tutorials/basic-statistics/continuous-random-variable.html
Frankfort-Nachmias, C. and Nachmias, D. (1996). Research Methods in Social Science, 5th
Edition, New York, St. Martin's Press Inc.
Gagnon, S. (Undated). How cold is liquid nitrogen? Retrieved from Jefferson Lab:
http://education.jlab.org/qa/liquidnitrogen_01.html
Ghanfoor A. (2006). Manual for synopsis and thesis preparation. University of Agriculture,
Faisalabad, Pakistan.
Harrison, D. M. (2008). Error Analysis in Experimental Physical Science. Retrieved September
25, 2010, from University of Toronto:
http://www.upscale.utoronto.ca/PVB/Harrison/ErrorAnalysis/
Hart, C. (1998) Doing a literature review: releasing the social science imagination. Thousand
Oaks, Sage
Harvey, G. (1998) Writing with sources: a guide for students. Indiana: Hackett Publishing
Ivan Iachine, Lars Korsholm, Henrik Støvring, Kirstin Vach, Werner Vach (2004). Stata Reference
Manual
James H. Stock and Mark W. Watson, (2003). Introduction to Econometrics
Julie Pallant, (2002). A step by step guide to data analysis using SPSS for Windows
School of Graduate Studies-KNUST. (undated). Manual for thesis preparation for Masters and
Doctoral degrees awarded by the Kwame Nkrumah University of Science and Technology.
School of Graduate Studies, KNUST, Kumasi, Ghana.
Kenneth L. Simons, (2010). Useful Stata Commands
Kumekpor, T.K.B. (2002). Research Methods and Techniques of Social Research, Accra,
SunLife Publications.
Kurt Schmidheiny, (2008). Short Guides to Microeconometrics, Universitat Pompeu Fabra
Lazić, Ž. R. (2004). Design of experiments in chemical engineering: a practical guide,
Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim
Lester, J. (2005) Writing research papers: a complete guide. 11th edn. New York, Longman
Manfred W. Keil, (2010). STATA 10 Tutorial
Montgomery, D. C., Runger, G. C., & Hubele, N. F. (2000). Engineering Statistics (2nd ed.).
New York: John Wiley & Sons, Inc.
Moore, N. (undated). How to do Research. Third Edition. London: Library Association
Publishing.
Narasimhan, B. (1996). The Normal Distribution. Retrieved Jan 30, 2012, from Stanford
University : http://www-stat.stanford.edu/~naras/jsm/NormalDensity/NormalDensity.html
Neale, P., Thapa, S. and Boyce, C. (2006). Preparing a Case Study: a guide for designing and
conducting a Case Study for Evaluation Input, pathfinder International, Watertown,
Massachusetts.
Neville C. (2010). The Complete Guide To Referencing And Avoiding Plagiarism, 2nd edition.
UK: Open University Press.
Nsowah-Nuamah, N.N.N. (2005). A Handbook of Descriptive Statistics for Social and Biological
Sciences. Accra: Acadec Press.
Ogden, T.E. and Goldberg, I. A. (2002). Research Proposals; A Guide To Success, 3rd Edition.
USA: Academic Press
Seawright, J. and Gerring, J. (2008). Case Selection Techniques in Case Study Research : A
Menu of Qualitative and Quantitative Options, Political Research Quarterly 2008 61: 294.
Singleton, R.A., Jr. Bruce C. S. and Straits, M.M. (1993). Approaches to Social Research.
Second Edition. Oxford University Press, New York.
Steinar, K. (1996). Interviews: An Introduction to Qualitative Research Interviewing. SAGE
Publications, California.
Susan B. Gerber, Kristin Voelkl Finn, (1999). Using SPSS For Windows. New York: State
University of New York Graduate School of Education
Taylor, J. R. (2004). An Introduction to Error Analysis: the study of uncertainties in physical
measurements. CA: University Science Books.
The Health Communication Unit (1999). Conducting Survey Research, The Health
Communication Unit, at the Centre for Health Promotion, University of Toronto.
Urdan, T. C. (2010). Statistics in Plain English. New York: Taylor & Francis Group.
WWF (2005). Logical Framework Analysis. Retrieved on 1st February, 2012 from:
http://www.artemis-services.com/downloads/logical-framework.pdf
Zaidah, Z. (2007). Case study as a research method. Universiti Teknologi Malaysia, Jurnal
Kemanusiaan bil.9, Jun.

2. Websites, CD ROMs, etc


NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/
https://classshares.student.usp.ac.fj/EN400/2007%20Lecture%20Materials/Sections%201,%202,%20and%203%20EN400.pdf
http://www.engr.sjsu.edu/bjfurman/courses/ME120/me120pdf/UncertaintyAnal.pdf
http://www.sonoma.edu/aa/gs/guidelines/toc.shtml
http://www.mhhe.com/mayfieldpub/tsw/toc.htm

Table of Contents
Publisher's Information
Course Writers
Acknowledgement
Course Introduction
Table of Contents
List of Tables
List of Figures
Unit 1
INTRODUCTION TO RESEARCH PROPOSALS AND THESIS SYNOPSES PREPARATION
SESSION 1.1: CONCEPT NOTES
1.1.1 Introduction to Concept Notes
1.1.2 Structure of Concept Notes
SESSION 1.2: RESEARCH PROPOSALS
1.2.1 Introduction to Research Proposals
1.2.2 Structure of research proposals
1.2.3 Logical framework
1.2.4 Detailed Budget
SESSION 1.3: THESIS SYNOPSES
1.3.1 Introduction to Thesis Synopses
1.3.2 Structure of Thesis Synopses
Unit 2
2.1.1 Motivation for Research in Engineering and some basic concepts
2.1.2 Classification of Engineering Experiments
2.1.3 Research Questions in Engineering
2.1.4 Experiment Design Process
2.2.1 Sources of Error in Experimental Work
2.2.1.1 Instrumental Errors - A Closer Look
2.2.2 Estimating Uncertainties
2.3.1 Probability Distributions and Standard Errors
2.3.2 Properties of Probability Density Function
2.3.3 Mean and Variance
2.3.4 The Normal Distribution (also called the bell-curve)
2.3.5 Skewed Distributions
2.3.6 Standardization and Z-Scores
2.4.1 Error of the Mean
2.4.2 Central Limit Theorem
2.4.3 The t-distribution
2.5.1 Part 1 - Examples in Normal Distributions and z-Scores
2.5.2 Part 2 - Applying Normal Distribution to Engineering Problems
Unit 3
SOCIAL SCIENCE RESEARCH DESIGN AND DATA ANALYSIS
Merits of Structured questions
Demerits of Structured questions
Merits of Structured questions
Demerits of open-ended questions
c) Contingency questions
Characteristics of a Good Sample Design
Advantages
Disadvantages
3.2.2 Purpose of Case Studies
SESSION 3.3: OTHER TYPES OF RESEARCH DESIGN
3.3.1.1 Purpose of Observational Research
3.3.1.2 Steps in carrying out Observational Research
3.3.1.3 Types of Observational Research
3.3.1.4 Limitations of Observational Research
3.3.2.1 Steps in carrying out ethnographic studies
3.3.2.2 Advantages
3.3.3.1 Purpose of Historical Research
3.3.3.2 Steps in conducting historical research
3.3.3.3 Limitations of historical research
SESSION 3.4: RESEARCH ETHICS
3.4.2 Balancing Costs and benefits in Research
3.4.3 Informed Consent
3.4.4 Competence
3.4.5 Privacy
Unit 4
STATISTICAL ANALYSIS WITH STATA AND SPSS
SESSION 4.1: INTRODUCTION TO SPSS
4.1.1 The Nature of SPSS
4.1.2 Data Management
4.1.3 Descriptive Statistics
SESSION 4.2: INTRODUCTION TO STATA
4.2.1 The Stata Environment
4.2.2 Data Management
4.2.3 Descriptive Statistics In Stata
Unit 5
INTRODUCTION TO JOURNAL ARTICLES, CONFERENCE PAPERS AND THESES WRITING
SESSION 5.1: RESEARCH AND THESIS REPORTS
5.1.1 Thesis Report Writing
5.1.2 Research Report Writing
SESSION 5.2: JOURNAL ARTICLES AND CONFERENCE PAPER PREPARATION
SESSION 5.3: ABSTRACTS AND SUMMARIES AND REFERENCING
5.3.1 Abstracts and Summaries
5.3.2 Tables and Figures
5.3.2 Referencing
5.3.3 Referencing Formats
5.3.4 Introduction to referencing software packages
COURSE SUMMARY
APPENDIX A1
APPENDIX A2
APPENDIX B

List of Tables
Table 1.1: Typical Structure of a Logical Framework
Table 1.2: Example of a Research Budget
Table 2.1: Determining the average, average deviation and standard deviation
Table 2.2: Basic rules in error propagation
Table 3.1: Advantages and Disadvantages of the Interview Methods
Table 3.2: Advantages and Disadvantages of Open-ended and Close questions
Table 3.3: Sampling techniques: Advantages and disadvantages

List of Figures
Figure 2.1: Power output vs. insolation angle for polycrystalline silicon solar panel
Figure 2.2: Power output for fixed orientation and tracking polycrystalline silicon solar panel
Figure 2.3: Determining Instrumental Limits of Error and Least Count
Figure 2.4: Plot of f(x) vs X
Figure 2.5: Samples are drawn from populations
Figure 2.6: The normal distribution is bell-shaped
Figure 2.7: Positively skewed distribution
Figure 2.8: Negatively skewed distribution
Figure 2.9: Interpreting the z-score
Figure 2.10: Distribution of the means of the samples
Figure 2.11: Average difference between expected value and sample mean

Unit 1

INTRODUCTION TO RESEARCH PROPOSALS AND THESIS SYNOPSES PREPARATION
Introduction
This unit introduces students to the basic preparations that precede a research project. These
include the preparation of concept notes, which are mostly directed at donor or funding
agencies; research proposals, which are full proposals stating the need for the research as
well as the expected results; and thesis synopses, which are essentially research proposals
written specifically for academic purposes.

Learning Objectives
After reading this unit you should be able to:
1. Write a concept note capable of securing funding for a research project, specifically your
master's research project.
2. Prepare a research proposal as well as a thesis synopsis for your master's thesis.
UNIT CONTENT
SESSION 1.1: CONCEPT NOTES
1.1.1 Introduction to Concept Notes
1.1.2 Structure of Concept Notes
SESSION 1.2: RESEARCH PROPOSALS
1.2.1 Introduction to Research Proposals
1.2.2 Structure of research proposals
1.2.3 Logical framework
1.2.4 Detailed Budget
SESSION 1.3: THESIS SYNOPSES
1.3.1 Introduction to Thesis Synopses
1.3.2 Structure of Thesis Synopses

SESSION 1.1: CONCEPT NOTES


1.1.1 Introduction to Concept Notes
A concept note is a brief summary of a proposed research project, usually prepared for
donors or sponsors. It should not be more than 550 words or 3 to 7 pages. It should outline
the background to the project and state the research problem to be investigated. It should
also give the objectives and the methodology to be used for the research, and spell out the
timeframe as well as a summary of the budget for the project.

1.1.2 Structure of Concept Notes


1.1.2.1 Research Title
The title of the research should be concise and should focus the reader's attention on the
critical theme of the proposed research. It should be short, usually not more than one line
in length, and free of unnecessary punctuation and repeated words.
1.1.2.2 Background
This contains a review of the main research work and current issues specific to the
subject area. It should also state what is already known about the research subject. It is
important to note that the background is not the same as the literature review; the
latter is not necessary for concept notes. The background is usually about 200 words in length.
1.1.2.3 Research Problem
This section should outline the research problem to be investigated clearly and without
ambiguity. It should not be more than 200 words.

1.1.2.4 Objectives
The main objectives as well as the specific objectives of the proposed research should be
clearly outlined in this section.
1.1.2.5 Methodology
This section outlines clearly the proposed methodology to be used for the research work.
It spells out exactly how the research work will be carried out and the procedures
involved. It should usually be about 100 words in length.
1.1.2.6 Project location, timeframe and budget
The proposed site for the research, the timeframe for its completion and a summary of the
budget should be clearly outlined in this section.

Self Assessment 1.1


1. What is the purpose of a concept note?
2. What are the major components of a concept note?

SESSION 1.2: RESEARCH PROPOSALS


1.2.1 Introduction to Research Proposals
A research proposal, as defined by the School of Advanced Study, University of London, is
a piece of work that, ideally, would convince scholars that your project has the following
three merits: conceptual innovation; methodological rigour; and rich substantive content.
A research proposal is therefore supposed to:
Provide a logical presentation of the research idea;
Illustrate the significance of the idea;
Relate the idea to past literature; and
Outline the activities for the proposed research project.
A research proposal may be written for any of the following reasons: to request funding
for a research project, as a task in tertiary education (in which case it is referred to as a
thesis synopsis), or as a condition for employment at a research institution.

1.2.2 Structure of research proposals


The structure of most research proposals includes a title, introduction and background,
statement and significance of the research problem, research objectives, literature review,
methodology and hypotheses, expected or preliminary results, researcher's details,
timetable, detailed budget and, finally, references.

1.2.2.1 Title
The title must be succinct and should give the reader an overview of what to expect in the
main document. It should appear on the first page of the proposal, be no longer than 20
words, and be free of unnecessary punctuation and repeated keywords.

1.2.2.2 Abstract
The abstract is a concise summary of the main points of the proposal and should be kept as
short as possible without leaving out any important point. It should be a maximum of 500
words.

1.2.2.3 Background
This contains a summary of the background information to the research problem and the
context within which the study will take place. It relates the study and the research idea
to the policy environment. It should contain what is already known about the research area
and how the research will complement what is already known. It should be a maximum of
1,000 words.

1.2.2.4 Statement and Significance of Research Problem


It is important to state clearly the research problem and the significance of the research to
the community. This section of the proposal should be able to answer questions such as:
What is going to be studied or investigated? Why is it important to study this subject? It
should not be more than 250 words in length.

1.2.2.5 Research Objectives


It is important to outline the key objective(s) of the research, which spell out what the
researcher seeks to accomplish. A single principal objective with two or three specific
objectives is usually enough. They should be listed in order of importance and should be a
maximum of 200 words.

1.2.2.6 Literature Review


The literature review should provide a brief description of the available literature in terms
of research works done, policy statements and their implications, as well as the
identification of shortfalls to be studied and complemented. This section should indicate how
existing literature contributes to the proposed research and how the proposed research is
also going to add to existing work. It should be a maximum of 3,000 words.

1.2.2.7 Methodology
A methodology is a system of methods and principles employed in performing specific
tasks; in this case a research project. The main research techniques to be used for the
project must be described in detail in this section. The methodology should also be able to
answer questions such as: will the study be based on existing information, interviews or a
combination of both? This section should also give a thorough description of the data
required, the nature of the fieldwork to be undertaken and how the data collected will be
analyzed. In the case of a survey using questionnaires, the sampling procedure as well as
the approximate sample size should be stated (a draft questionnaire may be included). It is
important to state clearly how the data collected will address the research question. It
should be a maximum of 1,000 words.

1.2.2.8 Expected/Preliminary Results


This section gives a good indication of what is expected out of the research. It joins the
data analysis and possible outcomes to the theory and questions that have been raised. It
should include the following;

Scope of inference (i.e., to what extent are the results applicable to other locations,
times, or situations?)
Pitfalls that may be encountered
Limitations to proposed methods

1.2.3 Logical framework


A logical framework is a tool that helps in the planning, monitoring and evaluation of a
project. It is an effective planning tool for defining inputs, outputs, timelines as well as
performance indicators for a particular project. It provides a structure for specifying the
components of an activity and for relating them to one another. It has the power to
communicate a project's objectives clearly and simply on a single page as well as the
ability to incorporate the full range of views of all stakeholders of a project.
The logical framework is presented in the form of a 4x4 matrix in which the overall
objectives, the project purpose, the mid-term results, and the activities of a project are
systematically presented in the first column of the matrix. The second and third columns
of the matrix present the corresponding indicators and their sources of information, while
the fourth column presents important assumptions that are beyond the direct control of the
project but need to be fulfilled in order to ensure a successful implementation of the
project. A logical framework can only be done after a thorough analysis of the problems,
objectives and strategies to be employed in the project. Table 1.1 shows a typical structure
of a logical framework.

Table 1.1: Typical Structure of a Logical Framework

| | Intervention logic | Objectively verifiable indicators of achievement | Sources and means of verification | Assumptions |
| --- | --- | --- | --- | --- |
| Overall objectives | What are the overall broader objectives to which the action will contribute? | What are the key indicators related to the overall objectives? | What are the sources of information for these indicators? | |
| Specific objective | What specific objective is the action intended to achieve to contribute to the overall objectives? | Which indicators clearly show that the objective of the action has been achieved? | What are the sources of information that exist or can be collected? What are the methods required to get this information? | Which factors and conditions outside the Beneficiary's responsibility are necessary to achieve that objective? (external conditions) Which risks should be taken into consideration? |
| Expected results | The results are the outputs envisaged to achieve the specific objective. What are the expected results? (enumerate them) | What are the indicators to measure whether and to what extent the action achieves the expected results? | What are the sources of information for these indicators? | What external conditions must be met to obtain the expected results on schedule? |
| Activities | What are the key activities to be carried out and in what sequence in order to produce the expected results? (group the activities by result) | Means: What are the means required to implement these activities, e.g. personnel, equipment, training, studies, supplies, operational facilities, etc. | What are the sources of information about action progress? Costs: What are the action costs? How are they classified? (breakdown in the Budget for the Action) | What pre-conditions are required before the action starts? What conditions outside the Beneficiary's direct control have to be met for the implementation of the planned activities? |

Source: The European Commission
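
The four-by-four layout of the logical framework can also be held as a small data structure, which makes it easy to check that no level has been left out. The sketch below is only an illustration: the names `LogframeRow`, `LEVELS` and `build_logframe`, and the solar water-pumping entries, are hypothetical and are not part of the European Commission template.

```python
from dataclasses import dataclass


@dataclass
class LogframeRow:
    """One row of the logical framework matrix (its four columns)."""
    intervention_logic: str
    indicators: str
    means_of_verification: str
    assumptions: str = ""  # left empty where no assumption is stated


# The four levels, ordered from the broadest to the most concrete.
LEVELS = ("Overall objectives", "Specific objective", "Expected results", "Activities")


def build_logframe(rows):
    """Return the rows keyed by level, in the standard order.

    Raises ValueError if any of the four levels is missing.
    """
    missing = [level for level in LEVELS if level not in rows]
    if missing:
        raise ValueError(f"missing logframe levels: {missing}")
    return {level: rows[level] for level in LEVELS}


# Hypothetical entries for an imaginary solar water-pumping study.
example = build_logframe({
    "Overall objectives": LogframeRow(
        "Improve rural water access", "Households served", "District records"),
    "Specific objective": LogframeRow(
        "Demonstrate a solar pump", "Litres pumped per day", "Flow meter logs",
        "Community grants site access"),
    "Expected results": LogframeRow(
        "Pump installed and tested", "Test report completed", "Project files",
        "Equipment arrives on schedule"),
    "Activities": LogframeRow(
        "Procure, install, monitor", "Means: staff, pump, PV array", "Budget reports",
        "Funding released before start"),
})

for level, row in example.items():
    print(f"{level}: {row.intervention_logic}")
```

Keeping the assumptions column optional mirrors Table 1.1, where the overall objectives row carries no assumption.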



1.2.4 Detailed Budget


A detailed budget is an itemized list accounting for every expense required to complete the
project. Itemized budgets are essential even if the granting agency does not require the
submission of a detailed budget (Ingersoll & Eberhard, 1999). It is important to note that the
researcher could easily overestimate or underestimate the cost of completing the study if
serious consideration is not given to all potential expenses. The items included in a
research proposal budget can be divided roughly into the following categories:
personnel, consultation, subcontracts, equipment and supplies, travel, facilities,
administration costs and miscellaneous costs. Table 1.2 shows the typical structure of a
detailed research budget.

Table 1.2: Example of a Research Budget

| Cost item | Unit | # of units | Unit rate | Cost |
| --- | --- | --- | --- | --- |
| 1. Human Resources | | | | |
| 1.1 Salaries (gross salaries including social security charges and other related costs, local staff) | Per month | | | |
| 1.2 Salaries (gross salaries including social security charges and other related costs, expat/int. staff) | | | | |
| 1.3 Per diems for missions/travel | | | | |
| Subtotal Human Resources | | | | |
| 2. Travel | | | | |
| 2.1 International travel | Per flight | | | |
| 2.2 Local transportation | Per month | | | |
| Subtotal Travel | | | | |
| 3. Equipment and supplies | | | | |
| 3.1 Purchase or rent of vehicles | Per vehicle | | | |
| 3.2 Furniture, computer equipment | | | | |
| 3.3 Machines, tools | | | | |
| 3.4 Spare parts/equipment for machines, tools | | | | |
| 3.5 Other (please specify) | | | | |
| Subtotal Equipment and supplies | | | | |
| 4. Local office | | | | |
| 4.1 Vehicle costs | Per month | | | |
| 4.2 Office rent | Per month | | | |
| 4.3 Consumables - office supplies | Per month | | | |
| 4.4 Other services (tel/fax, electricity/heating, maintenance) | Per month | | | |
| Subtotal Local office | | | | |
| 5. Other costs, services | | | | |
| 5.1 Publications | | | | |
| 5.2 Studies, research | | | | |
| 5.3 Expenditure verification | | | | |
| 5.4 Evaluation costs | | | | |
| 5.5 Translation, interpreters | | | | |
| 5.6 Financial services (bank guarantee costs etc.) | | | | |
| 5.7 Costs of conferences/seminars | | | | |
| 5.8 Visibility actions | | | | |
| Subtotal Other costs, services | | | | |
| 6. Other | | | | |
| Subtotal Other | | | | |
| 7. Subtotal direct eligible costs of the Action (1-6) (excluding taxes) | | | | |
| 8. Provision for contingency reserve (maximum 5% of 7, subtotal of direct eligible costs of the Action) (excluding taxes) | | | | |
| 9. Total direct eligible costs of the Action (7 + 8) (excluding taxes) | | | | |
| 10. Administrative costs (maximum 7% of 9, total direct eligible costs of the Action) (excluding taxes) | | | | |
| 11. Total eligible costs (9 + 10) (excluding taxes) | | | | |
| 12. Taxes | | | | |
| 13. Total eligible/accepted costs of the Action (11 + 12) | | | | |
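
The arithmetic behind lines 7 to 13 of Table 1.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `budget_totals`, the example subtotals and the zero tax figure are hypothetical, and the 5% and 7% ceilings are simply the maxima quoted in the table.

```python
def budget_totals(subtotals, contingency_rate=0.05, admin_rate=0.07, taxes=0.0):
    """Roll category subtotals (lines 1-6) up to the total cost (line 13)."""
    if not 0.0 <= contingency_rate <= 0.05:
        raise ValueError("contingency reserve may not exceed 5% of line 7")
    if not 0.0 <= admin_rate <= 0.07:
        raise ValueError("administrative costs may not exceed 7% of line 9")

    direct_subtotal = sum(subtotals.values())          # line 7 = lines 1 to 6
    contingency = contingency_rate * direct_subtotal   # line 8
    total_direct = direct_subtotal + contingency       # line 9 = 7 + 8
    admin = admin_rate * total_direct                  # line 10
    total_eligible = total_direct + admin              # line 11 = 9 + 10
    grand_total = total_eligible + taxes               # line 13 = 11 + 12
    return {
        "direct_subtotal": direct_subtotal,
        "contingency": contingency,
        "total_direct": total_direct,
        "admin": admin,
        "total_eligible": total_eligible,
        "grand_total": grand_total,
    }


# Hypothetical subtotals in an arbitrary currency.
example_budget = budget_totals({
    "Human Resources": 12000.0,
    "Travel": 3000.0,
    "Equipment and supplies": 4000.0,
    "Local office": 1000.0,
    "Other costs, services": 0.0,
    "Other": 0.0,
})

for name, amount in example_budget.items():
    print(f"{name}: {amount:.2f}")
```

Note that the administrative percentage is applied to line 9, which already includes the contingency reserve, so the order of the steps matters.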

Self Assessment 1.2


1. What is the difference between a concept note and a proposal?
2. Explain the concept of a logical framework and outline its importance.

SESSION 1.3: THESIS SYNOPSES


1.3.1 Introduction to Thesis Synopses
A thesis synopsis is an academic research proposal which should establish the area of the
research project, define clearly the central research question, and outline the methods to be
employed for the research. It should be developed in consultation with members of staff
(such as a proposed supervisor or the school postgraduate coordinator). It is important to
note that the initial ideas for the research could be refined during the course of the study.

1.3.2 Structure of Thesis Synopses


1.3.2.1 Title
The title must be concise and should give the reader an overview of what to expect in the
main document. It should be on the first page and must be the same as the title of the
thesis. It should be no longer than 20 words and should be free of unnecessary
punctuation and repeated keywords.

1.3.2.2 Introduction/Background
Outline briefly the relevance of the research work to be presented in the thesis in this
section. The introduction should be precise and include only relevant background
material in that particular field of study. It is important to provide information on past
works, by other researchers, by way of giving appropriate references. Maximum one
page, preferably half a page is allotted to this section.

1.3.2.3 Justification/Motivation
This section develops further the introductory/background materials provided in the
introduction, adding some of the major achievements made in the chosen area of
research. It should clearly indicate the existing challenges and why further research is
required to address those challenges. It is necessary to stress the importance of
the research problem identified as well as the technical challenges one has to address to
solve the problem, so as to emphasize the quality of the research work. Maximum one
page, preferably half a page, is allotted to this section.

1.3.2.4 Objectives and Scope



State clearly the main as well as the specific objectives for the research and define the
conceptual, analytical, experimental and/or methodological boundaries within which the
exercise should be carried out.

1.3.2.5 Methods
It is important to outline how you will approach your research topic. One should
demonstrate, in this section, that the chosen method or approach will serve to advance the
thesis. If you need to gather data, describe how you will go about this. This might involve
archival research, interviews with stakeholders, or various forms of fieldwork. There are
many established research methodologies. If your approach is experimental or
comparative, outline how this approach will yield results.

1.3.2.6 Work plan / Project Timelines


A project plan outlines in specific detail how a project will be conducted, who will work
on which part, and when and in what order each part will be accomplished. Develop this
section with some care, since it will provide you with a means of measuring your progress in
relation to your allotted time. This section should detail the timing of specific activities to
be implemented towards the achievement of the specific objectives within a reasonable
time frame.

1.3.2.7 Budget and Available Resources


It is important to present the full budget as well as the various resources available for the
research work in this section. This section should indicate any bibliographic, laboratory,
computing or other physical resources required to execute the study and a budget for
projected expenditures, including stipend/allowances where needed.

1.3.2.8 References
List the references in the same order as they are referred to in the synopsis, and make sure
all references listed here are properly cited in the text. It is best to get into the habit of
using a standard referencing system (preferably in conformity with the Harvard System)
so that material can be transferred into your thesis. Do not cite from memory without
referencing.

1.3.2.10 Signature(s)

It is very important to include signatures attesting to the fact that your proposed
Supervisor(s) is (are) in agreement with your proposed study as elaborated in the synopsis.

Self Assessment 1.3

1. How different is a thesis synopsis from a research proposal?

Learning Track Activities

Unit Summary

Concept notes, research proposals and thesis synopses are the first things that come to mind when
one thinks of a research work. These documents give various levels of information about the
research work and are mostly intended for different stakeholders. This chapter introduces
students to the preparation of concept notes, which are mostly directed towards donor/funding
agencies; research proposals, which are the full proposals stating the need for the research as
well as the expected results; and thesis synopses, which are basically research proposals written
specifically for academic purposes. It is intended that at the end of this unit, the student should
be able to write a concept note capable of securing funding for a research project, specifically
the master's research project, and also prepare a research proposal as well as a thesis synopsis
for the master's thesis.

Key terms/ New Words in Unit

1. Thesis
2. Synopsis
3. Proposal
4. Logical framework

Unit Assignments 1
1. Prepare a zero-order draft of your thesis synopsis in Power-Point format for
presentation to the class.
2. The presentation should last no more than 10 minutes to be followed by
another 10 minutes of questions and answers. This assignment will fetch 5
marks.
3. Following the presentation you will be required to do a first-draft of your
thesis synopsis (word-processed) for submission within one week. The draft
synopsis will also fetch 5 marks.


Unit 2
ENGINEERING RESEARCH DESIGN AND DATA ANALYSIS

Introduction
This Unit introduces the student to concepts and methods in engineering research. The first
section (2.1) presents various contexts in engineering practice which necessitate research and
classifies experiments that may be undertaken as part of the research. Procedures for the design
of experiments are also presented.
Section 2.2 is on experimental errors, and catalogues various sources from which errors can be
introduced into our experimental work. Students are also presented with tools for the analysis of
such errors and how they are propagated as measurements are repeated and computations are
done.
Section 2.3 looks at probability distributions and standard errors. In this section the student is
introduced to probability density functions and their common features. The normal distribution
(the most widely used) is then discussed along with the concept of standard scores and the
procedures for their application. The Poisson and Binomial probability distributions are
also briefly presented to conclude the section.
Section 2.4 is on standard errors and considers Errors of the Mean, the Central Limit Theorem
and the t-Distribution.
The Unit concludes with Section 2.5, on examples in normal distributions and their application to
engineering problems.

Learning Objectives
After reading this unit you should be able to:
1. Clearly understand the importance of research in engineering;
2. Establish a methodology for engineering experiments;
3. Identify sources of error in experiments and be able to minimize or
eliminate them;
4. Report inherent errors in experimental measurements;
5. Analyze engineering data using the normal probability curve.

Unit content
SESSION 2.1: INTRODUCTION TO ENGINEERING RESEARCH
2.1.1 Motivation for Research in Engineering and some basic concepts
2.1.2 Classification of Engineering Experiments
2.1.3 Research Questions in Engineering
2.1.4 Experiment Design Process
SESSION 2.2: EXPERIMENTAL ERROR PROPAGATION AND ANALYSIS
2.2.1 Sources of Error in Experimental Work
2.2.2 Estimating Uncertainties

SESSION 2.3: PROBABILITY DISTRIBUTIONS AND STANDARD ERRORS


2.3.1 Properties of Probability Density Function
2.3.2 Mean and Variance
2.3.3 The Normal Distribution
2.3.4 Skewed Distributions
2.3.4 Standardization and Z-Scores
SESSION 2.4: STANDARD ERRORS
2.4.1 Error of the Mean
2.4.2 Central Limit Theorem
2.4.3 The t-distribution
SESSION 2.5: EXAMPLES IN NORMAL DISTRIBUTIONS APPLIED TO
ENGINEERING PROBLEMS
2.5.1 Part 1 Examples in Normal Distributions and z-Scores
2.5.2 Part 2 Applying Normal Distribution to Engineering Problems

SESSION 2.1: INTRODUCTION TO ENGINEERING RESEARCH


2.1.1 Motivation for Research in Engineering and some basic concepts
Research in engineering is necessitated by factors which include either an advantage that could
be realized by improving on an existing technology (e.g. an existing drilling machine) or the
need to address a problem.
More formally, engineering research may be described as a systematic, rigorous approach to
engineering problem-solving that applies principles and techniques to collect data and to ensure
the generation of valid, defensible and supportable engineering conclusions. This is usually
carried out under the constraint of a minimal expenditure of engineering runs, time, and money.
To guarantee the integrity of the research process and to obtain high quality results and usable
conclusions, a number of practices are recommended below:
Following the standards of the scientific method
Purpose clearly defined
Research process detailed (for replicability by others)
Research design thoroughly planned
High ethical standards applied
Limitations frankly revealed
Adequate analysis for decision makers' needs
Findings succinctly presented
Conclusions must reflect research objectives

(US-NIST- National Institute of Standards and Technology)


2.1.2 Classification of Engineering Experiments


As part of research in engineering, experiments may be conducted for one or more of the
following reasons:

A theoretical relationship between two or more variables is already known (or at least
suspected) and an experiment is needed to verify or quantify this relationship.
A theoretical relationship between two or more variables is not available but rather
sought through an experiment.
A new product is being developed and a test is needed to confirm that it meets the design
specifications, before committing it to production.

2.1.3 Research Questions in Engineering


The engineer is interested in assessing whether a change in a single factor has in fact resulted in a
change/improvement to the process as a whole.
The engineer is interested in "understanding" the process as a whole in the sense that he/she
wishes (after design and analysis) to have in hand a ranked list of important through unimportant
factors (most important to least important) that affect the process.
The engineer is interested in functionally modeling the process with the output being a
good-fitting (high predictive power) mathematical function, and to have good estimates (maximal
accuracy) of the coefficients in that function.
The engineer is interested in determining optimal settings of the process factors; that is, to
determine for each factor the level of the factor that optimizes the process response.

2.1.4 Experiment Design Process


In conducting experiments in engineering research, the following procedure is prescribed to
assist the researcher obtain valid and defensible conclusions:
1. Scientific/Engineering Concept
2. Questions Posed
3. Equipment/Materials
4. Design of Procedure
5. Analysis of Results
6. Conclusions

The procedure prescribed above may be expanded further into a flow process for the design of
experiments as presented below:

Process Flow for Design of Experiments


1. Define the goals and objectives of the experiment. While the goal may be general, the
objectives need to be more specific and measurable, directly or indirectly;
2. Research any relevant theory and previously published data from similar experiments.
Performing computer simulations may also be part of this research, assuming that
appropriate software is available. The purpose of this step is to have an idea about what
to expect from the experiment;
3. Select the dependent and independent variable(s) to be measured;
4. Select appropriate methods for measuring these variables;
5. Choose appropriate equipment and instrumentation;
6. Select the proper range of the independent variable(s);
7. Determine an appropriate number of data points needed for each type of measurement;
8. Data analysis and reporting - qualitative analysis and quantitative analysis.
Additional Skills
In addition to the steps outlined above, the researcher must be careful to:
1. Familiarize himself/herself with the equipment to be used;
2. Ensure that instruments are properly calibrated;
3. Follow the proper procedure to collect the data and/or measure the performance of the
product, e.g. reading from the meniscus in volume measurements.
Analyzing and interpreting data constitutes an important component of research, and the
researcher should be able to:
1. Carry out the necessary calculations;
2. Perform an error and uncertainty analysis;
3. Tabulate and plot the results using appropriate choice of variables and software
(such as STATA, SPSS, Microsoft Excel, etc)
4. Make observations and draw conclusions regarding the variation of the
parameters involved;
5. Compare results with predictions from theory or design calculations and attempt
an explanation of any discrepancies observed.
EXAMPLE
A student was tasked to investigate and compare the power output of a solar panel with a fixed
orientation to that of a solar panel whose orientation tracked the sun. He also tried to verify that
the power output of a photovoltaic cell was a function of temperature.
METHODOLOGY
1. Define goals and objectives:
The goals and objectives for the experiment were to verify that:
A logarithmic relationship exists between angle of incidence of sunlight on a solar panel
and power output;
A tracking system increases power output by 20%; and
The power output of a solar cell is a function of temperature.
2. Research relevant theory and previously published data:
The student investigated various sources of information in designing his experiment:
Internet resources, textbooks, interview with experts in solar systems, etc
3. Select the independent / dependent variables:
The key variables were identified to be:
Angle of orientation of solar panel (independent), and
Output power of solar panel (dependent)
4. Select appropriate methods:
The student chose a direct method for measuring the angle of the solar panel and measured
voltage and current to determine power output of the solar panel.
5. Choose equipment and instrumentation:
The student used a camera tripod, protractor, and plumb bob to orient and determine the angle of
the solar panel for the fixed panel measurements, and a sundial rod to orient the panel normal to
the sun's rays for the tracking measurements; a digital multimeter to measure current and
voltage; and a thermometer to measure the temperature of the solar panel.
6. Select the range of the independent variable:
The tripod allowed a 55-degree range of motion and this set the range for the angle of incidence.
The tracking measurements took place from 6:45 am to 6:00 pm.
For investigating the effect of temperature, the student was limited by the available resources to
the temperatures obtainable under ambient conditions and by cooling the solar panel using ice cubes.
7. Determine the appropriate number of data points:
To investigate the logarithmic relationship between angle of incidence and power output, the
student chose 5-degree increments, which resulted in 12 data points.
For the tracking measurements, the student reoriented the panel to be normal to the sun's rays
using the following schedule:
15 min intervals, 6:45 am to 10:00 am
30 min intervals, 10:30 am to 4:00 pm

Note: A sundial is a device that determines the time of day by the position of the Sun.
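
The measurement plan in steps 4 to 7 can be laid out in a few lines of Python before going into the field. The sketch below is illustrative only: the helper names are hypothetical, the angle settings are assumed to start at 0 degrees (the text gives only the 55-degree span and the 5-degree step), and the voltage and current passed to `power_watts` are made-up values.

```python
from datetime import datetime, timedelta


def angle_settings(span_deg=55, step_deg=5):
    """Panel angles from 0 up to span_deg inclusive (start at 0 is assumed)."""
    return list(range(0, span_deg + 1, step_deg))


def time_points(start, end, step_minutes):
    """Clock times from start to end inclusive, as 24-hour HH:MM strings."""
    fmt = "%H:%M"
    current = datetime.strptime(start, fmt)
    stop = datetime.strptime(end, fmt)
    points = []
    while current <= stop:
        points.append(current.strftime(fmt))
        current += timedelta(minutes=step_minutes)
    return points


def power_watts(voltage_v, current_a):
    """Electrical power from the multimeter readings, P = V * I."""
    return voltage_v * current_a


angles = angle_settings()                      # fixed-panel angle settings
morning = time_points("06:45", "10:00", 15)    # 15-minute tracking readings
midday = time_points("10:30", "16:00", 30)     # 30-minute tracking readings

print(len(angles), "angle settings:", angles)
print(len(morning), "morning readings,", len(midday), "midday readings")
print("Example power:", power_watts(17.0, 2.0), "W")  # hypothetical readings
```

A 55-degree span in 5-degree steps has 11 intervals and therefore 12 settings, which agrees with the 12 data points quoted in step 7.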

Figure 2.1: Power output vs. insolation angle for polycrystalline silicon solar panel


Figure 2.2: Power output for fixed orientation and tracking polycrystalline silicon solar panel

Self Assessment 2.1

It is claimed that a 10% blend of biodiesel with conventional diesel improves the emissions characteristics
of engines. Design an experiment to investigate the veracity of this claim.

SESSION 2.2: EXPERIMENTAL ERROR PROPAGATION AND ANALYSIS


2.2.1 Sources of Error in Experimental Work
In conducting experiments, errors may arise from a number of sources including:

Blunders (mistakes) - e.g. dropping a solid on the balance pan;
Human error (different from blunders) - borders more on inexperience, e.g. not
reading from the meniscus of a volumetric cylinder;
Instrumental limitations - inherent errors and limitations in the instruments used
(discussed later in this section);
Errors due to external influences - e.g. impurity in the chemicals used. This could
be minimized with careful design of the experiment;
Sampling error - errors arising out of samples that do not adequately represent the
population.
o Example 1 - in measuring solar radiation, data taken only at peak sunshine hours
could be misleading (unrepresentative).
o Example 2 - in measuring the pollution level in a river, different pollutants
dominate depending on the time of day; if this is not taken into account, samples
taken for analysis will be unrepresentative of the reality.
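
The solar radiation example of sampling error can be made concrete with a short calculation. The sketch below uses a purely hypothetical irradiance model (a half sine wave between 06:00 and 18:00 with a 1000 W/m2 peak), not measured data, and compares the mean of hourly samples taken over the whole day with the mean of samples taken only around noon.

```python
import math


def irradiance(hour):
    """Hypothetical clear-sky irradiance in W/m2 (half sine, 06:00 to 18:00)."""
    if hour < 6 or hour > 18:
        return 0.0
    return 1000.0 * math.sin(math.pi * (hour - 6) / 12)


def mean_irradiance(hours):
    """Average of the model irradiance over the sampled hours."""
    hours = list(hours)
    return sum(irradiance(h) for h in hours) / len(hours)


daylight_mean = mean_irradiance(range(6, 19))   # hourly samples, 06:00 to 18:00
peak_mean = mean_irradiance(range(11, 14))      # samples at 11:00, 12:00, 13:00

print(f"Mean over daylight hours: {daylight_mean:.0f} W/m2")
print(f"Mean over peak hours only: {peak_mean:.0f} W/m2")
print(f"Overestimate factor: {peak_mean / daylight_mean:.2f}")
```

Under this assumed model the peak-hour sample overstates the daylight average by roughly two thirds, which is the kind of bias a representative sampling schedule is meant to avoid.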

2.2.1.1 Instrumental Errors - A Closer Look


Random errors occur due to inherent limitations of the measuring instrument used. The smallest
division that is marked on a measuring instrument is referred to as the least count. Thus a meter
rule will have a least count of 1.0 mm; a digital stop watch might have a least count of 0.01 sec,
etc. The precision is the degree to which a measuring device can be read, and it is always equal
to or smaller than the least count.
Instrument Limit of Error (ILE): Good measuring tools are calibrated against national and
international standards, e.g. ISO, IEC, National Institute of Standards and Technology-(US
NIST), Ghana Standards Board, etc.
The Instrumental Limit of Error (ILE) is generally taken to be the least count or some fraction
(1/2, 1/5, 1/10) of the least count. For some devices the ILE is given as a tolerance or a
percentage.
Resistors may be specified as having a tolerance of 5%, implying that the ILE is 5% of the
resistor's value.
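
As a quick illustration of the two ways of quoting the ILE, the sketch below computes it once from a least count and once from a percentage tolerance. The function names and the 100-ohm resistor are hypothetical, and the choice of one half as the fraction of the least count is just one of the options listed above.

```python
def ile_from_least_count(least_count, fraction=1.0):
    """ILE taken as the least count or a chosen fraction of it (1/2, 1/5, 1/10)."""
    return least_count * fraction


def ile_from_tolerance(nominal_value, tolerance_percent):
    """ILE for a device whose limit is quoted as a percentage tolerance."""
    return nominal_value * tolerance_percent / 100.0


# Meter rule with a least count of 1.0 mm, read to half a division.
ruler_ile_mm = ile_from_least_count(1.0, fraction=0.5)

# Hypothetical 100-ohm resistor with a tolerance of 5%.
resistor_ile_ohm = ile_from_tolerance(100.0, 5.0)

print(f"Ruler ILE: {ruler_ile_mm} mm")
print(f"Resistor ILE: {resistor_ile_ohm} ohm")
```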


Figure 2.3: Determining Instrumental Limits of Error and Least Count

2.2.2 Estimating Uncertainties


The statistical method for finding a value with its uncertainty is to repeat the measurement
several times, find the average, and find either the average deviation or the standard
deviation. The example below is presented for 4 repeated measurements of time (7.4, 8.1, 7.9
and 7.0).
Table 2.3: Determining the average, average deviation and standard deviation

Time, t (sec)    (t - <t>) (sec)    |t - <t>| (sec)    (t - <t>)^2 (sec^2)
7.4              -0.2               0.2                0.04
8.1               0.5               0.5                0.25
7.9               0.3               0.3                0.09
7.0              -0.6               0.6                0.36

Average: <t> = 7.6
Average of the deviations: <t - <t>> = 0.0
Average deviation: <|t - <t>|> = 0.4
Mean square deviation: <(t - <t>)^2> = 0.247 (the sum of squares, 0.74, divided by N - 1 = 3)
Standard deviation = 0.50

Measurements are then reported with the uncertainty as:

Measurement = Best Estimate ± Uncertainty

The average (mean) value is usually taken as the best estimate, and is determined as:

$$\langle x \rangle = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N}$$

where N is the number of observations or measurements.


A way to express the variation among the measurements is to use the average deviation. This
statistic tells us on average (with 50% confidence) how much the individual measurements vary
from the mean. As indicated above, the average deviation is calculated by summing the absolute
values of the deviation of measurements from the mean, and dividing by the number of
observations.

$$\text{Average deviation, } \langle |x - \langle x \rangle| \rangle = \frac{|x_1 - \langle x \rangle| + |x_2 - \langle x \rangle| + \dots + |x_N - \langle x \rangle|}{N}$$

However, the standard deviation is the most common way to characterize the spread of a data
set. The standard deviation is always slightly greater than the average deviation, and is used
because of its association with the normal distribution that is frequently encountered in statistical
analyses.

$$\sigma = \sqrt{\frac{|x_1 - \langle x \rangle|^2 + |x_2 - \langle x \rangle|^2 + \dots + |x_N - \langle x \rangle|^2}{N - 1}}$$

In the example above (Section 2.2.2), the standard deviation of 0.5 sec implies that for the same
series of measurements, an additional measurement taken may be expected (with about 68%
confidence) to lie within ±0.5 sec of the average value of 7.6 sec.
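
The figures for the four timing measurements can be reproduced with Python's standard `statistics` module. This is a minimal check rather than a prescribed procedure; note that `statistics.stdev` divides by N - 1, which is the convention that gives the 0.50 quoted for this example.

```python
from statistics import mean, stdev

times = [7.4, 8.1, 7.9, 7.0]  # the four repeated time measurements, in seconds

t_mean = mean(times)

# Average deviation: mean of the absolute deviations from the mean.
average_deviation = mean([abs(t - t_mean) for t in times])

# Sample standard deviation (sum of squared deviations divided by N - 1).
std_dev = stdev(times)

print(f"Average = {t_mean:.1f} s")
print(f"Average deviation = {average_deviation:.1f} s")
print(f"Standard deviation = {std_dev:.2f} s")
print(f"Reported value: {t_mean:.1f} ± {std_dev:.1f} s")
```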

Fractional Uncertainty
When a reported value is determined by taking the average of a set of independent readings, the
fractional uncertainty is given by the ratio of the uncertainty divided by the average value.

$$\text{Fractional uncertainty} = \frac{\text{Uncertainty}}{\text{Average value}}$$

The fractional uncertainty is dimensionless, and is sometimes reported as a percentage.
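
As a worked illustration, take the timing measurements used earlier in this section, with an average of 7.6 sec and a standard deviation of 0.5 sec as the uncertainty (using the standard deviation here is a choice made for this illustration; the average deviation could be used instead):

$$\frac{\text{Uncertainty}}{\text{Average value}} = \frac{0.5\ \text{sec}}{7.6\ \text{sec}} \approx 0.066 \quad (\text{about } 6.6\%)$$

The units cancel, which is why the result carries no dimension.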


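Propagation of Errors - Basic Rules

General theory:
Suppose we want to determine a quantity f which depends on variables x, y, ... etc.:

$$f = f(x, y, \dots)$$

A small change in f is then given by

$$\Delta f = \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y + \dots$$

Taking the square of the above expression, we get the law of propagation of uncertainty:

$$(\Delta f)^2 = \left(\frac{\partial f}{\partial x}\right)^2 (\Delta x)^2 + \left(\frac{\partial f}{\partial y}\right)^2 (\Delta y)^2 + 2\left(\frac{\partial f}{\partial x}\right)\left(\frac{\partial f}{\partial y}\right)\Delta x\,\Delta y$$

If the measurements of x and y are uncorrelated, then the cross term averages to zero,
$\langle \Delta x\,\Delta y \rangle = 0$, and the error in the function f may be approximated as:

$$(\Delta f)^2 = \left(\frac{\partial f}{\partial x}\right)^2 (\Delta x)^2 + \left(\frac{\partial f}{\partial y}\right)^2 (\Delta y)^2$$

where $\Delta x$ and $\Delta y$ are the uncertainties in x and y, and $\Delta f$ is the resulting uncertainty in f.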
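
The uncorrelated form of the propagation law can be evaluated numerically when the partial derivatives are awkward to write out by hand. The sketch below is one possible approach, not a standard routine: the function name `propagate` is hypothetical, the partial derivatives are estimated by central differences, and the test values (x = 2.0 with uncertainty 0.1, y = 3.0 with uncertainty 0.2) are made up.

```python
import math


def propagate(f, values, uncertainties, rel_step=1e-6):
    """Uncertainty in f(*values) for uncorrelated inputs.

    Each partial derivative is estimated with a central difference, then the
    terms (df/dx_i * delta_x_i)^2 are summed and the square root is taken.
    """
    variance = 0.0
    for i, (value, delta) in enumerate(zip(values, uncertainties)):
        step = rel_step * max(abs(value), 1.0)
        upper = list(values)
        lower = list(values)
        upper[i] = value + step
        lower[i] = value - step
        partial = (f(*upper) - f(*lower)) / (2.0 * step)
        variance += (partial * delta) ** 2
    return math.sqrt(variance)


# Hypothetical inputs: f = x * y with x = 2.0 ± 0.1 and y = 3.0 ± 0.2.
delta_product = propagate(lambda x, y: x * y, [2.0, 3.0], [0.1, 0.2])

# Same inputs for f = x + y.
delta_sum = propagate(lambda x, y: x + y, [2.0, 3.0], [0.1, 0.2])

print(f"Uncertainty in x*y: {delta_product:.3f}")
print(f"Uncertainty in x+y: {delta_sum:.3f}")
```

For the product, the analytic result is the square root of (3.0 × 0.1)² + (2.0 × 0.2)², which is 0.5, so the numerical estimate can be checked against it.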
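Examples:
a) If $f = x + y$, then

$$\frac{\partial f}{\partial x} = 1, \qquad \frac{\partial f}{\partial y} = 1$$

$$\Delta f = \sqrt{(\Delta x)^2 + (\Delta y)^2}$$

b) If $f = xy$, then

$$\frac{\partial f}{\partial x} = y, \qquad \frac{\partial f}{\partial y} = x$$

$$\Delta f = \sqrt{y^2 (\Delta x)^2 + x^2 (\Delta y)^2}$$

Dividing by the function $f = xy$, we obtain

$$\frac{\Delta f}{f} = \sqrt{\left(\frac{\Delta x}{x}\right)^2 + \left(\frac{\Delta y}{y}\right)^2}$$

c) If $f = x/y$, then

$$\frac{\partial f}{\partial x} = \frac{1}{y}, \qquad \frac{\partial f}{\partial y} = -\frac{x}{y^2}$$

$$\Delta f = \sqrt{\left(\frac{1}{y}\right)^2 (\Delta x)^2 + \left(\frac{x}{y^2}\right)^2 (\Delta y)^2}$$

Dividing by the function $f = x/y$, we obtain

$$\frac{\Delta f}{f} = \sqrt{\left(\frac{\Delta x}{x}\right)^2 + \left(\frac{\Delta y}{y}\right)^2}$$

Therefore the fractional uncertainty in the function f is the same for both multiplication and
division. Note that, unlike the result for sums, this is always written in terms of fractional
errors for dimensional consistency.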
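By a similar process the error in a function of the form

$$f = x^m y^n$$

may be expressed as:

$$\frac{\Delta f}{f} = \sqrt{\left(\frac{m\,\Delta x}{x}\right)^2 + \left(\frac{n\,\Delta y}{y}\right)^2}$$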

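A summary of some of the basic rules is presented in the table below:

Table 2.4: Basic rules in error propagation

| Operation | Example | Error |
| --- | --- | --- |
| Addition | $S = A + B$ | $\Delta S = \sqrt{(\Delta A)^2 + (\Delta B)^2}$ |
| Subtraction | $D = A - B$ | $\Delta D = \sqrt{(\Delta A)^2 + (\Delta B)^2}$ |
| Multiplication | $P = A \times B$ | $\frac{\Delta P}{P} = \sqrt{\left(\frac{\Delta A}{A}\right)^2 + \left(\frac{\Delta B}{B}\right)^2}$ |
| Division | $Q = A / B$ | $\frac{\Delta Q}{Q} = \sqrt{\left(\frac{\Delta A}{A}\right)^2 + \left(\frac{\Delta B}{B}\right)^2}$ |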

For equations involving mixtures of multiplication, division, addition, subtraction, and powers;
the same basic rules are applied systematically to evaluate the error contained in the dependent
variable as a result of errors in the independent variables.
Example
In an experiment to determine the enthalpy of neutralization of sodium hydroxide by
hydrochloric acid, the initial temperature was (19.2 ± 0.2) °C, and the final temperature
(26.4 ± 0.2) °C. What is the temperature rise?
Solution:
$$T = (T_2 - T_1) \pm \Delta T = (26.4 - 19.2)\ ^\circ\text{C} \pm \Delta T = 7.2\ ^\circ\text{C} \pm \Delta T$$
The error $\Delta T$ is given by:
$$\Delta T = \sqrt{(\Delta T_1)^2 + (\Delta T_2)^2} = \sqrt{(0.2)^2 + (0.2)^2} = 0.28\ ^\circ\text{C}$$
$$T = (7.2 \pm 0.28)\ ^\circ\text{C}$$
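The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch; the function name `add_sub_uncertainty` is an illustrative choice, and it simply applies the addition/subtraction rule from the table of basic rules.

```python
import math


def add_sub_uncertainty(*deltas):
    """Combine absolute uncertainties for a sum or a difference."""
    return math.sqrt(sum(d ** 2 for d in deltas))


t_initial, dt_initial = 19.2, 0.2  # degrees Celsius
t_final, dt_final = 26.4, 0.2      # degrees Celsius

rise = t_final - t_initial
rise_error = add_sub_uncertainty(dt_initial, dt_final)

print(f"Temperature rise = ({rise:.1f} +/- {rise_error:.2f}) degC")
```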

Self Assessment 2.2

Calculate z and Δz for each of the following cases:



1. = . + for = (. . )m, = (. . ) m

2. = ( ) for = (. . ) m/s 2 , = (. . ) m, =
( ) m/s
3. = sin for = (. . ) m/s, = (. . ) rad.

SESSION 2.3: EXPERIMENTAL ERROR PROPAGATION AND ANALYSIS


2.3.1 Probability Distributions and Standard Errors
A probability distribution is a function that describes the probability of a random variable taking
certain values. In more precise definitions, a distinction is made between discrete and continuous
random variables.
A random variable is called continuous if it can assume all possible values in the possible
range of the random variable. Suppose the temperature in a certain city in the month of
June over many past years has always been between 35 °C and 45 °C. The
temperature can take any value in the range 35 °C to 45 °C.
For a discrete random variable the values of the variable are exact, like 0, 1 or 2 good bulbs. For a
continuous random variable the value is never an exact point; it always lies in an interval, although
the interval may be very small.
(emathzone, 2012)
The probability function of a continuous random variable is called the probability density
function.
It is denoted by $f(x)$, where $f(x)\,\Delta x$ is approximately the probability that the random variable
X takes a value between $x$ and $x + \Delta x$, and $\Delta x$ is a very small change in X.

A random variable is a numerical variable whose measured value can change from one replicate experiment to
another.


Figure 2.4: Plot of f(x) vs X


Credit: (emathzone, 2012)
The probability that X is between a and b is determined as the integral of $f(x)$ from a to b, and
is expressed mathematically as:
$$P(a < X < b) = \int_a^b f(x)\,dx$$

2.3.2 Properties of Probability Density Function


The probability density function $f(x)$ has the following properties.

1. The function is non-negative for all values of x: $f(x) \ge 0$.

2. The total area under the curve is one: $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

3. $P(X = c) = \int_c^c f(x)\,dx = 0$, where c is a constant. A probability of zero is assigned to each
point of the random variable. This means that we must calculate a probability for a
continuous random variable over an interval and not for any particular point. The
probability can be interpreted as the area under the graph over the interval from a to b.

4. If X is a continuous random variable, then for any a and b,
$$P(a \le X \le b) = P(a \le X < b) = P(a < X \le b) = P(a < X < b)$$

The probability of a continuous random variable assuming a specific value is zero. This does not necessarily mean
that a particular value cannot occur. The interpretation is that the point (event) is one of an infinite number of
possible outcomes.
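These properties can be illustrated numerically. The sketch below assumes, purely for illustration, that the June temperature from the earlier example is uniformly distributed between 35 and 45 degrees; the density `uniform_density` and the midpoint-rule helper `integrate` are hypothetical names, not part of the course text.

```python
def integrate(func, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of func from a to b."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


def uniform_density(x):
    """Assumed density: temperature equally likely anywhere in [35, 45]."""
    return 0.1 if 35.0 <= x <= 45.0 else 0.0


total_area = integrate(uniform_density, 35.0, 45.0)  # property 2
p_interval = integrate(uniform_density, 38.0, 42.0)  # P(38 < X < 42)
p_point = integrate(uniform_density, 40.0, 40.0)     # property 3: P(X = 40)

print(total_area, p_interval, p_point)
```

The total area comes out as 1, the interval probability as 0.4 and the probability at a single point as 0, which matches properties 2 and 3.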


2.3.3 Mean and Variance


Important parameters in presenting probability distributions include the mean (arguably the most
popular statistical parameter), the variance and standard deviation. These parameters could be
based on the population (N) or on a sample of the population (n), see figure 2.5 below:

Figure 2.5: samples are drawn from populations

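The mean may be determined as below:
$$\mu = \frac{\sum X}{N} \quad \text{(population mean)}$$
or
$$\bar{X} = \frac{\sum X}{n} \quad \text{(sample mean)}$$
Where:
$\sum X$ = the sum of all the values,
$N$, $n$ = the number of values in the population and in the sample, respectively.
The variance is then determined as:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
Where:
$\sigma^2$ = the population variance
$s^2$ = the sample variance
The standard deviation is then $\sqrt{s^2}$ or $\sqrt{\sigma^2}$.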
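The difference between the population and sample formulas can be seen with Python's standard `statistics` module. The eight scores below are hypothetical values chosen only to give round numbers.

```python
import statistics

# Hypothetical set of scores, used only for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)

# Treating the scores as a whole population (divide by N)
population_variance = statistics.pvariance(scores)
population_sd = statistics.pstdev(scores)

# Treating the scores as a sample from a larger population (divide by n - 1)
sample_variance = statistics.variance(scores)
sample_sd = statistics.stdev(scores)

print(mean, population_variance, population_sd)
print(sample_variance, sample_sd)
```

The sample variance is slightly larger than the population variance because of the n - 1 divisor.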


2.3.4 The Normal Distribution (also called the bell-curve)


The normal distribution is the most widely used model for the distribution of random variables
and helps in determining the probability of something occurring in a given sample just due to
chance. It is also called the bell-curve because of its resemblance to the shape of a bell (see
below, Fig 2.6).

Figure 2.6: the normal distribution is bell-shaped.


The normal distribution has three fundamental characteristics:

Symmetrical - upper half and the lower half of the distribution are mirror images of each
other.
Unimodal - the mean, median, and mode are all in the same place, in the center of the
distribution (i.e., the top of the bell curve); and the normal distribution is highest in the
middle.
Asymptotic - the upper and lower tails of the distribution never actually touch the
baseline, also known as the x-axis.

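In a normal distribution, a random variable X has a probability density function given by:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
Where;
$-\infty < \mu < \infty$, and $\sigma > 0$
The notation $N(\mu, \sigma^2)$ is often used to denote a normal distribution with mean $\mu$ and variance $\sigma^2$.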
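The density function translates directly into code. The sketch below is illustrative only: `normal_pdf` is a hypothetical helper name, and the result is compared with `statistics.NormalDist` from the Python standard library as a cross-check.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2.0 * math.pi))


peak = normal_pdf(0.0)            # height of the standard normal at its mean
left = normal_pdf(-1.5)           # symmetric points have equal density
right = normal_pdf(1.5)
library_value = NormalDist(mu=517, sigma=100).pdf(600)
own_value = normal_pdf(600, mu=517, sigma=100)

print(peak, left, right, own_value, library_value)
```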

2.3.5 Skewed Distributions


When a sample of scores is not normally distributed, two terms, skew and kurtosis, are used to
characterise it.
If there are a few scores creating an elongated tail at the higher end of the distribution, it is said
to be positively skewed (see Fig 2.7). If the tail is pulled out toward the lower end of the
distribution, the shape is called negatively skewed (see Fig 2.8).

Figure 2.7: Positively skewed distribution


Figure 2.8: Negatively skewed distribution


Kurtosis refers to the shape of the distribution in terms of height, or flatness. When a distribution
has a peak that is higher than that found in a normal, bell-shaped distribution, it is called
leptokurtic. When a distribution is flatter than a normal distribution, it is called platykurtic.

2.3.6 Standardization and Z-Scores


Using the mean and the standard deviation, researchers are able to generate a standard score,
also called a z score to help them understand where an individual score falls in relation to other
scores in the distribution.
A standard normal random variable is defined as a normal random variable with $\mu = 0$ and $\sigma^2 = 1$. It is
normally denoted as Z.

Standardization is simply the process of converting each raw score in a distribution to a z score.
Through standardization, researchers are better able to compare individual scores in the
distributions of different variables.
A z score indicates how far above or below the mean a given score in the distribution is, in
standard deviation units.
The z-score is computed as indicated below, in terms of the mean and standard deviation:
$$z = \frac{X - \mu}{\sigma}$$

The 68-95-99.7% Rule


All normal probability density curves satisfy the following properties (see figure 2.9):

68% of the observations fall within 1 standard deviation of the mean,
i.e. $P(\mu - \sigma < X < \mu + \sigma) = 0.6827$

95% of the observations fall within 2 standard deviations of the mean,
i.e. $P(\mu - 2\sigma < X < \mu + 2\sigma) = 0.9545$

99.7% of the observations fall within 3 standard deviations of the mean,
i.e. $P(\mu - 3\sigma < X < \mu + 3\sigma) = 0.9973$
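The three probabilities can be reproduced from the cumulative standard normal distribution. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper name `prob_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def prob_within(k):
    """Probability that a normal variable lies within k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```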

Figure 2.9: Interpreting the z-score


Interpreting z-Scores

z scores tell researchers instantly how large or small an individual score is relative to
other scores in the distribution.
For example, if a student got a z score of -1.5 in an exam, it can be inferred that the student
scored 1.5 standard deviations below the mean in that exam.
If another student had a z score of 0.29, we would know that the student scored 0.29 standard
deviation units above the mean in the exam.

Self Assessment 2.3

Quick Questions:
What does a z-score of 1.0 mean?
What DOES it say?

What does it NOT say?

Suppose that the average score of a student in an automobile engineering class is 517, with a
standard deviation of 100, and the distribution of scores is normal. What is the score that marks
the 90th percentile?
Remarks
Remember that the 90th percentile is 40 percentile points above the mean in a normal
distribution, so we are looking for the z score at which 40% of the distribution falls between the
mean and this z-score.
OR
The z score at which 10% of the distribution falls above, because the 90th percentile score
divides the distribution into sections with 90% of the score falling below this point and 10%
falling above
1. From traditional statistics tables (for the cumulative standard normal distribution), the z score
that corresponds with the 90th percentile (probability of 0.9) is 1.28.
So z = 1.28
These tables are developed using the probability function of a normal distribution.
2. Convert this z score back into the original unit of measurement:
$$X = \mu + (z)(\sigma)$$
$$X = 517 + (1.28)(100) = 517 + 128 = 645$$
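The table lookup can be cross-checked in code. The sketch below uses `NormalDist.inv_cdf` from the Python standard library in place of printed tables, so the z value carries more decimal places than the 1.28 read from the table; the variable names are illustrative.

```python
from statistics import NormalDist

mean_score = 517.0
sd_score = 100.0

# z score below which 90% of a standard normal distribution falls
z_90 = NormalDist().inv_cdf(0.90)

# Convert the z score back into the original unit of measurement
score_90 = mean_score + z_90 * sd_score

print(round(z_90, 2), round(score_90))
```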
2.3.7 Other Probability Distribution Functions
In addition to the normal distribution, other probability distributions include:

The binomial distribution, which is used for reporting the outcomes of random
experiments consisting of n repeated trials such that
o The trials are independent,
o Each trial results in only two possible outcomes, labeled as success and failure,
and
o The probability of a success on each trial, denoted as p, remains constant.

The random variable X that equals the number of trials that result in a success has a binomial
distribution with parameters p and n, where $0 < p < 1$ and $n = 1, 2, 3, \ldots$
The probability function of X is:
$$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$$
The mean and variance are determined as:
$$\mu = E(X) = np \quad \text{and} \quad \sigma^2 = V(X) = np(1 - p)$$
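As a quick illustration, the binomial probability function can be evaluated directly. The sketch assumes a hypothetical experiment of n = 10 trials with p = 0.5, and the helper name `binomial_pmf` is not from the course text.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


n_trials, p_success = 10, 0.5  # hypothetical experiment

probabilities = [binomial_pmf(x, n_trials, p_success) for x in range(n_trials + 1)]
mean = sum(x * fx for x, fx in enumerate(probabilities))
variance = sum((x - mean) ** 2 * fx for x, fx in enumerate(probabilities))

print(probabilities[5], mean, variance)
```

The mean and variance computed from the probabilities agree with np and np(1 - p).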
The Poisson distribution:
The Poisson distribution is used to model the number of events over an interval, such as the
number of e-mails that arrive in an hour, assuming events occur at random throughout the
interval. If the interval can be partitioned into subintervals of small enough length such that:

the probability of more than one count in a subinterval is zero,

the probability of one count in a subinterval is the same for all subintervals and
proportional to the length of the subinterval, and

the count in each subinterval is independent of other subintervals,

then the random experiment is called a Poisson process.

If the mean number of counts in the interval is $\lambda > 0$, the random variable X which is the
number of counts in the interval has a Poisson distribution with parameter $\lambda$, and the
probability function is:
$$f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$
The mean and variance of X are
$$E(X) = \lambda \quad \text{and} \quad V(X) = \lambda$$
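The Poisson probability function can be evaluated in the same way. The sketch assumes a hypothetical mean of 2 e-mails per hour; the helper name `poisson_pmf` and the truncation of the sum at 60 counts are illustrative choices.

```python
import math


def poisson_pmf(x, lam):
    """Probability of exactly x counts when the mean count is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


mean_count = 2.0  # hypothetical: 2 e-mails per hour on average

# Counts above 59 have negligible probability for this mean
probabilities = [poisson_pmf(x, mean_count) for x in range(60)]
mean = sum(x * fx for x, fx in enumerate(probabilities))
variance = sum((x - mean) ** 2 * fx for x, fx in enumerate(probabilities))

print(probabilities[0], mean, variance)
```

Both the mean and the variance come out equal to the parameter, as stated above.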

SESSION 2.4: STANDARD ERRORS


The standard error is the measure of how much random variation we would expect from samples
of equal size drawn from the same population.

2.4.1 Error of the Mean


When samples are drawn from a given population, say the scores by students in an examination,
the samples will be characterized by their own means (sample means). As an example if 100
students score marks ranging from 2 to 10 in an examination in which 0 is the least and 10 is the
highest; we may at random draw 10 students from the population of 100. The scores of these 10
students will yield a mean of say 5.5. If the earlier 10 students are put back into the population

and another sampling of 10 students is done, their scores may yield another mean of say 6.0. If
this process is continued, a distribution of the means of the samples will be obtained, as indicated
in figure 2.10 below.

Figure 2.10: distribution of the means of the samples


The distribution of the means of the samples drawn also possesses the characteristics of other
probability distributions, i.e. the mean and standard deviation. The mean of the sampling
distribution is called the expected value of the mean, because it is the same as the population
mean. The associated standard deviation (of the sampling distribution) is called the standard
error.
The standard errors of the mean are calculated as below:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$

Where;
$\sigma$ = the standard deviation for the population
$s$ = the sample estimate of the standard deviation (used when $\sigma$ is not known)
$n$ = the size of the sample
The standard error of the mean refers to the average difference between the expected value
(e.g., the population mean) and an individual sample mean as shown in Figure 2.11 below.

Figure 2.11: Average difference between expected value and sample mean

2.4.2 Central Limit Theorem


The Central Limit Theorem simply states that as long as you have a reasonably large sample
size (e.g., n = 30), the sampling distribution of the mean will be normally distributed, even if the
distribution of scores in your sample is not.
This theorem says that even when you have a non-normal distribution in a population, the
sampling distribution of the mean will most likely approximate a normal, bell-shaped
distribution as long as you have at least 30 cases in your sample.
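A small simulation can make the theorem and the standard error concrete. The sketch below is an illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (clearly non-normal), 5000 samples of size 30 are drawn, and the random seed is fixed so the run is repeatable. None of these choices come from the course text.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

SAMPLE_SIZE = 30
NUM_SAMPLES = 5000


def draw_sample_mean():
    """Mean of one sample from an exponential population (mean 1, sd 1)."""
    sample = [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    return sum(sample) / len(sample)


sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_standard_error = 1.0 / math.sqrt(SAMPLE_SIZE)  # sigma / sqrt(n)

print(mean_of_means, sd_of_means, expected_standard_error)
```

The mean of the sample means should sit close to the population mean of 1, and their standard deviation close to the standard error given by the formula for the standard error of the mean.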

2.4.3 The t-distribution


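The t-distribution is used for small samples drawn from a population whose standard deviation is
not known. With larger sample sizes (n>=120) the distribution is practically identical to the normal
distribution.
Whenever the population standard deviation is not known and an estimate from a sample must be
used, it is wise to use the family of t distributions.
When $\sigma$ is known:
$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}}$$
When $\sigma$ is not known:
$$t = \frac{\bar{X} - \mu}{s_{\bar{x}}}$$
Where:
$\mu$ = the population mean
$\bar{X}$ = the sample mean
$\sigma_{\bar{x}}$ = the standard error of the mean
$s_{\bar{x}}$ = the sample estimate of the standard error of the mean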
These equations help us to address the question below:
With a known population mean, what is the probability of obtaining a sample with a
particular mean?
Example:
Suppose the average American man exercises for 60 minutes a week. Suppose, further, that we have a
random sample of 144 men and that this sample exercises for an average of 65 minutes per week
with a standard deviation of 10 minutes. What is the probability of getting, by chance, a random
sample of this size with a mean of 65 if the actual population mean is 60?
$$t = \frac{65 - 60}{10 / \sqrt{144}} \approx \frac{5}{0.83} \approx 6.02$$
From t-tables, the probability of getting a t value of this size or larger by chance with a sample of
this size is less than 0.001
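The t value in this example can be recomputed without rounding the standard error, which gives 6.0 rather than 6.02. The sketch below is a rough check only: the Python standard library has no t-distribution, so the tail probability is approximated with the normal distribution, which is an assumption justified here by the large sample size (n = 144).

```python
import math
from statistics import NormalDist

population_mean = 60.0
sample_mean = 65.0
sample_sd = 10.0
sample_size = 144

standard_error = sample_sd / math.sqrt(sample_size)
t_value = (sample_mean - population_mean) / standard_error

# Normal approximation to the upper-tail probability (assumption: large n)
approx_tail_probability = 1.0 - NormalDist().cdf(t_value)

print(t_value, approx_tail_probability)
```

The approximate tail probability is far below 0.001, consistent with the conclusion drawn from the t-tables.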


Self Assessment 2.4

An article in the Journal of Heat Transfer described a new method for measuring the thermal
conductivity of Armco iron. Using a temperature of 100 °F and a power input of 550 W, the
following 10 measurements of thermal conductivity were obtained (in Btu/hr-ft-°F):
41.60, 41.48, 42.34, 41.95, 41.86,
42.18, 41.72, 42.26, 41.81, 42.04
Determine the standard error of the sample mean.

SESSION 2.5: Examples in Normal Distributions Applied to Engineering Problems


(Courtesy Montgomery, Runger, & Hubele, 2000)

2.5.1 Part 1 Examples in Normal Distributions and z-Scores


Find the probability P and represent it on a normal distribution diagram, under the following
assumptions for the normalized score Z:
(1) P(Z > 1.26)
(2) P(Z < -0.86)
(3) P(Z > -1.37)
(4) P(-1.25 < Z < 0.37)
(5) P(Z ≤ -4.6)
(6) Find z such that P(Z > z) = 0.05
(7) Find the value of z such that P(-z < Z < z) = 0.99


Q1-SOLUTION
P(Z > 1.26) = 1 - P(Z ≤ 1.26) = 1 - 0.89616 = 0.10384 or 10.384%

Q2-SOLUTION
From normal distribution tables:
P(Z < -0.86) = 0.19490

Q3-SOLUTION
Remember normal distributions are symmetrical!
P(Z > -1.37) = P(Z < 1.37) = 0.91465

Q4-SOLUTION
P(-1.25 < Z < 0.37) = P(Z < 0.37) - P(Z < -1.25)
P(Z < 0.37) = 0.64431;
P(Z < -1.25) = 0.10565
P(Z < 0.37) - P(Z < -1.25) = 0.64431 - 0.10565 = 0.53866

Q5-SOLUTION
P(Z ≤ -4.6) is not available in the tables, but using the last score of -3.99, P(Z ≤ -3.99) = 0.00003,
implying that P(Z ≤ -4.6) is negligible.

Q6-SOLUTION
Find the value of z such that P(Z > z) = 0.05
The z in the inequality above is the same as the z for which P(Z ≤ z) = 0.95.
Search through the probabilities in the tables for the value that corresponds to 0.95.
z ≈ 1.645 (0.95 lies about midway between the table entries for z = 1.64 and z = 1.65)

Q7-SOLUTION
Find the value of z such that P(-z < Z < z) = 0.99
Using the symmetry concept, the area remaining in each tail is (1 - 0.99)/2 = 0.005.
The value for z therefore corresponds to a cumulative probability of 0.995. The nearest probability
in the tables is 0.99506, when z = 2.58.
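The table lookups in Q1 to Q7 can be cross-checked with the cumulative standard normal distribution in the Python standard library. This is only a sketch for checking answers; small differences in the last decimal place relative to printed tables are expected, and the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

q1 = 1.0 - Z.cdf(1.26)             # P(Z > 1.26)
q2 = Z.cdf(-0.86)                  # P(Z < -0.86)
q3 = 1.0 - Z.cdf(-1.37)            # P(Z > -1.37), equal to P(Z < 1.37)
q4 = Z.cdf(0.37) - Z.cdf(-1.25)    # P(-1.25 < Z < 0.37)
q5 = Z.cdf(-4.6)                   # P(Z <= -4.6), negligible
q6 = Z.inv_cdf(0.95)               # z with P(Z > z) = 0.05
q7 = Z.inv_cdf(0.995)              # z with P(-z < Z < z) = 0.99

for label, value in zip(("Q1", "Q2", "Q3", "Q4", "Q5", "Q6", "Q7"),
                        (q1, q2, q3, q4, q5, q6, q7)):
    print(label, value)
```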

2.5.2 Part 2 Applying Normal Distribution to Engineering Problems


Questions
1. The compressive strength of samples of cement from a manufacturing company can be
modeled by a normal distribution with a mean of 6000 kg/cm² and a standard deviation of
100 kg/cm².
(i) What is the probability that a sample's strength is less than 6250 kg/cm²?
(ii) What is the probability that a sample's strength is between 5800 kg/cm² and 5900
kg/cm²?
(iii) What compressive strength is exceeded by 95% of the samples?

2. The fill volume of an automated filling machine used for filling cans of carbonated
beverage is normally distributed with a mean of 12.4 fluid ounces (fl oz) and a standard
deviation of 0.1 fluid ounce.
(i) What is the probability that a fill volume is less than 12 fluid ounces?
(ii) If all cans less than 12.1 or greater than 12.6 ounces are scrapped, what
proportion of cans is scrapped?
(iii) Determine the specifications that are symmetric about the mean that include 99%
of all cans.

Learning Track Activities

Unit Summary
Research is essential in the practice of engineering and is necessitated by the need
to solve existing problems, by the advantages that could be derived from improved
products and services, etc.
A clear methodology should be established in the conduct of experiments, being
conscious of the possible sources of errors including the inherent errors in
equipment used.
Data generated from experimental work can be presented by the probability
distribution curves such as the normal, binomial and Poisson distributions, and
from these important inferences could be made.


Key terms/ New Words in Unit


Least Count
Sampling error
Uncertainty
Random variable
Standard error
Expected value
Z-score
Standardization

Unit Assignments 2
Calculate z and Δz for questions 1 and 2:
1. = 3 for = (. . ) m
2. = ( + )with = (. . )m, = (.
. )m, = (. . )m, = (. . )m2 .
3. The reaction time of a driver to visual stimulus is normally distributed, with
a mean of 0.4 second and a standard deviation of 0.05 second.
a. What is the probability that a reaction requires more than 0.5 second?
b. What is the probability that a reaction requires between 0.4 and 0.5
second?
c. What is the reaction time that is exceeded 90% of the time?


Unit 3
SOCIAL SCIENCE RESEARCH DESIGN AND DATA
ANALYSIS
Introduction
It is common knowledge that research works within the social sciences draw on various
long-established traditions (viz. anthropology, psychology, sociology, etc). Fundamentally, social
science research works are concerned with people and their life contexts, and seek to answer
philosophical questions relating to the nature of knowledge and truth, values and being which
underpin human judgments and activities. One of the fundamental distinctions between social
science research and that of the natural sciences is the focus of the former on people, with all
the unpredictability of human behaviour. Researchers in the natural sciences, for example medical
researchers, are able to use probability theories to develop therapeutic drugs because bodily
systems function relatively autonomously from the mind. Social science researchers are however
unable to develop such powerful solutions to social problems since the mind enables individuals and
groups to take decisions that vary widely with different motives.

The purpose of this unit is to introduce participants to research designs and the array of research
methodologies within the social sciences. Emphasis is placed on the empirical social science
research which involves the design of data collection instruments and the collection,
management, simulation, analysis and presentation of data about people and their social contexts
by a range of methods.

Learning Objectives
After reading this unit you should be able to:
1. Identify the research approaches available to researchers in the social sciences;
2. Know the factors that affect the effectiveness of these research designs;
3. Operationalise these research approaches in ways that the weaknesses do not limit the
credibility of the research findings.


UNIT CONTENTS
SESSION 3.1: SURVEY RESEARCH
3.1.1 Introduction to Survey Research
3.1.2 Detailed Steps in Conducting Successful Surveys
SESSION 3.2 CASE STUDY RESEARCH
3.2.1 Introduction to Case Study
3.2.2 Purpose of Case Studies
3.2.3 Advantages of Case Study
3.2.4 Disadvantages of Case Study
3.2.5 Designing a Case Study
3.2.6 Categories of Case Study
3.2.7 How to Select Cases in a Case Study Research
SESSION 3.3 OTHER TYPES OF RESEARCH DESIGNS
3.3.1 Observational Research
3.3.2 Ethnographic Research (Ethnography)
3.3.3 Historical Research
3.3.4 Descriptive Research
3.3.5 Explanatory Research and Research on Causality
3.3.6 Comparative Research Design
3.3.8 Experimental Design
SESSION 3.4 RESEARCH ETHICS
3.4.1 Why Research Ethics
3.4.2 Balancing Costs and benefits in Research
3.4.3 Informed Consent
3.4.4 Competence
3.4.5 Privacy


SESSION 3.1 SURVEY RESEARCH


3.1.1 Introduction to Surveys
A survey is a systematic method of collecting data from a population of interest. Generally,
survey research tends to be quantitative in nature and aims to collect information from a sample
such that the results are representative of the population within a certain degree of error. Surveys
gather quantitative information, usually through the use of a structured and standardized
questionnaire. They are appropriate for assessing perceptions, opinions, knowledge, attitudes and
behaviors, using questions that are often close-ended.

3.1.2 Detailed Steps for Conducting a Survey Research


There are about 12 steps in conducting a survey research. These steps are briefly described
below.

3.1.2.1 Clarify the Purpose of the Research


Clarify the purpose of the research. This step will spell out the reasons for conducting the survey
and identify who will be involved in the design and data collection exercises. It is also important
to clarify that the survey is the best approach for the collection of the information required. In
seeking to clarify the research, the following questions may be relevant:

Why conduct a survey?

Who are the stakeholders (primary and secondary stakeholders)? That is who is interested
and/or has influence over the survey?

Who is the population of interest? Demographic characteristics; Where do they live?


What is the best means of communicating with them (medium, time of day and time of the
week); What is the best way to reach them (i.e. direct interviews, mail, telephone)?

Are you interested in any sub-groups of this population? Determining the characteristics
of the population of interest gives the researcher some indication of how he can get a
representative sample, whether he needs to set quotas for subgroups, and how many
people he would need to survey.


Self Assessment 3.1

You have observed that several children of school-going age are not in school in a
particular farming community. You want to investigate the causes of this phenomenon
through a survey.

Which sub groups would be of interest to you as a researcher?

When will you undertake the survey?

Answer Tips

List everyone who has a stake in children's schooling in a locality.

Identify the reasons why timing is of the essence in data collection.

3.1.2.2 Resource Assessment


Following the clarification of the survey's purpose is an assessment of the resource requirements
for the survey. This stage helps the researcher to evaluate the adequacy of in-house resources, to
enable him to design a survey that is within the budget line. The in-house resource assessment also
enables the researcher to know which resources he needs to contract out. The Health
Communication Unit at the Centre for Health Promotion, University of Toronto recommends the
following questions for a comprehensive resource assessment:

Which in-house resources are available for use? Staff availability and skills; logistics
(materials and equipment, etc).

What external resources will you need? After assessing internal resources, if any gaps
are identified in the resources required, they can be filled by external resources.

Is the budget enough to enable the researcher acquire these resources?

Further Insight
The budget line for a UNDP survey (which was to address gaps in data on energy access for
rural and urban areas in Ghana) undertaken by The Energy Center, KNUST was US$ 29 000.
The original intent was to survey 56 communities from the 10 administrative regions of Ghana.

After assessing the survey expenses, the budget could cover only 15 communities from three
administrative regions. If the planners had glossed over this important stage of the research
process, the survey would have landed on an impermeable rock.

3.1.2.3 Decision on the Methods and Procedures


The third step in survey research is to decide on the most appropriate method for the research
work. Primarily, there are three methods for obtaining survey data:

Face to face interviews;

Mailed and e-mailed questionnaires; and

Telephone and computerized telephone interviews.

Table 3.5: Advantages and Disadvantages of the Interview Methods

| Method | Advantages | Disadvantages |
|---|---|---|
| Face-to-face interviews | Usually results in a higher response rate. | A social desirability bias may affect the accuracy of responses, especially when the survey is addressing sensitive issues. |
| | Reduces non-response to individual questionnaire items. | Recruitment and training of interviewers is time consuming and expensive. |
| | Interviewers can document characteristics of non-respondents and reasons for refusal. | Cost per interview is high. |
| | Preferable for surveys addressing complex issues where further explanations would be required to ensure clarity on the part of the respondent. | |
| Mailed and e-mailed questionnaires | Social desirability bias is minimized. | It may not be possible to determine the demographics and characteristics of non-respondents and/or reasons for refusal. |
| | Administrative costs and costs per respondent are significantly reduced. | Some responses may not be complete on returned questionnaires. |
| | | The time taken to receive completed questionnaires may be long. |
| Telephone and computerized telephone interviews | It is possible to achieve high response rates. | Sometimes difficult to reach a selected resident of a household. |
| | Interviewers are able to document characteristics of non-respondents and reasons for refusal. | Complex questions make it difficult for respondents to retain the questions and response categories. |
| | The amount of non-response to questionnaire items can be minimized. | |
| | Able to obtain results quickly. | |
| | Less costly than face-to-face interviews (but more expensive than mail surveys). | |

Self Assessment 3.2

Referring to the exercise in Self Assessment 3.1, and with your appreciation of the pros and cons of
the three interview methods in Table 3.5, which of the methods will you use in your home district?
Briefly explain the factors you considered in your choice of the most appropriate
interview method.

Answer Tips

The methods you will use should be informed by how effective each of the
methods will be in your home-district.

3.1.2.4 Design the Questionnaire


Questionnaires should be designed to address the research objectives. In designing the
questionnaires, it is crucial to note that the quality and usefulness of the information collected
will depend on how the questions are worded. Vague questions will result in the collection of
less useful responses or cause non responses. The researcher could be guided by the following
guidelines:


Language and Wording


Proper wording of the questions is essential. The questions should be simple and straightforward
to ensure that the respondent understands them correctly. Highly technical terms, slang,
abbreviations and words which may be considered insulting should be avoided. All the
questions should be available in the native language of the respondent.

Recall Bias
When formulating the questions, it is imperative to bear in mind that people tend to forget
events. The longer the recall period, the worse the accuracy tends to be. Therefore, recall of
events should be assisted by adding aids to the questionnaire and by ordering the questions. For
example, holidays and national festivals can be used as reference points, or the respondents can
use a calendar.

Order of the Questions


The order of the questions in the questionnaire is also important. A poorly organized
questionnaire may confuse the respondent, bias the responses, and affect the response rate as well
as the willingness to answer sensitive questions. The questionnaire should start with the easy
questions. When more difficult questions are placed at the end of the questionnaire and the
respondent stops answering, at least some data for the earlier questions will have been collected.
Asking the easy questions first may also help to establish a healthy rapport, so that the
respondent may be willing to answer the difficult (or probing) questions.

Length of the Questionnaire


The length of the questionnaire affects the response rate as well as the reliability of the data. With
long questionnaires, the respondents often get careless towards the end of the interview, which
will affect the reliability of the responses. A short questionnaire increases the response rate but
may lack important questions for the indicators. The ideal length for a self-administered
questionnaire is 15 minutes and for the face-to-face interview 30 minutes. Sometimes, longer
questionnaires cannot be avoided. Here, the interviews could be conducted in phases so that the
responses will not suffer.

Types of Questions
The following types of questions are used in questionnaires:

a) Structured Questions
These are questions that are followed by a list of possible alternative responses from which
respondents select the responses that best describe their situation. It is impossible to list all
possible alternative responses, so it is customary to provide additional space for a response
labelled others {specify}. This takes care of all other responses which do not fit in the list of
alternative responses provided. (Look at the examples provided as an attachment.)

Merits of Structured questions

They are easier to administer, because of the alternative answers provided, and to analyse, since
the responses are in an immediately usable form.

They are economical to use in terms of time and money.

Demerits of Structured questions

They provide limited responses; the responses provided may fall short of what respondents
would otherwise give.

Respondents are compelled to answer questions according to the researcher's choices.

They are more difficult to construct because one needs to think carefully through the categories
of response to provide.

b) Unstructured Questions
Unstructured or open-ended questions are those that are left open for respondents to provide
answers. Respondents have the freedom to provide answers that they think are appropriate
irrespective of what the researcher thinks. Individuals respond in their own ways and the length
of the response is determined by the space provided for it: where little space is provided, a
short response is given, and the reverse is also true. An instrument can
have both open and close-ended questions, depending on the objective behind each question.
Merits of open-ended questions

Open-ended questions tend to stimulate respondents to think about their feelings or motives
and to express what they consider most appropriate or important.

Responses express respondents' feelings about a particular issue.

Responses say a lot about the respondents in terms of their background, hidden motivation,
decisions and interests.

Open-ended questions are easier and simpler to formulate.

They allow for a greater depth of response.

Demerits of open-ended questions

This approach tends to allow people to provide irrelevant information, or information which
does not answer the questions or objectives.

It could be time consuming. What is the implication of this?

It could be difficult to categorise such responses and hence difficult to analyse them
quantitatively. (Where information cannot be categorized, it is better to include it in the
narrative to be sure it does not get lost. Some open-ended questions are good for
qualitative purposes.)

c) Contingency questions
Where certain questions are only applicable to certain groups, they are followed by other
questions that collect further information from the relevant sub-groups. These follow-up
questions, asked after the initial question, are called contingency questions or filter questions.
They are used to probe for more information.

An example of contingency questions is as follows:

3. Have you eaten today?
Responses:
Yes (if yes, please move on to question no. 4)
No (if no, please move to question no. 5)
4. If Yes, explain.

d) Matrix Questions
These are questions where the same set of responses is used to answer all the questions. Likert scales are
usually used for such responses, such as extremely satisfied, satisfied, dissatisfied and extremely
dissatisfied. An example of this is shown below:
Responses: 1 Extremely satisfied
2 satisfied
3 neutral
4 dissatisfied
5 extremely dissatisfied

How satisfied are you with your research methodology lecturer who is not into
management or business administration?

How satisfied are you with the research methods lectures so far, as far as your
research work is concerned?

How satisfied are you with the length of time allocated for each lecture? Etc.

Merits of Matrix Questions

Space is used efficiently

Questions presented in a matrix form are easier to complete.


They rarely put off respondents

Easy to compare responses given to different items.

It facilitates easy determination of a trend in the response.

Demerits of Matrix Questions

It is often overused because it is so easy to construct and to answer.

It can encourage a fixed pattern of responses (e.g. ticking the same column throughout) from
respondents who have made up their minds not to give considered responses.

Table 3.6: Advantages and Disadvantages of Open-ended and Closed Questions

Type of question: Open-ended
Advantages: Elicit rich qualitative data; encourage thought and freedom of expression.
Disadvantages: May discourage responses from less literate respondents; take longer to answer
and may put some people off; are more difficult to analyze; responses can be misinterpreted.

Type of question: Closed
Advantages: Elicit quantitative data; are easy for all literacy levels to respond to; are quick
to answer and may improve your response rate; are easy to code and analyze.
Disadvantages: Can encourage mindless replies; can suggest ideas that the respondent would not
otherwise have; respondents with no opinion or no knowledge can answer anyway; respondents can
be frustrated because their desired answer is not a choice; it is confusing if many response
choices are offered; misinterpretation of a question can go unnoticed; marking the wrong
response is possible; they force respondents to give simplistic responses to complex issues.

Self Assessment 3.3

The purpose of this exercise is to enable students to understand the rules for writing questions in
surveys. Offer reasons why the following rules must be observed in drafting survey questions:

Avoid leading questions. E.g. "Wouldn't you say that...", "Isn't it fair to say..."

Be specific. Avoid words like "regularly", "often", or "locally".

Avoid jargon and colloquialisms.

Avoid double-barreled questions. E.g. "Would you like to use charcoal and LPG?"

Avoid double negatives. E.g. "Smoking in public places should not be abolished."

Explain the rationale for asking very personal and probing questions.

Ensure options are mutually exclusive. E.g. "How many years have you worked in
academia?": 0-5, 6-10, 11-15, over 15. Not 0-5, 5-10, 10-15.

Answer Tips

Consider the kind of responses you will elicit from your respondents if the above
errors are not avoided.
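
The rule on mutually exclusive options can also be checked mechanically when the options are numeric ranges. The short Python sketch below is only an illustration: it assumes that each option is written as a (low, high) pair with both ends included, and the function name `overlapping_ranges` is made up for this example.

```python
def overlapping_ranges(ranges):
    """Return pairs of neighbouring (low, high) ranges that share a value.

    Both bounds are treated as inclusive, so (0, 5) and (5, 10) clash at 5.
    Only ranges that are next to each other after sorting are compared.
    """
    ordered = sorted(ranges)
    clashes = []
    for (low_a, high_a), (low_b, high_b) in zip(ordered, ordered[1:]):
        if low_b <= high_a:
            clashes.append(((low_a, high_a), (low_b, high_b)))
    return clashes


good = [(0, 5), (6, 10), (11, 15)]
bad = [(0, 5), (5, 10), (10, 15)]
print(overlapping_ranges(good))  # []
print(overlapping_ranges(bad))   # two clashing pairs
```

A respondent with exactly 5 years of experience could tick two boxes in the second set of options, which is what the check reports.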

3.1.2.5 Pilot Test the Questionnaires


A pilot test is an evaluation of the specific questions, format, question sequence and instructions
prior to use in the main survey. Pilot testing is a crucial step in avoiding costly errors. The pilot
testing of survey instruments helps to:

Ascertain whether each question addresses the research questions;

Know if the questions are interpreted in a similar vein by interviewers/enumerators and
respondents;

Identify if options provided for closed-ended questions are exhaustive, that is, whether they
address the views of all respondents;

Assess clarity and understandability of the questions;


Assess whether the instrument takes too long to administer;

Ascertain if the questions are obtaining responses for all the different response categories
or if the responses are all the same;

Have a fair knowledge of the kind of reactions to expect from respondents in order to
prepare to meet them in the main survey.

3.1.2.6 Preparation of the Sample


Sampling is used to cut cost and effort while still obtaining information from a representative
sample of the target population. What is essential here is that the researcher should ensure that
the number of individuals participating in the survey is representative of the target group. The
main questions in selecting your sampling design are:

How many will be included (the sample size)?

How will the survey respondents be selected?

Determining the Sample Size


Below are some relevant questions to consider in deciding on the sample size:

What is the size of your target population?

What can the budget allow?

How confident do you need to be with the results?

Do you need to look at any subgroups?

Deciding on the sample size is primarily driven by the budget and the size of the subgroups
the researcher wishes to analyze. The researcher has to ensure that enough people are sampled
to obtain an adequate number of respondents in each subgroup, so that accurate conclusions
can be drawn about that group. If the target population is very small (say, fewer than 100), the
researcher should consider doing a census (i.e. complete enumeration). However, if the target
population is very large (e.g. in the millions), interviewing more and more people beyond a
certain point adds little to the accuracy of the results, while covering everyone would be very expensive.
Sample sizes are often determined statistically at a chosen confidence level. The Miller and Brewer
(2003) model can be a useful tool for determining the sample size.

Formula:

$$n = \frac{N}{1 + N(\alpha)^2}$$

where n is the sample size, N is the sample frame (total number of objects in the
target population) and α is the margin of error (e.g. 0.05 at the 95% confidence level).
Working Example 1:
It has been observed that the performance of students in examinations has been declining over
the years. In a school with a population of 10 000 students, a researcher wants to know the
causes of the poor performance. How many students will you sample if your budget and time
will not permit a census?

Solution:
The question the researcher needs to answer first is the error he is ready to accept. If settled, then
he can go ahead and determine the sample size. Assuming the researcher wants the error margin
to be 5% the sample size can be determined as follows:

Student population, N = 10 000
Error margin, α = 0.05

$$n = \frac{N}{1 + N(\alpha)^2} = \frac{10\,000}{1 + 10\,000 \times (0.05)^2} = \frac{10\,000}{26} \approx 384.6$$

Thus, approximately 385 students would be required for the survey at the 95% confidence level.
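
The same calculation, n = N / (1 + N × α²), can be repeated for other populations and error margins with a few lines of Python. This is a minimal sketch; the function name `sample_size` is chosen here for illustration and is not part of any statistical package.

```python
import math


def sample_size(population, margin_of_error):
    """Sample size n = N / (1 + N * alpha**2)."""
    return population / (1 + population * margin_of_error ** 2)


n = sample_size(10_000, 0.05)
print(round(n, 1))   # 384.6
print(math.ceil(n))  # 385 (round up to a whole number of students)
```

Rounding up rather than down keeps the sample at or above the size the formula asks for.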

Characteristics of a Good Sample Design

The sample design must result in a truly representative sample. A representative sample is a
segment of the population being studied, chosen because it reflects as closely as possible
the population from which it is drawn.

It must result in a small sampling error.

It must be viable in the context of the funds available for the research study.

It must be such that systematic bias can be controlled in a better way.


It should be such that the result of the sample study can be applied in general, for the
universe, with a reasonable level of confidence

Sampling (i.e. how will the survey respondents be selected?)


After the sample size has been determined, the next question to address is how to select the units
for the sample. Should it be random or follow other approaches? Even if the selection is by a
random approach, how is it operationalised? There are several approaches available for
selecting respondents. These approaches fall into two categories, namely probability and
non-probability sampling techniques.

Probability sampling requires that each member of the defined target population has a known,
and non-zero, chance of being included in the sample. It is not possible to determine whether a
non-probability sample is likely to provide very accurate or very inaccurate estimates of
population parameters. Consequently, these types of samples are not appropriate for dealing
objectively with issues concerning either the estimation of population parameters or the testing
of hypotheses.
The use of non-probability samples is often justified on the grounds that estimates derived from the
sample may be linked to some hypothetical universe of elements rather than to a real population.
In some circumstances, a probability sample design can be turned accidentally into a
non-probability sample design if subjective judgement is exercised at any stage during the execution
of the sample design.

Types of Probability Samples


There are many ways in which a probability sample may be drawn from a population. The
method that is most commonly described is simple random sampling. The others are
systematic, stratified, cluster and multi-stage sampling.
a. Random sampling

The first statistical sampling method is simple random sampling. In this method, each item in the
population has the same probability of being selected as part of the sample as any other item.
Random sampling can be done with or without replacement. If it is done without replacement, an
item is not returned to the population after it is selected and thus can only occur once in the
sample.

Having determined a sample size of 385 in example 1, the simple random sampling technique
will be operationalised by assigning numbers to all the 10 000 units and drawing them out from a
basket 385 times. This approach is simple random sampling without replacement.
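
In practice the "basket" is usually a computer. The sketch below uses Python's standard `random` module to draw 385 student numbers out of 10 000 without replacement; the seed value is arbitrary and is only fixed so that the same draw can be reproduced.

```python
import random

population = list(range(1, 10_001))   # student numbers 1..10 000
rng = random.Random(2012)             # arbitrary seed, for repeatability only
sample = rng.sample(population, 385)  # draws without replacement

print(len(sample), len(set(sample)))  # 385 385 (no student is drawn twice)
```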

Advantages

The selection procedure ensures that every sampling unit of the population has an equal
and known (non-zero) probability of being included in the sample.

Highly representative if all subjects participate; the ideal

Disadvantages

Not possible without a complete list of population members; potentially uneconomical to
achieve; can be disruptive to isolate members from a group; time-scale may be too long,
data/sample could change.

b. Systematic Sampling with a Random Start


This consists of selecting every Kth sampling unit of the population (K is called the sampling
interval) after the first sampling unit is selected at random from among the first K units. That is, an
element of randomness is introduced into this kind of sampling by using a random number
between 1 and K to pick the unit with which to start. The sampling interval (K) is
determined by dividing the size of the sampling frame (N) by the sample size (n).
E.g. With a sample size of 385 students, the sampling interval will be 10 000 / 385 ≈ 26.

Assuming the random number selected is 7, the next number to be selected will be 33 (i.e. 7 +
26), and the one after that will be 59 (i.e. 33 + 26). This process is continued till the sample size of 385
is reached.
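
The selection 7, 33, 59, ... (adding the interval K = 26 each time) can be generated as follows. This is a sketch in Python under the assumption that the units in the frame are simply numbered 1 to N; the function name `systematic_sample` is illustrative.

```python
import random


def systematic_sample(frame_size, sample_size, start=None, rng=random):
    """Select every K-th unit number from 1..frame_size after a random start."""
    interval = round(frame_size / sample_size)   # K = N / n, rounded
    if start is None:
        start = rng.randint(1, interval)         # random start between 1 and K
    return list(range(start, frame_size + 1, interval))


units = systematic_sample(10_000, 385, start=7)
print(units[:3])   # [7, 33, 59]
print(len(units))  # 385
```

Because K is rounded, a different random start can yield one unit fewer (for example, a start of 25 gives 384 units), so the achieved sample size should be checked after selection.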
Advantages

Systematic sampling is more convenient than simple random sampling, especially when
interviewers are untrained in sampling techniques: they can be instructed to select every
Kth person.

It is more convenient for the use with very large population or when large samples are to
be selected.

It is an easier and less costly method of sampling.

Each sampling unit in the population has a 1/K probability of being included in the
sample.

Disadvantages

The whole sample depends solely on the random starting position, so the method can prove
inefficient if the process that produced the list is defective or the list is ordered in some
systematic way. In practice, this method can be used when lists of the population are
available and are of considerable length.

The system may interact with some hidden pattern in the population, e.g. every third
house along the street might always be the middle one of a terrace of three.

c. Stratified Sampling
The stratified sampling method is used when each subgroup within the
population needs to be represented in the sample. The first step in stratified sampling is to divide
the population into subgroups (strata) based on mutually exclusive criteria. Random or
systematic samples are then taken from each subgroup. The sampling fraction for each subgroup
may be taken in the same proportion as the subgroup has in the population. For example, a
person conducting a customer satisfaction survey may select random customers from each customer
type in proportion to the number of customers of that type in the population. Stratified sampling
can also sample an equal number of items from each subgroup.

Steps involved in stratified sampling:

Define the population;

Determine the desired sample size;

Identify the variable and subgroups (strata) for which you want to guarantee appropriate
representation (either proportional or equal);

Classify all members of the population as members of one of the identified subgroups; and

Randomly select (using a table of random numbers) an appropriate number of individuals
from each subgroup.
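
The steps above can be sketched in Python for the proportional case. The example is hypothetical: the school of 10 000 students is assumed to be split into 6 000 female and 4 000 male students, and the function name `stratified_sample` is made up for illustration.

```python
import random


def stratified_sample(strata, sample_size, rng=random):
    """Draw a proportional random sample from each stratum.

    strata maps a stratum name to the list of its members.
    """
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        quota = round(sample_size * len(members) / total)  # proportional share
        sample[name] = rng.sample(members, quota)
    return sample


# Hypothetical school of 10 000 students split by sex.
strata = {
    "female": [f"F{i}" for i in range(6_000)],
    "male": [f"M{i}" for i in range(4_000)],
}
picked = stratified_sample(strata, 385)
print({name: len(units) for name, units in picked.items()})
# {'female': 231, 'male': 154}
```

With less convenient proportions the rounded quotas may not add up exactly to the desired sample size, in which case one or two quotas have to be adjusted by hand.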

Advantages

Can ensure that specific groups are represented, even proportionally, in the sample(s)
(e.g., by gender), by selecting individuals from strata list

Disadvantages

More complex, requires greater effort than simple random; strata must be carefully
defined

d. Cluster Sampling
In cluster sampling, the population that is being sampled is divided into groups called clusters.
Instead of these subgroups being homogeneous based on a selected criterion as in stratified
sampling, a cluster is as heterogeneous as possible so as to match the population. A random sample
is then taken from within one or more selected clusters. Cluster sampling can tell us a lot about
that particular cluster, but unless the clusters are selected randomly and a lot of clusters are
sampled, generalizations cannot always be made about the entire population.

Steps:

Define the population

Determine the desired sample size

Identify and define a logical cluster

Obtain, or make a list of all clusters in the population


Estimate the average number of population members per cluster

Determine the number of clusters needed by dividing the sample size by the estimated
size of the cluster

Randomly select the needed number of clusters (using a table of random numbers)

Include in the sample all population members in the selected clusters
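
The steps above can be illustrated with a short Python sketch. It assumes, purely for illustration, that the 10 000 students are organised into 250 classes of 40 students and that each class is treated as a cluster; the function name `cluster_sample` is hypothetical.

```python
import math
import random


def cluster_sample(clusters, sample_size, rng=random):
    """Pick enough whole clusters at random to cover the target sample size.

    clusters maps a cluster name to the list of its members.
    """
    average_size = sum(len(m) for m in clusters.values()) / len(clusters)
    needed = math.ceil(sample_size / average_size)   # number of clusters
    chosen = rng.sample(sorted(clusters), needed)
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members


# Hypothetical frame: 250 classes of 40 students each (10 000 students).
classes = {f"class-{c}": [f"{c}-{s}" for s in range(40)] for c in range(250)}
chosen, members = cluster_sample(classes, 385)
print(len(chosen), len(members))  # 10 400
```

Since whole clusters are taken, the achieved sample (400) is slightly larger than the target of 385.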

Advantages:

Generating sampling frame for clusters is economical, and sampling frame is often
readily available at cluster level

Most economical form of sampling

Larger sample for a similar fixed cost

Less time for listing and implementation

Also suitable for survey of institutions

Disadvantages:

May not reflect the diversity of the community.

Other elements in the same cluster may share similar characteristics.

Provides less information per observation than a simple random sample (SRS) of the same size (redundant
information: similar information from the others in the cluster).

Standard errors of the estimates are high, compared to other sampling designs with same
sample size

e. Multi-stage sampling
In many situations, there are natural divisions of the population into several different sizes of
units. For example, a forest management unit consists of several stands, each stand has several
cut blocks, and each cut block can be divided into plots. These divisions can be easily
accommodated in a survey through the use of multi-stage methods. Selection of units is done in
stages. For example, several stands could be selected from a management area; then several cut
blocks are selected in each of the chosen stands; then several plots are selected in each of the
chosen cut blocks. Note that in a multi-stage design, units at any stage are selected at random
only from those larger units selected in previous stages.

Example:
You have been asked to undertake a survey in a farming district in your home country. How
will you select respondents for interview?

Steps:

Note that not all the communities in the district may be farming communities. You need to
identify the farming communities. (First stage)

The units to be sampled are the particular farming activities, e.g. food or cash
crop production. (Second stage)

The units to be sampled from the farming activities (food or cash crop) are the
farming households who undertake the particular farming activity considered.
(Third stage)
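
A three-stage selection of this kind can be sketched in Python as shown below. Everything in the example is hypothetical: the district is assumed to have 12 farming communities, each with 30 food crop and 30 cash crop households, and the numbers drawn at each stage (4 communities, 1 activity, 5 households) are arbitrary.

```python
import random

rng = random.Random(560)  # arbitrary seed so the draw can be repeated

# Hypothetical district: community -> farming activity -> households.
district = {
    f"community-{c}": {
        activity: [f"c{c}-{activity}-hh{h}" for h in range(30)]
        for activity in ("food crop", "cash crop")
    }
    for c in range(12)
}

respondents = []
for community in rng.sample(sorted(district), 4):      # first stage
    activities = district[community]
    activity = rng.choice(sorted(activities))          # second stage
    households = activities[activity]
    respondents.extend(rng.sample(households, 5))      # third stage

print(len(respondents))  # 20
```

Note that households are drawn only from the communities and activities selected at the earlier stages.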

Types of Non-probability Samples


There are several types of non-probability samples: convenience, purposive/judgment,
quota, snowball, etc. These approaches to sampling result in the elements in the target
population having an unknown chance of being selected into the sample. It is always wise to
treat research results arising from these types of sample design as suggesting statistical
characteristics about the population rather than as providing population estimates with
specifiable confidence.

a. Convenience sampling
A sample of convenience is the term used to describe a sample in which elements have
been selected from the target population on the basis of their accessibility or convenience to the
researcher. Convenience samples are sometimes referred to as "accidental samples" for the
reason that elements may be drawn into the sample simply because they just happen to be
situated, spatially or administratively, near to where the researcher is conducting the data
collection.

Advantages

Convenience sampling is very easy to carry out with few rules governing how the sample
should be collected.

The relative cost and time required to carry out a convenience sample are small in
comparison to probability sampling techniques. This enables the researcher to achieve the desired
sample size in a relatively fast and inexpensive way.

Disadvantages

Convenience sampling can lead to the under-representation or over-representation of
particular groups within the sample.

The ability to make generalizations is undermined if the interest group is under-represented in the sample.

b. Quota sampling
Quota sampling is sometimes misleadingly referred to as "representative sampling" because numbers of
elements are drawn from various target population strata in proportion to the size of these strata.
The population is stratified by important variables and the required quota is obtained from each
stratum.

Advantages

Quick and cheap to organize

Disadvantages

Not as representative of the population as a whole as other sampling methods.

Because the sample is non-random, it is impossible to assess the possible sampling error.

c. Purposive Sampling
This is often referred to as judgment sampling. With this technique the researcher selects sampling
units subjectively in an attempt to obtain a sample that appears to be representative of the
population. That is, the chance that a particular sampling unit will be selected depends on the
subjective judgment of the researcher. The researcher's selection may yield results favourable
to his/her point of view, so that the entire study is vitiated by bias. However,
when applied carefully, the technique can give tolerably reliable results.

Advantages

Ensures balance of group sizes when multiple groups are to be selected

Disadvantages

Samples are not easily defensible as being representative of populations due to potential
subjectivity of researcher

d. Snowball sampling
Researchers use this sampling method if the population of interest is very rare or is limited to a
very small subgroup of the population. This type of sampling technique works like a chain referral.
After observing the initial subject, the researcher asks for assistance from the subject to help
identify people with a similar trait of interest. The process of snowball sampling is much like
asking your subjects to nominate another person with the same trait as your next subject. The
researcher then observes the nominated subjects and continues in the same way until a
sufficient number of subjects is obtained.

For example, if obtaining subjects for a study that wants to observe a rare disease, the researcher
may opt to use snowball sampling since it will be difficult to obtain subjects. It is also possible
that the patients with the same disease have a support group; being able to observe one of the
members as your initial subject will then lead you to more subjects for the study.

Advantages

The chain referral process allows the researcher to reach populations that are difficult to
sample when using other sampling methods.

The process is cheap, simple and cost-efficient.

This sampling technique needs little planning and a smaller workforce compared to
other sampling techniques.

Disadvantages

The researcher has little control over the sampling method. The subjects that the
researcher can obtain rely mainly on the previous subjects that were observed.

Representativeness of the sample is not guaranteed. The researcher has no idea of the true
distribution of the population and of the sample.

Sampling bias is also a concern when using this sampling technique. Initial
subjects tend to nominate people that they know well. Because of this, it is highly
likely that the subjects share the same traits and characteristics, so
the sample that the researcher obtains may represent only a small subgroup of the entire
population.

The advantages and disadvantages of the various sampling techniques are summarised in Table
3.7.

Table 3.7: Sampling techniques: Advantages and disadvantages

Technique: Simple random
Brief description: Random sample from whole population.
Advantages: Highly representative if all subjects participate; the ideal.
Disadvantages: Not possible without complete list of population members; potentially
uneconomical to achieve; can be disruptive to isolate members from a group; time-scale may be
too long, data/sample could change.

Technique: Stratified random
Brief description: Random sample from identifiable groups (strata), subgroups, etc.
Advantages: Can ensure that specific groups are represented, even proportionally, in the
sample(s) (e.g., by gender), by selecting individuals from strata list.
Disadvantages: More complex, requires greater effort than simple random; strata must be
carefully defined.

Technique: Cluster
Brief description: Random samples of successive clusters of subjects (e.g., by institution)
until small groups are chosen as units.
Advantages: Possible to select randomly when no single list of population members exists, but
local lists do; data collected on groups may avoid introduction of confounding by isolating
members.
Disadvantages: Clusters in a level must be equivalent and some natural ones are not for
essential characteristics (e.g., geographic: numbers equal, but unemployment rates differ).

Technique: Purposive
Brief description: Hand-pick subjects on the basis of specific characteristics.
Advantages: Ensures balance of group sizes when multiple groups are to be selected.
Disadvantages: Samples are not easily defensible as being representative of populations due to
potential subjectivity of researcher.

Technique: Quota
Brief description: Select individuals as they come to fill a quota by characteristics
proportional to populations.
Advantages: Ensures selection of adequate numbers of subjects with appropriate characteristics.
Disadvantages: Not possible to prove that the sample is representative of designated population.

Technique: Snowball
Brief description: Subjects with desired traits or characteristics give names of further
appropriate subjects.
Advantages: Possible to include members of groups where no lists or identifiable clusters even
exist (e.g., drug abusers, criminals).
Disadvantages: No way of knowing whether the sample is representative of the population.

Self Assessment 3.4

As the Senior Research Officer of your organisation, you have been tasked to conduct a
household survey in a large town whose households fall into three income categories,
namely high income, middle income and low income earners. Determined to ensure that
the sample size takes care of the diversity in the target population, what sampling
technique will you use to select units and why?

3.1.2.7 Train Interviewers for Telephone and Intercept Surveys


Training interviewers involves providing them with the skills needed to undertake successful
interviewing. Having trained interviewers is imperative as the interviewer is the interface
between your organization and the respondents. Interviewers have a tremendous amount of
influence on the quality of the research. A good interviewer can make all the difference in the
world to the usefulness of the data collected.

What makes a Great Interviewer?


A great interviewer follows a few simple guidelines which ensure detailed, accurate, and
unbiased data.

Read the Questions as Written

Do Not Suggest Responses

Clarify Responses

Probe for Responses

Record Information Neatly and Thoroughly

Maintain Strict Confidentiality

Be Polite and Professional

3.1.2.8 Data Collection


This step describes how the information is collected for the different survey methods. This is an
important step that must be done right in order to ensure the integrity of the information
collected. The following procedures are to be observed in data collection:

Face-to-face Interviews

Select location(s) to conduct interviews: The most appropriate location for a
face-to-face interview is a place that members of your population frequent and where they are
comfortable participating.

If you are randomly selecting respondents for a face to face intercept interview it is
important to utilize more than one location in order to ensure a better representation of
the population.

Train interviewers in how to administer a structured questionnaire face to face and how to
intercept respondents if they are doing intercept interviews. It is quite difficult to ensure that
the interviewers randomly select people to participate in intercept interviews. Interviewer
and respondent biases may influence the people who are selected to participate and those
who agree to. Interviewers should follow a standardized and systematic approach to
selecting people who pass by to be interviewed.

If you require a particular group for your survey you may have to develop a questionnaire
screener which would be used to find eligible respondents. A questionnaire screener is a
series of one or two questions (usually demographics like age or family status) which
help you to identify people who are in your target population before doing a full
interview.

Using Telephone Surveys

It is important to supervise interviewers when they are calling respondents to monitor
whether they are following the interviewing protocol.

It is important to verify a sample of completed interviews by calling a sample of
respondents who completed interviews to ensure they did complete the interview.

Do not distribute your sample to interviewers all at once; give each interviewer chunks of
sample as needed.

If you require a particular group for your survey you may have to develop a questionnaire
screener which would be used to find eligible respondents. A questionnaire screener is a
series of one or two questions (usually demographics like age or family status) which
help you to identify people who are in your target population before doing a full
interview. If a person is not eligible the interview is ended after the screening questions.

Using Mail Surveys

Send out the first mailing (usually results in a 40% response)

Send a reminder card 10 days after the 1st mailing to thank those participants who have
already responded and to remind those who have not of the importance of the study. The
card should also indicate where people can obtain another copy of the questionnaire if
they have mislaid their original copy.

Three to four weeks later, send a second mailing emphasizing the importance of receiving
responses. Also include a new questionnaire and return envelope.

The covering letter is one of the most important aspects of a mailed questionnaire. It will
determine whether the recipient reads the survey and the attitude with which respondents
complete the questionnaire.

The letter should explain why the study is important and why their responses are needed.

3.1.2.9 Processing the Data


Processing the data involves preparing and translating the data for analysis. It involves taking the
completed questionnaires and putting them into a format that can be summarized and interpreted.
There are many errors that can be made during this step and it is essential that the quality of the
data is preserved.
Coding
The following are the steps involved in coding respondents' answers to your questionnaire:

Familiarize yourself with the questionnaire and topic area.

Divide open-ended questions into groups that can share a code list (not always possible).

For each question (or group) read through at least 15% of the questionnaires writing
down all the unique responses (this is a rough code list).

When no new responses are found, rewrite codes and assign a number to each code
(master code list).

Write the corresponding code number(s) beside each open-ended question on each
questionnaire.

Repeat this for each open-ended question.
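
Once the unique responses have been written down, assigning code numbers is a mechanical task. The Python sketch below illustrates the idea with made-up answers to a hypothetical question ("Why do you use charcoal?"); the function names are illustrative, and in real coding the researcher must still decide by judgment which differently worded answers belong under the same code.

```python
def build_code_list(responses):
    """Assign a number to each unique response, in order of first appearance."""
    codes = {}
    for answer in responses:
        key = answer.strip().lower()      # treat "Cost" and "cost " alike
        if key not in codes:
            codes[key] = len(codes) + 1
    return codes


def code_responses(responses, codes):
    """Replace each response with its code number (0 if not on the list)."""
    return [codes.get(answer.strip().lower(), 0) for answer in responses]


# Hypothetical answers to "Why do you use charcoal?"
answers = ["Cost", "availability", "cost ", "Taste of food", "Availability"]
master = build_code_list(answers)
print(master)  # {'cost': 1, 'availability': 2, 'taste of food': 3}
print(code_responses(answers, master))  # [1, 2, 1, 3, 2]
```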


Data Entry
There are two common approaches to data entry:

Direct data entry. Interviewers complete the questionnaires, which are then coded and the data
entered into a computer for analysis.

Computer assisted telephone interviewing (CATI). Interviewers enter responses directly
into a computer, and the questions requiring coding are entered at a later time.

Methods to Avoid Data Entry Errors

Data entry errors are minimized when the data is verified. Verification of 10% of the data
entered results in increased confidence in the accuracy of the data.

An additional means to reduce the incidence of data entry errors is to program your data
entry program to check each field for out-of-range data. When errors or inconsistencies
are identified, the ID number of the record is used to locate the questionnaire. The source
of the error is identified and the corrected data is entered.
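
An out-of-range check of this kind can be sketched in a few lines of Python. The field names, valid ranges and records below are hypothetical and serve only to show the idea; the function name `out_of_range` is likewise made up.

```python
def out_of_range(records, valid_ranges):
    """Return (record ID, field, value) for every value outside its range."""
    problems = []
    for record in records:
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if value is None or not low <= value <= high:
                problems.append((record["id"], field, value))
    return problems


# Hypothetical fields: age in years and a 1-5 satisfaction score.
valid_ranges = {"age": (15, 99), "satisfaction": (1, 5)}
records = [
    {"id": 101, "age": 34, "satisfaction": 4},
    {"id": 102, "age": 340, "satisfaction": 2},   # keying error in age
    {"id": 103, "age": 27, "satisfaction": 6},    # score off the scale
]
print(out_of_range(records, valid_ranges))
# [(102, 'age', 340), (103, 'satisfaction', 6)]
```

The record IDs in the output are what the researcher would use to pull out questionnaires 102 and 103 and correct the entries.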

3.1.2.10 Analysis of Results


Once the data has been entered into your statistical package, the analyses required to answer your
research questions can be performed. Analyzing the survey results is done in order to answer the
original questions that were posed for the evaluation and allows you to draw conclusions.
It is one of the most crucial steps in ensuring useful findings
which accurately reflect the opinions and views of the participants involved.

Both quantitative and qualitative methods are employed for data analysis. The qualitative inquiries
capture areas where in-depth information is required for better understanding of issues. The qualitative
data will also serve as a means of triangulating data gathered through the quantitative approach and
providing in-depth explanation to some of the quantitative data. The quantitative analysis is good for
generalization and numbers. The analysis can be done using SPSS, STATA or any other statistical
software which will be discussed in Unit 4.

For most surveys simple descriptive statistics (frequencies, means, ranges, etc) may be all that is
needed to be able to interpret the results. This involves determining how many of the respondents
answered a particular way for each of the questions. More complex analyses may be required
when comparisons are needed between subgroups of the population or for measurements taken at
different times.
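
As an illustration of such descriptive statistics, the Python sketch below computes frequencies, the mean and the range for ten made-up answers on a 1-5 satisfaction scale. It uses only the standard library; a statistical package would report the same figures.

```python
from collections import Counter
from statistics import mean

# Hypothetical answers on a 1-5 satisfaction scale.
scores = [1, 2, 2, 3, 2, 4, 5, 2, 1, 3]

frequencies = Counter(scores)
print(sorted(frequencies.items()))  # [(1, 2), (2, 4), (3, 2), (4, 1), (5, 1)]
print(mean(scores))                 # 2.5
print(max(scores) - min(scores))    # 4
```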
Statistical analysis aims to show that your results are not just due to chance or the luck of the
draw. It provides a way to determine the repeatability of any differences observed. If the same
outcome is found when a study is repeated over and over again, we really don't need a statistical
analysis. Similarly, when we study a sample of the population, statistical analysis is used to help
us decide whether it is likely that these same differences would be found if we repeated the
experiment in multiple samples or in the entire population. Hypotheses can be tested with
common statistical tools such as the t-test (to compare results for continuous data) and the z-test or
chi-square test (to compare results for categorical data).
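
To show what a chi-square test involves, the sketch below computes the Pearson chi-square statistic by hand in Python for a small table of counts. The counts are invented for illustration (satisfied/dissatisfied by sex), and the function name `chi_square` is hypothetical; a statistical package would normally do this calculation and also report the p-value.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = male/female, columns = satisfied/dissatisfied.
observed = [[30, 20],
            [20, 30]]
statistic, dof = chi_square(observed)
print(round(statistic, 2), dof)  # 4.0 1
```

With 1 degree of freedom the critical value at the 5% level is 3.841, so a statistic of 4.0 would suggest that the difference between the two groups is unlikely to be due to chance alone.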

3.1.2.11. Interpret and Disseminate Results


The results of a survey should be fed back to its stakeholders through written reports and/or
presentations. It is important to feed back the results of the survey to management, staff,
interested participants and other stakeholders in order to keep them informed and establish
buy-in for implementing any changes resulting from the survey.
Interpreting survey results

Survey results need to be interpreted within the context of the purpose of the project.

Keep the audience in mind when preparing the report. What do they need and want to know?

Consider the limitations of the survey (e.g. possible biases, and the validity, reliability
and generalizability of results).

Presenting Results

It is easy to become overwhelmed with too much information so focus on the research
questions and only present the information which answers those questions.

Choose a format which will highlight the key result.

Keep it simple

Pictures are worth a thousand words

3.1.2.12 Take Action


Taking action refers to implementing the changes suggested by the results of your survey. It is
important to take action and implement changes in order to make improvements to the subjects
under study.

How to Decide which Actions to Take

Involve the stakeholders in interpreting and taking action on the results.

Revisit the original goals of data collection. The data should provide answers to the
original questions.

Write a list of recommended actions which address the outcomes of the survey.

Prioritize those changes which are most important and feasible to implement.

Set up an action plan to implement the recommended changes.

Implement the changes.

SESSION 3.2 CASE STUDY RESEARCH


3.2.1 Introduction to Case Study
A case study is an intensive study of a single unit for the purpose of understanding a larger class of
(similar) units. A case study is an in-depth investigation of an individual, group, institution or
phenomenon. Case studies are often based on the premise that studying one case is enough to draw a
conclusion about other cases, since one case can typify other similar cases. The case being studied is
taken as an example of other similar things/situations.

As a means of overcoming the shortcomings of quantitative research, case study research is
often undertaken to carry out a holistic and in-depth investigation of social and behavioral problems
such as unemployment, poverty, drug addiction, governance, management and illiteracy.
Through case study methods, a researcher is able to go beyond the quantitative statistical results
and understand the behavioral conditions from the actors' perspective. Whilst in quantitative
research certain peripheral but relevant information might be omitted or obscured, case study
research explains both the process and outcome of a phenomenon through complete observation,
reconstruction and analysis of the issue under study, and thereby aims to cover all relevant information.

A case study in true essence is the exploration, investigation or analysis of a contemporary
practical life phenomenon of a specific contextual scope, a small geographical area or a limited
population through detailed background examination of an event or condition.

3.2.2 Purpose of Case Studies


According to Singleton et al. (1993), the primary purpose of a case study is to determine the factors,
and the relationships among the factors, that have resulted in the behaviour under study. The investigation,
therefore, involves a detailed examination of a single subject, group or phenomenon.

3.2.3 Advantages of Case Study


There are a number of advantages in using case studies.

The examination of the data is most often conducted within the context of its use (Yin,
1984), that is, within the situation in which the activity takes place.

Variations in terms of intrinsic, instrumental and collective approaches to case studies
allow for both quantitative and qualitative analyses of the data.

The detailed qualitative accounts often produced in case studies not only help to explore
or describe the data in real-life environment, but also help to explain the complexities of
real-life situations which may not be captured through experimental or survey research.

3.2.4 Disadvantages of Case Study



Despite these advantages, case studies have received criticisms. There are primarily three types
of arguments against case study research.

Case studies are often accused of lack of rigor. Too many times, the case study
investigator has been sloppy, and has allowed equivocal evidence or biased views to
influence the direction of the findings and conclusions.

Case studies provide very little basis for scientific generalization since they use a small
number of subjects, and some are conducted with only one subject. The question commonly
raised is, "How can you generalize from a single case?"

Case studies are often labeled as being too long, difficult to conduct and producing a
massive amount of documentation. In particular, case studies of ethnographic or
longitudinal nature can elicit a great deal of data over a period of time.

A common criticism of case study method is its dependency on a single case exploration
making it difficult to reach a generalizing conclusion.

3.2.5 Designing a Case Study


Case studies have been generally criticized for their lack of strength as a research tool. This makes
their design very important. Depending on the issue at hand, a single-case or a multiple-case design
can be adopted. Where there is no room for replication of a particular study, or the event is rare,
uncommon and limited to a single occurrence, a single-case design can be adopted.

Although a single-case design is generally limited by its inability to support generalizable conclusions,
this drawback can be overcome by triangulating the study with other methods to confirm the
validity of the process. A multiple-case design, on the other hand, can be adopted with real-life
events that show numerous sources of evidence, using replication rather than sampling logic to
enhance and support previous results. This helps raise the level of confidence in the strength of
the method adopted. For instance, whilst a study on the psychological impacts of the 1983
drought on children may be difficult to replicate and is hence appropriate for a single-case study,
the assessment of the sensing ability of deaf children is replicable and hence more appropriate
for a multiple-case study. A case study
method must be able to prove, through interviews or journal entries, that:

It is the only viable method to elicit implicit and explicit data from the subjects

It is appropriate to the research question

It follows the set of procedures with proper application

The scientific conventions used in social sciences are strictly followed

A chain of evidence, either quantitative or qualitative, is systematically recorded
and archived, particularly when interviews and direct observation by the researcher are the
main sources of data

The case study is linked to a theoretical framework.

3.2.6 Categories of Case Study


Though there are several types of case studies, the prominent ones are explained below:

3.2.6.1 Explorative case studies


These are intended to investigate information which serves as a point of interest to the
researcher. Owing to the relative novelty of the subject matter in this category of case study,
prior fieldwork and small-scale data collection need to be conducted before the research
questions and hypotheses are proposed, to help prepare the framework of the study. An example of
an explorative study is a pilot study.

3.2.6.2 Descriptive case studies


Descriptive case studies set out to describe the natural phenomena which occur within the
data in question. The aim of the researcher is to describe or narrate the data in their original state.
The challenge of a descriptive case study is that the researcher must begin with a descriptive
theory to support the description of the phenomenon or story. If this is not done, the description
is likely to lack rigour and problems may arise during the project.
3.2.6.3 Explanatory case studies
Explanatory case studies examine the data at both a surface and a deep level in order to explain the
phenomena in the data. On the foundation of the data, the researcher then forms a theory and sets
out to test it. Furthermore, explanatory cases are also deployed for causal studies where
pattern-matching can be used to investigate certain phenomena in very complex and multivariate cases.

The complex and multivariate cases can be explained by three rival theories: a knowledge-driven
theory, a problem-solving theory, and a social-interaction theory.

The knowledge-driven theory stipulates that eventual commercial products are the results of
ideas and discoveries from basic research. A similar notion applies to the problem-solving
theory; in this theory, however, products are derived from external sources rather than from
research. The social-interaction theory, on the other hand, suggests that overlapping professional
networks cause researchers and users to communicate frequently with each other.

3.2.6.4 Interpretative and Evaluative case studies


Through interpretative case studies, the researcher aims to interpret the data by developing
conceptual categories, supporting or challenging the assumptions made regarding them. In
evaluative case studies, the researcher goes further by adding their own judgment to the phenomena
found in the data.

3.2.7 Techniques for selection of cases in Case Study Research


Case selection in case study research has the same objectives as random sampling. In case
selection, a researcher desires a representative sample which has useful variation on the
dimensions of theoretical interest. One's choice of cases is therefore driven by the way a case is
situated along these dimensions within the population of interest, that is, by how the case fits into
the theoretically specified population. The following steps are useful for the selection of cases:

Cases should be selected in the same way as the topic of an experiment is selected;

A preliminary theory is developed and used as a template with which to compare the
characteristics and empirical findings from the case(s); and

Selected cases should reflect characteristics and problems identified in the underlying
theoretical propositions/conceptual framework.

SESSION 3.3 OTHER TYPES OF RESEARCH DESIGN


3.3.1 Observational Research

An observational study involves observing a phenomenon. For example, instead of asking how the
Black Stars are likely to perform in the World Cup in Germany, you may observe them playing
prior to the trip to Germany. Observational research is also guided by clearly defined hypotheses
or objectives to make the research objective. The observations should be systematic rather than
opportunistic and disorderly.
3.3.1.1 Purpose of Observational Research

It is used to collect objective information. The information is said to be objective because
the researcher observes the behaviour rather than depending on self-report as the
basic source of the information.

This method avoids the limitations associated with survey research.

3.3.1.2 Steps in carrying out Observational Research

Selection and definition of the problem

Sample selection

Define observational variables (this is an important step in the research and what is
observed is determined by hypothesis and objectives)

Record observational information (there are four ways of doing this: duration recording,
frequency count recording, interval recording and continuous observation)

3.3.1.3 Types of Observational Research

Non-participant observation

Naturalistic observation

Simulation observation

Participant observation

Participatory Rural Appraisal/Action

3.3.1.4 Limitations of Observational Research

There is a high tendency to infringe on participants' rights by observing people without
their knowledge and recording conversations with concealed recording devices.


There is a problem of the impact of the observer's participation on the situation and the
subjects.

It could be very biased.

3.3.2 Ethnographic Research (Ethnography)


This is a method that involves very intensive data collection. Data on many variables are
collected over an extended period of time in a natural setting. The use of this method is based on
the belief that behaviour is greatly influenced by the environment in which it occurs.
Ethnographers do not study individuals outside the context in which they function. The key
characteristic of ethnographic research is that the researcher (now the observer) goes through a
continuous process of observation, trying to record everything that occurs in the area being
studied and making very lengthy notes of what is observed.

3.3.2.1 Steps in carrying out ethnographic studies

The ethnographer uses a variety of data collection strategies in conjunction with
observation. This involves non-participant observation, participant observation or both.

Like all other research, define the research problem

Determine the research hypothesis

Plan the research

Decide on appropriate setting to conduct the ethnographic research

Decide on the best level of participation

3.3.2.2 Advantages

Hypotheses or theories developed are grounded firmly in observational data gathered in a
naturalistic setting.

It provides a very vivid (lifelike) picture of the environment being studied.

The long period of study required in ethnographic research gives the research a
longitudinal perspective that cannot be achieved in many other types of research.

3.3.2.3 Disadvantages


Ethnographic research requires the skills of someone trained in observational techniques
to make results valid.

The outcome of the field data can easily be influenced by the observer's bias.

Since the field reports are usually long handwritten notes, such field records are usually
difficult to quantify and interpret.

Ethnographic research goes on for a long period of time, and a lot of time is also devoted
to understanding the environment where the study will be carried out long before the
study takes place. Both make it very expensive.

The observer is forced to become an active participant in the society/environment being
studied, which could lead to role conflicts (e.g. one can easily forget the role he/she is
expected to play and disclose his/her real self) and this could reduce the validity of the data
being collected.

It requires an observer who is alert and a fast writer who can also write clearly.

3.3.3 Historical Research


Moore (1988) defines historical research as the study of a problem that requires collecting
information from the past. This type of research involves understanding, studying and
experiencing past events. Historical research studies do not involve the use of instruments to
gather data from individuals as in survey research, but make use of existing data. Thus it is up to
the researcher to determine whether the data adequately cover the events in which he or she is
interested.
Historical research is also defined as the discovery and analysis of records of previous events,
interpretation of trends in the attitudes or events of the past and generalizations from these past
events to help guide present or future behaviour. Historical research consists of locating,
integrating and evaluating evidence from physical relics, written records or documents in order
to establish facts or generalizations regarding past or present events, human characteristics or
other problems in question (Compton and Hall, 1972). The historical researcher is interested in
understanding and analyzing the past. The search for evidence or facts is always guided by a
broad theory or interpretation relevant to the researcher's interest, and therefore the facts do not
speak for themselves.

Examples of historical sources of data, which could be either primary or secondary sources, include:

Official records which may include legal records, legal instruments such as contracts and
wills, court decisions, etc.

Eye witness accounts of events, which could be given orally or in written form.

Creative productions such as works of art, photographs, literature, museum pieces and
costumes.

Expressive documents, such as personal letters and life histories (from diaries,
autobiographies, etc.).

3.3.3.1 Purpose of Historical Research


Historical research aims at arriving at conclusions concerning causes, effects or trends of past
occurrences that may help explain present events, which could be used to anticipate future
events. Thus historical research is useful for understanding:

Histories of specific individuals

Histories of political systems

Histories of important events of a country, e.g. wars, etc.

Historical research also attempts to interpret ideas or events that had previously seemed
unrelated. It emphasizes old data or merges old data with new historical facts that others have
discovered. Historical research is also used to reinterpret past events that have been studied.

3.3.3.2 Steps in conducting historical research

Identifying the research problem. The problem must be of historical significance, which
makes this step difficult.

Developing the research hypotheses or objectives that one wants to test.

Collecting and classifying research resource materials, determining facts by internal and external
criticism.

Organizing facts into results



Interpreting data in terms of stated hypothesis or theory. It is important to note that isolated facts
have no meaning and a mere listing of historical events is not research.

Synthesizing and presenting the research in an organized form.

3.3.3.3 Limitations of historical research

Collecting historical data involves long and tedious hours of search through piles of
records, files, documents, etc.

Establishing the validity of the data (source and content) involves a dual process of
internal and external audit/criticism, which is also time-consuming. The limitations in
this research raise a lot of ethical issues.

3.3.4 Descriptive Research


Although some people dismiss descriptive research as `mere description', good description is
fundamental to the research enterprise and it has added immeasurably to our knowledge of the
shape and nature of our society. Descriptive research encompasses much government sponsored
research including the population census, the collection of a wide range of social indicators and
economic information such as household expenditure patterns, time use studies, employment and
crime statistics and the like.
Descriptions can be concrete or abstract. A relatively concrete description might describe the
ethnic mix of a community, the changing age profile of a population or the gender mix of a
workplace. Alternatively the description might ask more abstract questions such as `Is the level
of social inequality increasing or declining?', `How secular is society?' or `How much poverty is
there in this community?' Accurate descriptions of the level of unemployment or poverty have
historically played a key role in social policy reforms (Marsh, 1982). By demonstrating the
existence of social problems, competent description can challenge accepted assumptions about
the way things are and can provoke action.

Good description provokes the `why' questions of explanatory research. If we detect greater
social polarization over the last 20 years (i.e. the rich are getting richer and the poor are getting
poorer) we are forced to ask `Why is this happening?' But before asking `why?' we must be sure
about the fact and dimensions of the phenomenon of increasing polarization. It is all very well to
develop elaborate theories as to why society might be more polarized now than in the recent past,
but if the basic premise is wrong (i.e. society is not becoming more polarized) then attempts to
explain a non-existent phenomenon are pointless.

Of course description can degenerate to mindless fact gathering or what C.W. Mills (1959) called
`abstracted empiricism'. There are plenty of examples of unfocused surveys and case studies that
report trivial information and fail to provoke any `why' questions or provide any basis for
generalization. However, this is a function of inconsequential descriptions rather than an
indictment of descriptive research itself.
3.3.5 Explanatory Research and Research on Causality
Explanatory research focuses on `why' questions. For example, it is one thing to describe the
crime rate in a country, to examine trends over time or to compare the rates in different
countries. It is quite a different thing to develop explanations about why the crime rate is as high
as it is, why some types of crime are increasing or why the rate is higher in some countries than in
others.
The way in which researchers develop research designs is fundamentally affected by whether the
research question is descriptive or explanatory. It affects what information is collected. For
example, if we want to explain why some people are more likely to be apprehended and
convicted of crimes we need to have hunches about why this is so. We may have many possibly
incompatible hunches and will need to collect information that enables us to see which hunches
work best empirically. Answering the `why' questions involves developing causal explanations.

Causal explanations argue that phenomenon Y (e.g. income level) is affected by factor X (e.g.
gender). Some causal explanations will be simple while others will be more complex. For
example, we might argue that there is a direct effect of gender on income (i.e. simple gender
discrimination). People often confuse correlation with causation. Simply because one event
follows another, or two factors co-vary, does not mean that one causes the other. The link
between two events may be coincidental rather than causal.

There is a correlation between the number of fire engines at a fire and the amount of damage
caused by the fire (the more fire engines the more damage). Is it therefore reasonable to conclude
that the number of fire engines causes the amount of damage? Clearly the number of fire engines
and the amount of damage will both be due to some third factor - such as the seriousness of the
fire.
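
The fire-engine example can be sketched as a small simulation. In the Python sketch below, all numbers are invented: the seriousness of each fire drives both the number of engines sent and the damage, and neither affects the other. The helper names (`pearson`, `residuals`) and the noise levels are assumptions made only for illustration. The raw correlation between engines and damage comes out strongly positive, while the correlation that remains after removing the effect of seriousness from both variables is close to zero.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """Residuals of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
severity = [rng.uniform(0, 10) for _ in range(2000)]  # the third factor
engines = [s + rng.gauss(0, 1) for s in severity]     # depends only on severity
damage = [s + rng.gauss(0, 1) for s in severity]      # depends only on severity

raw = pearson(engines, damage)
controlled = pearson(residuals(engines, severity), residuals(damage, severity))
print(round(raw, 2), round(controlled, 2))
```
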

Confusing causation with correlation also confuses prediction with causation and prediction with
explanation. Where two events or characteristics are correlated we can predict one from the
other. Knowing the type of school attended improves our capacity to predict academic
achievement. But this does not mean that the school type affects academic achievement.
Predicting performance on the basis of school type does not tell us why private school students
do better. Good prediction does not depend on causal relationships. Nor does the ability to
predict accurately demonstrate anything about causality.

3.3.6 Comparative Research Design


This design entails studying two or more contrasting cases using more or less identical methods. It
embodies the logic of comparison in that it implies that the researcher can understand social
phenomena better when they are compared in relation to two or more meaningfully contrasting
cases or situations. The key to the comparative design is its ability to allow the distinguishing
characteristics of two or more cases to act as a springboard for theoretical reflections about
contrasting findings.

3.3.7 Longitudinal Design


This form of design represents a distinct form of research design because of the time and cost
involved. It is a relatively little-used design in social research. A longitudinal research design is a
design in which data are collected on a sample (of people, documents, etc.) on at least two
occasions.


Two types of Longitudinal Design


The Panel Study
With this type, a sample, often randomly selected, is the focus of data collection on at least two
(and often more) occasions. The data may be collected from different types of cases within a
panel study framework: people, households, organizations, schools, etc.

The Cohort Study:


The study selects an entire cohort of people, or a randomly selected sample of them, as the focus
of data collection. The cohort is made up of people who share a certain characteristic, such as
all being born in the same week, or a certain experience, such as being unemployed or
getting married on a certain day in the same week.

The Panel and Cohort studies share similar features:

They share a similar design structure i.e. the data are collected in at least two waves on
the same variable on the same people.

They are both concerned with illuminating social change and improving the
understanding of causal influence over time; that is, longitudinal designs are
somewhat better able to deal with the problem of ambiguity about the direction
of causal influence.

3.3.8 Experimental Design


An experimental research design is one that rules out alternative explanations of the findings
deriving from it (i.e. it possesses internal validity) by having at least:

an experimental group, which is exposed to treatment, and a control group, which is not
exposed to treatment, and

random assignment of subjects to the groups.
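
Random assignment can be sketched in a few lines of Python. The example below is only an illustration: the participant codes, the function name `randomly_assign` and the equal split into two groups are assumptions, and a real study might use blocked or stratified assignment instead.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two groups."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]  # (experimental, control)


# Hypothetical participant codes P01 ... P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = randomly_assign(participants, seed=42)
print(len(experimental), len(control))
```

Because chance alone decides who receives the treatment, systematic differences between the two groups before the treatment are unlikely, which is what supports the causal inference.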

3.3.8.1 Advantages

Experiments enable researchers to exert a great deal of control over extrinsic and intrinsic
variables, strengthening the validity of causal inferences (internal validity).

Experiments enable researchers to control the introduction of the independent variable so
they may determine the direction of causation.

3.3.8.2 Disadvantages

External validity is often weak because an experimental design rarely allows researchers to
replicate real-life social situations.

Researchers must often rely on volunteer or self-selected subjects for their samples.
Therefore the sample may not be representative of the population of interest, preventing
researchers from generalizing to the population and limiting the scope of their findings.

A true experiment is often used as a yardstick against which non-experimental research is
assessed. Experimental research is frequently held up as a touchstone because it engenders
considerable confidence in the robustness and trustworthiness of causal findings. That is, a true
experiment tends to be very strong in terms of internal validity.

SESSION 3.4 RESEARCH ETHICS


3.4.1 Why Research Ethics
Concern for the ethics of conducting social science research has grown over the years. It has to do
with the rights and welfare of those being researched as well as the obligations of the researcher. The
purpose of research, as we have been saying, is to contribute to knowledge. Unfortunately,
carrying out the research can violate the rights and welfare of those being researched, and
ethical codes have been developed to protect the interests of these people. Each of the stages in
the research process has some ethical implications.
3.4.2 Balancing Costs and Benefits in Research
Basically, social scientists are confronted with two competing rights: the right of the researcher to
conduct research in search of new knowledge and the right of the person providing the information.
Not conducting the research for fear of infringing on the rights of the research participants would be
unfair to the researcher, since it blocks the chance of gaining new knowledge.
Conducting research that abuses the rights of the individual being researched would also be
unfair. This may be the case in research that employs deception because it provides methodological
and practical advantages. Social scientists therefore often find themselves in an
ethical dilemma.
There are no absolute answers to the above conflict but it is important to be aware of it and to
guard against it as much as possible, or be able to manage it. The values people attach to the benefits
or costs of conducting research are based on many factors including background, culture,
experience, convictions, etc. Some of the costs that the researcher may impose on the researched
are affronts to the dignity of the individual, embarrassment, loss of trust in social relations, loss of
self-esteem or self-confidence, etc. For the researcher the gains could be developing more theory
about the hidden agenda of people, potential advances of applied knowledge, etc. For the
researched, the gains could be the monetary benefits, satisfaction in contributing to knowledge,
etc. All ethical decisions have to be made individually.
3.4.3 Informed Consent
It is important to inform the people to be researched about the research ahead of time and to seek
their consent. This is especially important where those to be researched are exposed to risks of any
kind (for example, when the research has to do with drugs, theft, sexuality, etc.). It is also important to
know that providing responses to a researcher's questions is voluntary. In other words, one
should not force responses from respondents. A researcher who is unable to convince a person
to provide a response should move on to another person.

3.4.4 Competence
It is important to know that it is not everyone who is competent enough to provide informed
responses to questions posed by the researcher. It is often assumed that adults are capable of
providing response of any kind while children are not. This could be true or untrue depending on
the research topic. In some cases children may be more competent in providing responses than
adults and vice versa. Ethically, competence must be taken into account in deciding on the
respondents. The freedom to decide whether or not to participate in a research study is left to those to
be researched, so on ethical grounds participation is considered voluntary.
3.4.5 Privacy
Privacy as an ethical issue in research needs safeguarding. It is viewed from three angles:

a) Sensitivity of information
Sensitivity of information refers to how personal or potentially threatening the information is that
the researcher is interested in. The greater the sensitivity of the information, the more the
researcher needs to provide privacy to the respondent. People are often sensitive about issues
related to religion, income, sexual practices, racism and personal attributes such as honesty,
intelligence, etc.

b) Settings being observed


The setting could vary from the private (e.g. the home) to the public place. The extent to which
research in either kind of setting intrudes on people's privacy is not always clear, which could lead to
an ethical issue. An example is trying to interview a homosexual person in a public drinking place.

c) Dissemination of the information


It should not be easy to match information with the people who provided it. Being able to do so
would mean not protecting the privacy of those who provided the information. This is easily
achieved by not putting names on the questionnaires or other research instruments used.

d) Anonymity and Confidentiality


This is similar to the information on dissemination of information under privacy. Here
researchers avoid collecting information with the identity of the one providing the data. A quick
way of ensuring anonymity is to collect information without the names of the respondents and
other identities. It is easy to maintain anonymity through a mail survey. Where the identity is
provided, the researcher can ensure anonymity by separating the other data from the identity of
the one who provided it during the data entry.
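
Separating identities from responses at data entry can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the field names (`name`, `phone`, `respondent_code`), the sample records and the function `separate_identity` are all hypothetical. Each record is split into an identity entry and a data row that are linked only by a random code, so that the data table can be analyzed without names while the identity table is stored separately and securely.

```python
import secrets


def separate_identity(records, identity_fields=("name", "phone")):
    """Split records into an identity table and a data table without identities."""
    identity_table = {}
    data_table = []
    for record in records:
        code = secrets.token_hex(4)  # random code, not derived from the identity
        while code in identity_table:  # regenerate in the rare case of a clash
            code = secrets.token_hex(4)
        identity_table[code] = {
            field: record[field] for field in identity_fields if field in record
        }
        data_row = {k: v for k, v in record.items() if k not in identity_fields}
        data_row["respondent_code"] = code
        data_table.append(data_row)
    return identity_table, data_table


# Hypothetical records as they might arrive from the field.
records = [
    {"name": "Respondent A", "phone": "000-0001", "age": 34, "uses_lpg": "Yes"},
    {"name": "Respondent B", "phone": "000-0002", "age": 51, "uses_lpg": "No"},
]
identity_table, data_table = separate_identity(records)
print(data_table)
```
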

It is common to find that those being researched are told that any information they provide will
be taken as confidential. This is often written in the introduction letter that goes with the
questionnaire. It is also true that sometimes the researchers are unable to keep their promises due
to a number of factors. It could happen that the information provided is unique and therefore
stands out among the others. Such information could be used as an example and used to make a
case. In such a situation, the confidentiality promise will be broken. Thus it is important to explain to
those being researched what exactly is meant by confidentiality and its limits.

Learning Track Activities

Unit Summary

Social science research is always limited by the unpredictability of human behaviour.
For this reason, it has to be approached in a systematic manner devoid of biases. Depending
on the nature of the problem, several approaches can be used to address the research
problem. What is imperative is for the researcher to clarify the purpose of the research and
examine the suitability of a chosen research approach in addressing the research questions.
Because researchers use samples and then generalise the findings to the whole
population, it is imperative that the units in the sample reflect the nuances in the target population.

Another significant factor worth considering is the need to observe research ethics,
which include, but are not limited to, informed consent, competence and privacy. The
quality of the data gathered depends primarily on the nature of the instruments used. Hence,
the instruments are to be designed to gather the required data from respondents. The
questions should be unambiguous and free of technicalities and jargon so that both the
enumerator and the respondent understand them and the required data can be gathered.

Key terms/ New Words in Unit

Interviewer. The person who is collecting data by conducting interviews.

Respondent. The person who is answering the questionnaire.

Researcher. The person who is analyzing the data collected.

Sample. The list of people who will be interviewed.

Survey. An instrument designed to gather information from a specific group of people
(employees, customers, all people in a province or country, women, children, etc.).

Questionnaire. A set of questions designed for a specific purpose (evaluation, polling,
market research, etc.). Can be either printed on paper, or programmed into a
computerized interviewing system.

Closed-end. A type of question that allows only for specific responses (Yes or No, etc.).
The interviewer circles the response on the questionnaire.

Open-end. A type of question that allows the respondent to give any answer they wish.
The interviewer writes in, verbatim, the response.

Probe. Asking for more responses.

Precodes. A list of possible responses to a question. The instructions on the questionnaire
will inform the interviewer whether they should read the list or not.

Response rate: Response rate refers to the percentage of subjects that respond to the
questionnaires. A response rate of 70% or more is considered as very good.

Non-Respondents: This refers to those who do not respond to the questionnaire.
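
The response rate defined above is a simple percentage. The following Python sketch shows the arithmetic; the function name and the figures (200 questionnaires distributed, 140 returned) are hypothetical and are used only for illustration.

```python
def response_rate(returned, distributed):
    """Percentage of distributed questionnaires that were returned."""
    if distributed <= 0:
        raise ValueError("distributed must be positive")
    return 100 * returned / distributed


# Hypothetical survey: 200 questionnaires sent out, 140 returned.
rate = response_rate(returned=140, distributed=200)
non_respondents = 200 - 140
print(rate, non_respondents)  # 70.0 60
```

In this illustration the response rate just reaches the 70% level described above, and the 60 non-respondents are the subjects whose views are not represented in the data.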


Unit Assignment
The Government of your country wants to curtail the impact of a hydroelectric
power dam it is about to construct on the livelihood sources of people within the
proposed dam's catchment area. The official record of the National Statistical
Service indicates that about 15 000 people are to be affected through inundation.

As a research fellow, you have been asked to carry out a preliminary assessment for
the implementers to evaluate the effects of their interventions and plan appropriately
to curtail the effects.

Use the narrative to answer the following questions:

Budget and time constraints inhibit you from undertaking a census. How
many people will you interview?

How will you select your units to reflect the nuances in the target
population?

What ethics will you apply to ensure that the survey process is not
compromised?

What types of data do you envisage and how will they be gathered?


Unit 4

STATISTICAL ANALYSIS WITH STATA AND SPSS


Introduction
Statistical analysis software packages such as SPSS and STATA provide a
comprehensive set of tools for producing tables and graphs (line plots, scatter plots, bar charts,
pie charts, dot charts) and for performing statistical procedures such as regression analysis,
multivariate analysis, time series analysis and survival analysis.

Learning Objectives
After reading this unit you should be able to:
8. Use SPSS to do various forms of statistical analysis.
9. Perform various forms of statistical analysis with the STATA software.

Unit content
SESSION 4.1: INTRODUCTION TO SPSS
4.1.1 The Nature of SPSS
4.1.2 Data Management in SPSS
4.1.3 Descriptive Statistics in SPSS
SESSION 4.2: INTRODUCTION TO STATA
4.2.1 The Nature of STATA
4.2.2 Data Management in STATA
4.2.3 Descriptive Statistics in STATA

SESSION 4.1: INTRODUCTION TO SPSS


4.1.1 The Nature of SPSS
SPSS (Statistical Package for the Social Sciences) is a statistical analysis and data management
software package. SPSS can take data from almost any type of file and use them to generate
tabulated reports, charts, plots of distributions and trends, and descriptive statistics, and to conduct
complex statistical analyses.

There are two important limitations of SPSS that deserve mention at the outset:
o SPSS users have less control over statistical output than, for example, Stata or Gauss
users. For novice users, this hardly causes a problem. But, once a researcher wants
greater control over the equations or the output, she or he will need to either choose
another package or learn techniques for working around SPSS's limitations;
o SPSS has problems with certain types of data manipulations, and it has some built-in
quirks that seem to reflect its early creation. The best known limitation is its weak lag
functions, that is, how it transforms data across cases. For new users working off of
standard data sets, this is rarely a problem. But, once a researcher begins wanting to
significantly alter data sets, he or she will have to either learn a new package or develop
greater skills at manipulating SPSS.

4.1.1.2 Getting Started with SPSS


SPSS for Windows is a versatile computer package that can perform a wide variety of statistical
procedures. When using SPSS, you will encounter several types of windows. The window with
which you are working at any given time is called the active window. There are six different
windows that can be opened when using SPSS. The following will give a description of each of them.

Data Editor Window. This window shows the contents of the current data file. A blank data
editor window, as shown in figure 4.1, automatically opens when you start SPSS for Windows;
only one data window can be opened at a time. From this window, you may create new data files
or modify existing ones.

Output Viewer Window. This window displays the results of any statistical procedures you run,
such as descriptive statistics or frequency distributions. All tables and charts are also displayed in
this window. The viewer window automatically opens when you create output. Figure 4.2 shows
an output viewer window.

Chart Editor Window. In this window, you can modify charts and plots. For instance, you can
rotate axes, change the colors of charts, select different fonts, and rotate three-dimensional
scatter plots.

Figure 4.1: SPSS Data Editor

Syntax Editor Window. You will use this window if you wish to use SPSS syntax to run
commands instead of clicking on the pull-down menus. An advantage to this method is that it
allows you to perform special features of SPSS that are not available through dialog boxes.
Syntax is also an excellent way to keep a record of your analyses.


Figure 4.2: SPSS Output Viewer Window

Pivot Table Editor. Output displayed in pivot tables can be modified in many ways with the
Pivot Table Editor. You can edit text, swap data in rows and columns, add colour, create
multidimensional tables, and selectively hide and show results.


Figure 4.3: SPSS Syntax Editor

Figure 4.4: SPSS Chart Editor

Text Output Editor. Text output not displayed in pivot tables can be modified with the Text
Output Editor. You can edit the output and change font characteristics (type, style, color, size).

4.1.1.3 The Main Menu


SPSS for Windows is a menu-driven program. Most functions are performed by selecting an
option from one of the menus. For example, to activate the File menu, either click the mouse on
File or use the keyboard with Alt-F. The main menu bar lists 11 menus:

File. This menu is used to create new files, open existing files, read files that have been created
by other software (e.g., spreadsheets or databases), and print files.

Edit. This menu is used to modify or copy text from output or syntax windows.


View. This menu allows you to change the appearance of your screen. You can, for instance,
change fonts, customize toolbars, and display data using their value labels.

Data. Use this menu to make temporary changes in SPSS data files, such as merging files,
transposing variables and cases, and selecting subsets of cases for analyses. Changes are not
permanent unless you explicitly save the changes.

Transform. The transform menu makes changes to selected variables in the data file and
computes new variables based on values of existing variables. Transformations are not
permanent unless you explicitly save the changes.

Analyze. Use this menu to select a statistical procedure to be performed such as descriptive
statistics, correlations, analysis of variance, and cross-tabulations.

Graphs. This menu is used to create bar charts, pie charts, histograms, and scatter plots. Some
procedures under the Analyze menu also generate graphs.

Utilities. This menu is used to change fonts, display information on the contents of SPSS data
files, or open an index of SPSS commands.

Window. Use the window menu to arrange, select, and control the attributes of the SPSS
windows.

Help and add-on. These menus open a Microsoft Help window containing information on how to
use many SPSS features.

4.1.1.4 Some Mathematical Expressions and Logical or Relational Operators


Some Mathematical Expressions

+ , addition

-, subtraction

/ , division

*, multiplication

**, exponentiation

abs(x) returns the absolute value of x.

exp(x) returns the exponential function of x.

int(x) returns the integer by truncating x towards zero.

ln(x), log(x) returns the natural logarithm of x if x>0.

log10(x) returns the log base 10 of x if x>0.

max(x1,...,xn) returns the maximum of x1, ..., xn.

min(x1,...,xn) returns the minimum of x1, ..., xn.

round(x) returns x rounded to the nearest whole number.

round(x,y) returns x rounded to units of y.

sign(x) returns -1 if x<0, 0 if x==0, 1 if x>0.

sqrt(x) returns the square root of x if x>=0.

Logical Operators

& and

| or

! not

not

Relational Operators

> greater than

< less than

>= greater or equal

<= smaller or equal

= equal (for conditional statements)

!= not equal
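
The behaviour of a few of the functions and operators listed above can be checked with a short sketch. Python is used here purely for illustration; the helper names truncate, round_to_units and sign are hypothetical, the example values are invented, and the exact function names available in SPSS or STATA may differ from those in the list. Note also that Python writes the logical operators as and, or and not rather than &, | and !.

```python
import math


def truncate(x):
    """int(x): drop the fractional part, moving towards zero."""
    return math.trunc(x)


def round_to_units(x, y):
    """round(x, y): round x to the nearest multiple of y."""
    return round(x / y) * y


def sign(x):
    """sign(x): -1 for negative, 0 for zero, 1 for positive."""
    return (x > 0) - (x < 0)


print(truncate(2.7), truncate(-2.7))  # 2 -2
print(round_to_units(17, 5))          # 15
print(sign(-3), sign(0), sign(4))     # -1 0 1
print(2 ** 3)                         # 8 (exponentiation)

# Combining relational and logical operators for one hypothetical case.
age, income = 34, 1200
selected = (age >= 18) and (income != 0) and not (age > 60)
print(selected)                       # True
```

Truncation towards zero means that -2.7 becomes -2, not -3, which is the main difference between int(x) and rounding down.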


4.1.2 Data Management


Data can be entered directly into SPSS, or it can be imported from a number of different sources.
The processes for reading data stored in SPSS data files, spreadsheet applications, such as
Microsoft Excel, database applications, such as Microsoft Access, and text files are all discussed
in this chapter.

4.1.2.1 Entering Your Own Data


To begin entering data in the data editor, follow these steps:
1. Click on File from the menu bar.
2. Click on New and then Data from the file pull-down menu.
3. Click on the cell in which you wish to enter data (or use the arrow keys to highlight the
cell). A darkened border will appear around the cell; this tells you that this is the cell you
have selected.
4. Type in the value you wish to appear in that cell and then press Enter.
5. Repeat this process until you have entered all of the data you wish for column 1 (values
for all cases on variable 1).
6. When you are ready to add another variable, click on the first cell in the next column
(row 1, column 2).
7. Repeat this process for all values in column 2.
8. Continue this procedure until you have entered values for all cases and variables that you
wish for your data file.

Once you have entered data in the data editor, you may change or delete values. To change or
delete a value in a cell, simply click on the cell you wish to alter. You will notice that a dark
border appears around the selected cell, and the value in the cell appears at the top of the data
editor. If you are changing the value, simply type the new value and press Enter.

Adding Cases and Variables


To insert a new case (row) between cases that already exist in your data file:

1. Point the mouse arrow and click on the row number below the row where you wish to
enter the new case. The row should be highlighted in black.
2. Click on Data on the menu bar.
3. Click on Insert Cases from the pull-down menu. A new row is now inserted and you may
begin entering data in the cells. Notice that before you enter your values, all of the cells
have system-missing values (represented by a period).

To insert a new variable (column) between existing variables:


1. Click on the column variable name that is to the right of the position where you wish to
enter a new variable. The column should be highlighted in black.
2. Click on Data on the menu bar.
3. Click on Insert Variable from the pull-down menu. A new variable (column) is now
inserted and you may begin entering data in the cells.

Deleting Cases and Variables


To delete a case:
1. Click on the case number that you wish to delete.
2. Click on Edit from the menu bar.
3. Click on Clear. The selected case will be deleted and the rows below will shift upward.

To delete a variable:
1. Click on the variable name that you wish to delete.
2. Click on Edit from the menu bar.
3. Click on Clear. The selected variable will be deleted and all variables to the right of the
deleted variable will shift to the left. Deleting variables can also be accomplished using
SPSS syntax with the Drop and Keep subcommands.

Defining Variables

By default, SPSS assigns variable names and formats to all variables in the SPSS data file:
variables are named VAR##### (prefix VAR followed by five digits) and all values are
valid (blanks are assigned system missing values). Most of the time, however, you will want to
customize your data file. For example, you may want to give your variables more meaningful
names, provide labels for specific values, change the variable formats, and assign specific values
to be regarded as missing.
To do any or all of these:
1. First, make sure that your data file window is the active window and click on the variable
name that you wish to change.
2. Click on the Variable View tab or else double-click on the variable name in the data
editor.
3. Type the name of the variable in the Name column. Variable names have to be unique,
begin with a letter, and cannot contain blank spaces.
4. If you wish to change the type or format of a variable, click the button in the Type cell to
open the Variable Type dialog box. By default, all variables are numeric, but you may
work with other types such as names, dates, and other non-numeric data.
5. Suppose you have a variable representing average cost of groceries per person that was
entered to the nearest cent (e.g., 32.24) and you want to change this format so that the
average cost is displayed as a whole number (rounded to the nearest dollar, e.g., 32). Click
in the Decimal Places box and set the number of decimal places to 0. To change the width of the
numeric variable, click in the Width box.
6. If one of your variables is categorical, you can assign numbers to represent the categories
of the variable. For example, the variable sex will have 2 categories: male and female.
Males may be assigned the value 1 and females the value 2. It is useful to have
descriptive labels assigned to the values of 1 and 2 so that it is easy to see which number
represents which category in your output files.
7. If there are specific values that you would like to be treated as missing values, click on
Missing to open the Missing Values dialog box. Click on Discrete Missing Values to tell
SPSS that you have specific values that are considered to be missing. Type the value(s) in
the boxes (you may have up to three values). If you have more than three missing values,
click on Range plus one optional discrete missing value and enter the lower and upper
bounds of the range of missing values. Click OK when you have entered all of your missing
values.
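
The ideas of value labels (step 6) and user-defined missing values (step 7) can be illustrated with a small sketch. It is written in Python for illustration only and does not show how SPSS stores this information internally; the missing-value codes 9 and 99, the data and the function describe are hypothetical.

```python
SEX_LABELS = {1: "male", 2: "female"}  # value labels, as in step 6
MISSING_CODES = {9, 99}                # hypothetical discrete missing values


def describe(codes, labels, missing_codes):
    """Count valid cases per label and the number of missing cases."""
    counts = {label: 0 for label in labels.values()}
    missing = 0
    for code in codes:
        if code is None or code in missing_codes:
            missing += 1  # blank entry or a code declared as missing
        else:
            counts[labels[code]] += 1
    return counts, missing


sex = [1, 2, 2, 9, 1, None, 2, 99]     # hypothetical data as entered
counts, missing = describe(sex, SEX_LABELS, MISSING_CODES)
print(counts, missing)  # {'male': 2, 'female': 3} 3
```

Declaring 9 and 99 as missing keeps them out of the valid counts, and the labels make the output readable without having to remember which number stands for which category.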

Reading SPSS Data Files


We will illustrate how to read an existing SPSS data file. The reader may follow along using the
data accompanying this guide.
To open a data file:
1. Click on File from the menu bar.
2. Click on Open on the file pull-down menu.
3. Click on Data on the open pull-down menu. This opens the Open File dialog box as
shown in Figure 4.5.
4. Choose the correct directory from the Look in: box at the top of the screen.
5. Point the arrow to the data file you wish to open and click on it.
6. Click on Open.

Note: Most of the examples in the following chapters use the SPSS data files that are provided
with this manual. Unless you are required to enter data on your own into a new file, all
procedures assume that you have opened the SPSS data file before beginning any computations
or analyses.

Reading Data Files in Text and Other Formats


To read a text data file, begin at the main menu bar in the Data Editor window:
1. Click on File.
2. Click on Read Text Data.
3. Select the appropriate file from the Open file dialog box and click Open.
4. Follow the steps in the Text Import Wizard to read the data file. You will have to answer
questions about type of data, arrangement of data, number of cases to import, and missing
values. Use the Help button of the Text Import Wizard for more detailed information.

To open data from a file such as an Excel spreadsheet, begin at the Data Editor window:
1. Click on File.
2. Click on Open and then click on Data.

3. Select the file format from the drop-down list of file types in the Files of type: box.
4. Choose the appropriate directory and file.
5. Click on Open.

Figure 4.5: Open File Dialog Box

Saving Data Files


Unless you save your files, all of your data and changes will be lost when you leave the SPSS
session. To save a file, first make the Data Editor the active window. Then:
1. Click on File from the menu.
2. Select Save from the list of options in the File pull-down menu.
3. Select the appropriate directory in the Save in: box. Type the name of your file in the
File name box. Notice that the default file type is set for SPSS format as indicated by the
.sav extension.
4. Click on Save.

By default, this will save the data file as an SPSS data file. If you were working with a
previously existing data file, the old file will be overwritten by the modified data file. To save
the file with a different name, select Save As from the File pull-down menu.
If you wish to save the data file in a format other than SPSS (e.g., Lotus, Excel, dBASE, fixed-format ASCII text):
1. Click on File from the menu.
2. Select Save As from the list of options in the File pull-down menu.
3. Select the appropriate directory in the Save in: box. Type the name of your file in the File
name box.
4. Choose the appropriate file type in the Save as type: box.
5. Click on Save.

4.1.2.2 Transforming Variables and Data Files


At times, you may need to alter or transform the data in your data file to allow you to perform
the calculations you require. There are many ways in which you can transform data. This section
discusses three commonly used techniques: computing new variables, recoding variables, and
selecting subsets of cases.

Computing New Variables


There may be occasions when you need to compute new variables that combine or alter existing
variables in your data file. For instance, your data file may contain daytime and nighttime
sleeping hours for a sample of infants, but you are interested in examining total sleep hours (i.e.,
the sum of the separate daytime and nighttime hours).

To create a new variable:


1. Click on Transform from the menu bar.
2. Click on Compute from the pull-down menu. This opens the Compute Variable dialog
box (see Fig.4.6).


3. Enter the name of the new variable (in this illustration, total) in the Target Variable
box. (You also have the option to describe the nature and format of the new variable by
clicking on the Type & Label box.)
4. You will then need to perform a series of steps to construct an expression used to
compute your new variable. In this illustration, you would first select the daytime
variable (daysleep) from the variable list box on the left-hand side of the dialog box and
move it to the Numeric Expression box using the right directional arrow.
5. Then click on the + from the calculator pad. You will notice that a plus sign is placed
in the Numeric Expression box after the variable name daysleep.
6. Complete the expression by selecting the nighttime variable (nightsleep) and moving it
to the Numeric Expression box, following the instructions in step (4) above.
7. When you have completed the expression, click on OK to close the Compute Variable
dialog box. Your new variable will be added to the end of your data file.

In addition to the simple arithmetic operators on the calculator pad (+, -, *, /), there are many other
arithmetic functions such as absolute value, truncate, round, square root, and statistical functions
including sum, mean, minimum, and maximum. These are displayed in the Function group box
to the right of the calculator pad. First, select a procedure in the Function group window, and
then select the specific function in the Functions and Specific Variables window.
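
What the Compute Variable procedure does to each case can be sketched as follows. The sketch is in Python for illustration only; the variable names daysleep, nightsleep and total come from the example above, while the data values are hypothetical. Leaving the total missing whenever one of its inputs is missing is an assumption of this sketch and should be checked against the package's own behaviour.

```python
# Hypothetical sleep records (hours) for three infants.
infants = [
    {"daysleep": 4.5, "nightsleep": 9.0},
    {"daysleep": 3.0, "nightsleep": 10.5},
    {"daysleep": None, "nightsleep": 8.0},  # daytime value missing
]

for case in infants:
    day, night = case["daysleep"], case["nightsleep"]
    # The new variable is the sum of the two existing variables; it is left
    # missing when either input is missing.
    case["total"] = None if day is None or night is None else day + night

print([case["total"] for case in infants])  # [13.5, 13.5, None]
```

The new variable is added to every case in one pass, which is why it appears as a complete new column at the end of the data file.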

4.1.2.3 Using SPSS Syntax


As illustrated throughout this book, most SPSS procedures are conducted using the pull-down
menus because they are convenient and easy to use. However, an alternative way to run SPSS
procedures is through command syntax. SPSS commands are the instructions that you give the
program for conducting procedures.

SPSS syntax commands are typed into a command file using the SPSS syntax editor. Syntax files
have the extension .sps. There are several reasons why command syntax is useful, such as
when the user wants to: (1) have a record of the analyses conducted during a session; (2) repeat
long and complex analyses; (3) review how variables were created or transformed; and (4)
modify commands to run slightly different or customized statistics.

When working with syntax, the user must enter commands instructing the program what
procedures to conduct. You can enter syntax by either typing or pasting syntax into the syntax
editor. Because most users do not know the commands from memory, it is useful to refer to the
SPSS Syntax Reference Guide for a complete reference to the command syntax. Help is also
available by using the Help button on the toolbar in the syntax editor window. Pasting syntax
commands from dialog boxes is perhaps the easiest way to construct syntax commands. Rather
than typing the commands, you initiate a procedure using pull-down menus and then instruct
SPSS to provide the commands and paste them into the syntax editor.

To open a new window and begin typing commands:


1. Click on File from the main menu.
2. Click on New from the pull-down menu.
3. Click on Syntax to open the SPSS syntax editor (see Fig. 4.7).
4. Begin typing syntax into the editor.


Figure 4.6: Compute Variable Dialog Box

For example, suppose you want to open the sleep.sav data file, but you only want to read a subset
of variables: body weight, total sleep, and danger index.
The syntax command would be:
GET FILE = 'sleep.sav'
  /KEEP = BODYWT TOTSLEEP DANGER.

Figure 4.7: SPSS Syntax Editor

You can also run a procedure by pasting syntax from a dialog box. When you use the paste
button, SPSS creates the syntax commands to execute procedures requested from pull-down
menus. For example, to compute a new variable (total sleep hours) as shown in Session 4.1.2.2,
follow steps 1 to 6. Instead of clicking on OK, click on the Paste button. The compute commands
will automatically be displayed in a syntax window. To run the syntax commands, click the
Right arrow button on the toolbar.


Once you have created a syntax file, you can save it using the same procedures described in
Session 4.1.2.1 of this chapter. The file can then be opened and edited for future modifications.
Make sure when you open, edit, and save a syntax file that you correctly identify it with the
.sps file type.

4.1.3 Descriptive Statistics


A statistical data set consists of a collection of values on one or more variables. The variables
can be either numerical or categorical. Numerical variables are further classified as discrete or
continuous. These distinctions determine the statistical approaches that are appropriate for
summarizing the data. Examples of data include

crime rates for large cities across Africa;

body temperatures for a randomly chosen sample of adults;

The basic features of any data can be presented in the form of:
Graphical displays
Tabular descriptions
Summary statistics
Linear regressions

4.1.3.1 Tabular description


One approach to organizing data is by using tables. The type of table you use depends in part on
the way the data are measured: in categories (e.g., occupations) or on a numerical scale (e.g.,
number of errors). This chapter demonstrates how to examine different types of data through
frequency distributions.

Summarizing Categorical Data



Categorical variables are those that have qualitatively distinct categories as values. For example,
gender is a categorical variable with categories male and female.

Frequencies
One way to display data is in a frequency distribution, which lists the values of a variable (e.g.,
for the variable region: Western, Central, Volta, etc.) and the corresponding numbers and
percentages of people for each value. Let us begin by creating a simple frequency distribution of
Regions using the sec7.sav SPSS data file from the GLSS5 accompanying this manual. Follow
along by using SPSS to open the data file on your computer (using the procedure given in
Chapter 2). This data set was used in a study of the Housing Characteristics in Ghana.
Notice that the data view lists numbers as the values for all of the variables, even though the
variable is a categorical variable. To see the categories each of the values represents, you can
examine the contents of the data file (variable labels, variable type, and value labels) by clicking
on Utilities on the menu bar and clicking on Variables from the pull-down menu.

To create a frequency distribution of the region variable:


1. Click on Analyze from the menu bar.
2. Click on Descriptive Statistics from the pull-down menu.
3. Click on Frequencies from the second pull-down menu to open the Frequencies dialog box.
4. Click on the label/name of the variable you wish to examine (region) in the left-hand
box.
5. Click on the right arrow button to move the variable name into the Variable(s) box.
6. Click on OK.

The frequency distribution produced by SPSS is shown in Figure 4.8. This figure shows the
content of the output, which is in the right-hand frame of your Output Viewer. The
Statistics table in the output indicates the number of valid and missing values for this variable.
There are 8687 valid cases and no missing values. The Region table displays the frequency
distribution.


For example, there are 834 people in the Western region and 1257 people in the Greater Accra
region. The numbers in the Percent column represent the percentage of the total number of
cases that are in each region. These are obtained by dividing each frequency by the total number
of cases and multiplying by 100. For example, 18.1% of the people are in the Ashanti region.

Statistics
Region
N    Valid      8687
     Missing       0

Region
                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Western               834       9.6             9.6                  9.6
        central               689       7.9             7.9                 17.5
        greater accra        1257      14.5            14.5                 32.0
        volta                 720       8.3             8.3                 40.3
        eastern               914      10.5            10.5                 50.8
        ashanti              1574      18.1            18.1                 68.9
        brong ahafo           795       9.2             9.2                 78.1
        northern              795       9.2             9.2                 87.2
        upper east            600       6.9             6.9                 94.1
        upper west            509       5.9             5.9                100.0
        Total                8687     100.0           100.0

Figure 4.8: Frequency Distribution of number of people in the various regions of Ghana

The Valid Percent column takes into account missing values. In this case, there are no missing
values, so the Percent and Valid Percent columns are the same. The Cumulative Percent is
a cumulative percentage of the cases for the category and all categories listed before it in the
table.
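
The arithmetic behind the Percent and Cumulative Percent columns can be reproduced from the frequencies alone. The Python sketch below is for illustration only; it uses the counts shown in Figure 4.8 and is not the way SPSS itself produces the table.

```python
# Frequencies taken from Figure 4.8, in the order shown in the table.
counts = {
    "Western": 834, "Central": 689, "Greater Accra": 1257, "Volta": 720,
    "Eastern": 914, "Ashanti": 1574, "Brong Ahafo": 795, "Northern": 795,
    "Upper East": 600, "Upper West": 509,
}

total = sum(counts.values())
table = []
running = 0
for region, frequency in counts.items():
    running += frequency  # cases in this category and all those before it
    percent = round(100 * frequency / total, 1)
    cumulative = round(100 * running / total, 1)
    table.append((region, frequency, percent, cumulative))

for row in table:
    print(row)
```

Each cumulative percentage is computed from the running count rather than by adding up the rounded percentages, which avoids accumulating rounding error down the table.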

Worked example 2

Draw a table showing the variation in cooking fuel in the urban areas of the Greater Accra
Region. (Use the data in the file sec7.sav, from the GLSS5 accompanying this manual).
Solution:
1. Click on Data to open the Data pull-down menu.
2. Click on Select Cases to open the Select Cases dialog box.
3. Click on If condition is satisfied, then click on the If button.
4. Type the condition region=3 & loc=1, and click OK.
5. Click on Analyze from the menu bar.
6. Click on Descriptive Statistics from the pull-down menu.
7. Click on Frequencies from the second pull-down menu to open the Frequencies dialog box.
8. Click on the label/name of the variable you wish to examine (Main fuel used for
cooking) in the left-hand box.
9. Click on the right arrow button to move the variable name into the Variable(s) box.
10. Click on OK.
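
The logic of this worked example (select the cases that satisfy a condition, then tabulate one variable) can be sketched as follows. The sketch is in Python for illustration only; the five household records are hypothetical, and the codes region = 3 and loc = 1 are taken from the condition in step 4, on the assumption that they stand for Greater Accra and urban areas respectively.

```python
from collections import Counter

# Hypothetical household records; real data would come from sec7.sav.
households = [
    {"region": 3, "loc": 1, "fuel": "Charcoal"},
    {"region": 3, "loc": 1, "fuel": "Gas"},
    {"region": 3, "loc": 2, "fuel": "Wood"},
    {"region": 6, "loc": 1, "fuel": "Charcoal"},
    {"region": 3, "loc": 1, "fuel": "Charcoal"},
]

# Select Cases: keep only cases where both conditions hold.
selected = [h for h in households if h["region"] == 3 and h["loc"] == 1]

# Frequencies: count the cooking fuel among the selected cases.
fuel_counts = Counter(h["fuel"] for h in selected)
print(len(selected), dict(fuel_counts))  # 3 {'Charcoal': 2, 'Gas': 1}
```

Cases that fail either part of the condition are left out of the table entirely, which is why the total in the output is much smaller than the full data set.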

Figure 4.9 shows the contents of the output.


Main fuel used for cooking
                           Frequency   Percent   Valid Percent   Cumulative Percent
Valid   None, No Cooking           4       1.4             1.4                  1.4
        Wood                      61      20.7            20.7                 22.1
        Charcoal                 185      62.9            62.9                 85.0
        Gas                       39      13.3            13.3                 98.3
        Electricity                1        .3              .3                 98.6
        Kerosene                   4       1.4             1.4                100.0
        Total                    294     100.0           100.0

Figure 4.9: Frequency Distribution of Main fuel used for cooking in Ghana

4.1.3.2 Graphical displays


One approach to organizing data is through a chart or graph. The type of chart you use depends
in part on the way the data are measured: in categories (e.g., occupations) or on a numerical
scale (e.g., number of errors). This chapter demonstrates how to examine different types of data
through graphical representations.

Figure 4.10: Frequencies Charts Dialog Box


4.1.3.3 Bar Charts, Pie Charts, Histogram and Line Graphs

These charts are useful for examining categorical data. In a bar chart and histogram, the height of
each bar represents the frequency of occurrence for each category of the variable. Let us create a
bar chart for the region data using an option within the Frequencies procedure. From the
Frequencies dialog box (see steps 1 to 3 of the Frequencies section):
1. Click on Charts to open the Frequencies Charts dialog box (see Fig. 4.10).
2. Click on Bar charts in the Chart Type box.
3. Choose the type of values you want to chart (frequencies or percentages) in the Chart
Values box. For this example, we have selected frequencies.
4. Click on Continue.
5. Click on OK to run the chart procedure.

A bar chart like that in Figure 4.11 should appear in your SPSS Viewer. The information
displayed in this chart is a graphical version of that shown in the frequency distribution in Figure
4.8. The region with the greatest number of people is the Ashanti region.

Worked example 1
Draw bar graphs to show the Rural Urban correlation for the various Regions. (Use the data in
the file sec7.sav, from the GLSS5 accompanying this manual).
Solution:
1. Click on Graphs to open the Graphs pull-down menu.
2. Click on Bar charts to open the Bar Charts dialog box.
3. Click on Clustered, select Summaries for groups of cases, and click on Define.
4. Select % of cases, move the region variable to the Category Axis box and the rural/urban
variable to the Define Clusters by box.
5. Click on OK to run the chart procedure.

A bar chart like that in Figure 4.12 should appear in your SPSS Viewer.

[Bar chart: Frequency (vertical axis, 0 to 1,500) for each Region (horizontal axis): Western, Central, Greater Accra, Volta, Eastern, Ashanti, Brong Ahafo, Northern, Upper East, Upper West]

Figure 4.11: Bar chart of number of people in the various regions of Ghana

4.1.3.4 Summary statistics



Summarizing Numerical Data


There are two types of numerical variables: discrete and continuous. The values of discrete
variables are counting numbers. For example, an American football game is won by one, two, or
three points, not by a quantity in between. Continuous variables, on the other hand, do not have
such indivisible units. Body temperature, for instance, can be measured to the nearest degree,
half-degree, quarter-degree, and so on. For practical purposes in SPSS, there is no difference in
summarizing these two types of numerical data.

[Clustered bar chart. Horizontal axis: Region (Western, Central, Greater Accra, Volta, Eastern,
Ashanti, Brong Ahafo, Northern, Upper East, Upper West). Vertical axis: Percent, scaled from 0.0%
to 30.0%. Legend (urban/rural-corr): urban, rural.]

Figure 4.12: Bar graphs showing the Rural Urban correlation of the various regions in Ghana


4.1.3.5 Mean, Sum, Standard Deviation, Variance, Minimum Value, Maximum Value, and
Range
When generating these statistics, the Data Editor must be open with the appropriate data set
before continuing.

Worked Problem
Using the data in the file sec7.sav, determine the mean, sum, standard deviation, variance,
minimum value, maximum value, and range for s7fq6 only.
Solution
1. Repeat steps 1-2 of the Frequencies section, then select Descriptives. This will open the
Descriptives dialog box as shown in Fig. 4.13.

Figure 4.13: Descriptives Dialog Box

2. In the variable list, select the variable Area in square meters. Click on the right arrow
button between the boxes to move this variable over to the Variable(s) box. To calculate
statistics for several variables simultaneously, add them all to the Variable(s) box.
3. Click on the Options button. This will open the Descriptives: Options dialog box.
4. Click on mean, sum, standard deviation, variance, minimum value, maximum value, and range.
5. Click on the Continue button when done.

6. Click OK. The Descriptives dialog box closes and SPSS activates the Output Navigator
to display the statistics.

4.1.3.6 Measures of Central Tendency and Measures of Variability


Measures of central tendency, or location, specify the center of a set of measurements, and
measures of variability indicate how spread out the observations are, that is, how much the
values differ from individual to individual. This chapter describes ways to use SPSS to obtain
three common measures of location: the mode, the median, and the mean of a sample. How
SPSS can be used to obtain measures of variability such as the range and the standard deviation
is discussed earlier in this chapter. Measures of central tendency and variability can be
used to:

find the most common college major for a group of students;

find the midpoint of a set of ordered body weights that divides the set in half;

calculate the average gross of the top movies from a given year;

find the difference between the largest and smallest salary paid to people working at a
particular company;

determine how daily hours of sleep vary among different species of mammals;
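
For reference, the quantities named in this section can be written as follows for a sample of n observations x_1, ..., x_n. This is a standard textbook sketch; the notation is an assumption made here and does not come from the manual. The sample variance is shown with the n - 1 divisor, which is the usual convention in statistical packages.

```latex
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i
  && \text{(mean)} \\
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2
  && \text{(sample variance)} \\
s &= \sqrt{s^2}
  && \text{(standard deviation)} \\
\text{range} &= x_{\max}-x_{\min}
  && \text{(largest minus smallest value)}
\end{align*}
```

The median is the middle value of the ordered observations, and the mode is the value that occurs most often.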


Figure 4.14: Descriptives of areas in square meters for households in Ghana.

The Mode, Median and Mean


The mode, especially useful in summarizing categorical or discrete numerical variables, is the
category or value that occurs with the greatest frequency. One way to obtain the mode with SPSS
for Windows is by using the Frequencies procedure. This is the same procedure used to obtain
frequency distributions, histograms, and bar charts as discussed. To obtain the mode
of any variable:
1. Click on Analyze from the menu bar.
2. Click on Descriptive Statistics from the pull-down menu.
3. Click on Frequencies from the pull-down menu.
4. Click on the variable of interest and then the right arrow button to move the
variable into the Variable(s) box.
5. Click on the Statistics button at the bottom of the screen. This opens the Frequencies:
Statistics dialog box, as shown in Figure 4.15.
6. Click on the Mode option in the Central Tendency section.
7. Click on Continue to close this dialog box.
8. Click on OK to close the Frequencies dialog box and execute the procedure.
Note that the same method can be used to obtain the median, mean, sum, percentiles, etc.

Figure 4.15: Frequencies: Statistics Dialog Box

Figure 4.16: Descriptives: Options Dialog Box


Self Assessment 4.1

(Use the data in the file sec7.sav, from the GLSS5 accompanying this manual). Using SPSS;
4.1.1 With the aid of tables and bar charts, show how access to the different cooking fuels
varies between rural and urban areas for the Greater Accra Metropolitan Area
(GAMA).
4.1.2 Still using tables and bar charts, show how access to the different cooking fuels in
rural and urban areas for Accra compares with one other region of your choice.
4.1.3 With the aid of tables and pie charts, show the distribution of different cooking fuel
usage for all the regions of Ghana
4.1.4 Draw bar graphs to show the Rural Urban correlation for the entire sample in
percentages and actual number of cases

SESSION 4.2: INTRODUCTION TO STATA


Stata is a full-featured statistical programming language for Windows, Macintosh, Unix and
Linux. It can be considered a stat package, like SAS, SPSS, RATS, or eViews. The number of
variables is limited to 2,047 in standard Stata/IC, but can be much larger in Stata/SE or
Stata/MP. The number of observations is limited only by memory. Stata has traditionally been a
command-line-driven package that operates in a graphical (windowed) environment. It contains a
graphical user interface (GUI) for command entry.


4.2.1 The Stata Environment


When you start Stata for Windows you will see the following windows: the Command window,
where you type in your Stata commands; the Results window, where Stata results are displayed;
the Review window, where past Stata commands are displayed; and the Variables window, which
lists all the variables in the active data file, as shown in figure 4.17. The data in the active
data file can be browsed (read-only) in the Browser window, which is activated from the menu
Data/Data browser or by
browse varlist
where varlist (e.g. income age) is a list of variables to be displayed.

The Editor window, as shown in figure 4.18, allows you to edit data either by typing directly
into the editor window or by copying and pasting from spreadsheet software. It is opened with
the command
edit varlist
Stata has implemented every Stata command (except the programming commands) as a dialog
that can be accessed from the menus. This makes commands you are using for the first time
easier to learn as the proper syntax for the operation is displayed in the Review window.

4.2.1.1 Stata Toolbar


open: open a stata dataset.
save: save a dataset.
print: print contents of active window.


Figure 4.17: Stata Environment

log: to start or stop, pause or resume a log file.


viewer: open viewer window, or bring to the front
graph: open graph window, or bring to the front.
do-file editor: open do-file editor, or bring window to the front.
data editor: open data editor, or bring window to the front.
data browser: open data browser, or bring window to the front.
more: command to continue when paused in long output.
break: stop the current task. This command returns the system to the state it was in before you
issued the command.


Figure 4.18: Stata Toolbar

4.2.1.2 Working Directory


The working directory shown in Fig. 4.17, displayed at the bottom left-hand corner of the
window, is your default directory. Any files you save without specifying a directory will be
saved here. To change your working directory, use the cd command:
cd directoryname
Note: You are advised to use the cd command at the beginning of your do-files and programs;
this will save a lot of editing if the data you are using is moved.
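
As a small illustration of the advice above, the lines below show how the working directory might be checked and changed at the top of a do-file. The folder path is hypothetical and is only an assumption for this sketch; replace it with the folder that holds your own data.

```stata
* Show the current working directory
pwd

* Change to the project folder (hypothetical path, replace with your own)
cd "C:\projects\glss5"

* List the Stata data files found in that folder
dir *.dta
```

If the data are later moved, only the cd line needs to be edited.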

4.2.1.3 Memory
To change the memory assigned to STATA:
set mem #k
where # is a number greater than the size of the dataset, and less than the total amount of
memory available on your system.
To check the size of the dataset, look in My Computer or your Explorer package. To check the
amount of memory (RAM) your system has available, go to the Start menu and click on
\Settings\Control Panel\System. The bottom line, under General tells you how many KB of RAM
you have available.


STATA 10 opens with a default memory of 10.00 MB. To increase the default memory: Right
click on the STATA icon and choose Properties\Shortcut
Edit the Target field to say: \\St-server5\stata8$\wsestata.exe /k#
Where k# is the number of kb you wish to assign to STATA.
Note: If you do not have enough memory available on your machine to read a whole dataset,
open a subset of the variables you need.

4.2.1.4 Where to Get Help


The Stata User's Guide is an introduction to the capabilities and basic concepts of Stata. The
Stata Base Reference Manual provides systematic information about all Stata commands. It is
also often an excellent treatise on the implemented statistical methods. The online help in Stata
describes all Stata commands with their options. However, it does not explain the statistical
methods as the Reference Manual does. You can start the online help by issuing the command:
help command

If you don't know the exact expression for the command, you can search the Stata documentation
by;
search word
In both cases the result is written into the result window. Alternatively, you can display the result
in the Viewer window by issuing the command
view help command
or by calling the Stata online help in the menu bar: Help/Search...

4.2.1.5 Some Mathematical Expressions, Logical and Relational Operators


Some Mathematical Expressions

+ addition

- subtraction

/ division

* multiplication

^ exponentiation

abs(x) returns the absolute value of x.

exp(x) returns the exponential function of x.

int(x) returns the integer by truncating x towards zero.

ln(x), log(x) returns the natural logarithm of x if x>0.

log10(x) returns the log base 10 of x if x>0.

max(x1,...,xn) returns the maximum of x1, ..., xn.

min(x1,...,xn) returns the minimum of x1, ..., xn.

round(x) returns x rounded to the nearest whole number.

round(x,y) returns x rounded to units of y.

sign(x) returns -1 if x<0, 0 if x==0, 1 if x>0.

sqrt(x) returns the square root of x if x>=0.

Logical Operators

& and

| or

! not

~ not

Relational Operators

> greater than

< less than

>= greater or equal

<= smaller or equal

== equal (for conditional statements)

!= not equal
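
The operators and functions listed above can be tried out directly at the command prompt. The sketch below assumes the auto example dataset that ships with Stata (loaded with sysuse); its variable names (price, mpg, rep78, foreign) belong to that dataset and are not part of the GLSS5 data.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Arithmetic operators and mathematical functions
display 2^3 + abs(-4)
display int(-2.7)
display round(3.14159, 0.01)
display sqrt(16)

* Relational operators combined with "and" (&)
count if price > 5000 & foreign == 1

* Relational operators combined with "or" (|)
count if mpg < 15 | mpg >= 30

* "not" (!) applied to a function result
count if !missing(rep78)
```

Note the double equals sign in foreign == 1: a single = is used only for assignment, as in generate and replace.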


4.2.2 Data Management


4.2.2.1 Data Entry and Importing Data in Stata
There are two ways of getting data into Stata. One is manual data entry, that is, inputting
data interactively from the keyboard. This method is useful for small datasets. For example, to
enter data on accident rates (ar) and speed limits (sl) directly into Stata, the syntax is:
input ar sl
1. 4 55
2. 1.5 60
3. 1
4. end
This data could also be entered manually by clicking on the data editor on the toolbar menu; note
that you can copy-and-paste into the data editor. The output is as shown in figure 4.19.

Figure 4.19: Data Editor

Inputting from files and spreadsheets (data entry software) is the more common way data are
brought into Stata. (Note: Excel is not data entry software.)

To prepare data in data entry software for conversion:

Make sure that missing data values are coded as empty cells or as numeric values (e.g.
999 or -1). Do not use character values (e.g. -, N/A) to represent missing data.

Make sure that there are no commas in the numbers. You can change this under the Format
menu, then select Cells... .

Make sure that variable names are included only in the first row of your spreadsheet.
Variable names should be 32 characters or less, start with a letter and contain no special
characters except the underscore (_).

Under the File menu, select Save As... . Then Save as type Text (tab delimited). The file will be
saved with a .txt extension.
Start Stata. Then issue the following command:
insheet using filename [, clear]
where filename is the name of the tab-delimited file (with extension .txt).
If you have already opened a data file in Stata you can replace the old data file using the option
clear.
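
A minimal sketch of the import step follows. The file name survey.txt is hypothetical and is assumed to be a tab-delimited file saved in the working directory as described above. In Stata 13 and later, insheet has been superseded by import delimited, so an equivalent line for newer versions is shown as a comment.

```stata
* Read a tab-delimited text file, replacing any data already in memory
* (survey.txt is a hypothetical file name)
insheet using "survey.txt", clear

* Equivalent command in Stata 13 and later:
* import delimited using "survey.txt", clear

* Check what was read in
describe
list in 1/5
```

Running describe straight after an import is a quick way to confirm that numeric columns were not read in as text.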

4.2.2.2 Opening and Saving Data


To open an existing Stata datafile (extension .dta), type the following command at the command
prompt;
use filename [, clear]
where the option clear clears the dataset already in memory.
To save a datafile in Stata format, type
save [filename]
If file name is not specified, the name under which the data was last known is used. If filename is
specified without an extension, .dta is used.
Stata will look for data, or save data or a log file, in the drive and directory specified by
cd drive:directory
See help memory if you encounter memory problems when loading a file.
4.2.2.3 Creating new variables
New variables are created by the following syntax;
generate newvar = expression [if expression]
where newvar is the name of the new variable and expression is a mathematical function of
existing variables. The if option applies the command only to the data specified by a logical
expression. The (system) missing value code . is assigned to observations that take no value.
Some examples:
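generate age2 = age^2
generate agewomen = age if women == 1
generate rich = 0 if wealth != .
replace rich = 1 if wealth >= 1000000
generate rich = wealth >= 1000000
The last command is a one-step alternative to the two commands before it (it cannot be run
after them, because the variable rich would already exist). Note that Stata treats a missing
value as larger than any number, so observations with missing wealth satisfy wealth >= 1000000.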
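
The following sketch repeats the same pattern on the auto example dataset that ships with Stata, and shows one way to keep missing values missing when building a 0/1 indicator. The new variable names (mpg2, price_foreign, goodrep) are made up for this illustration.

```stata
sysuse auto, clear

* Square of an existing variable
generate mpg2 = mpg^2

* Price for foreign cars only; all other observations are set to missing (.)
generate price_foreign = price if foreign == 1

* 0/1 indicator; the if qualifier leaves goodrep missing where rep78 is missing
generate goodrep = rep78 >= 4 if !missing(rep78)

* Confirm the result, including the missing category
tabulate goodrep, missing
```

Without the if !missing(rep78) qualifier, cars with a missing repair record would be coded 1.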

4.2.2.4 Changing Existing variables


Existing variables can be changed by the syntax below:
replace oldvar = expression [if expression]
A variable can be renamed by double clicking on the variable name in the data editor to open the
variable properties dialog box, as shown in Fig. 4.20, and typing the new variable name in the
name edit box.

Figure 4.20: Variable Properties

The command egen extends the functionality of generate. For example


egen average = mean(income)
creates a new variable containing the (constant) mean income for all observations. See the last
section for some available functions.
Both the generate and the egen command allow the by varlist prefix which repeats the command
for each group of observations for which the values of the variables in varlist are the same. For
example,
sort nationality
by nationality: egen referenceinc = mean(income)
generates the new variable referenceinc containing for each observation the mean income of all
observations of the same nationality. Note that the data has to be sorted by nationality
beforehand.
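
A short sketch of egen with and without the by prefix is given below, using the auto example dataset that ships with Stata. The prefix bysort is used here as a convenience that sorts and groups in one step; the new variable names are made up for this illustration.

```stata
sysuse auto, clear

* Overall mean price, repeated on every observation
egen avgprice = mean(price)

* Mean price within each group of foreign (sorts and groups in one step)
bysort foreign: egen grpprice = mean(price)

* Deviation of each car's price from its own group mean
generate pricedev = price - grpprice

list make foreign price avgprice grpprice pricedev in 1/5
```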
The recode command is a convenient way to exchange the values of ordinal variables:
recode var (rule1) [(rule2)]
e.g. recode gender (1=0) (2=1) will produce a dummy variable.

The following system variables (note the _) may be useful:
_n contains the number of the current observation.
_N contains the total number of observations in the dataset.
_pi contains the value of pi to machine precision.
A lagged variable can be created in the following way: First define a time series index. Second
declare the data a time series. For example this can be done with the commands
generate t = _n /* generate a variable with values 1...N */
tsset t /* declare the time series */
Lagged values can now be designated as L.varname. For example L.gdp designates a lagged
value of the variable gdp, L2.invest designates the variable invest lagged twice.
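
The commands in this section can be tried together on the auto example dataset that ships with Stata. The sketch below is only an illustration: the auto data are not a time series, so the observation number is used as a stand-in time index, and the new variable names are made up.

```stata
sysuse auto, clear

* Recode into a new variable so that the original rep78 is preserved
recode rep78 (1 2 = 0) (3 4 5 = 1), generate(rep_high)
tabulate rep78 rep_high, missing

* _n is the number of the current observation; _N is the total number
generate id = _n
display _N

* Declare id as the time index, then create a lagged variable
tsset id
generate lagprice = L.price
list id price lagprice in 1/3
```

The first observation of lagprice is missing because there is no earlier observation to take the value from.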

4.2.2.5 Labelling Values


The command label values attaches a value label to a variable. If no value label is specified,
any existing value label is detached from that variable. The value label, however, is not deleted.
Value labels may be up to 32,000 characters long.
To define the value label yesno, use the syntax:
label define yesno 1 "no" 2 "yes"
meaning that the value 1 is labelled "no" and the value 2 is labelled "yes". Remember that value
labels may include many associations and typing them all on one line can be ungainly or impossible.
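
The two steps, defining a value label and attaching it to a variable, can be sketched as follows. The variable heavy is a hypothetical 1/2 coded variable created from the auto example dataset purely for this illustration.

```stata
sysuse auto, clear

* Hypothetical 1/2 coded variable, built only for this illustration
generate byte heavy = cond(weight > 3000, 2, 1)

* Step 1: define the value label
label define yesno 1 "no" 2 "yes"

* Step 2: attach it to the variable
label values heavy yesno
tabulate heavy

* A long label can be built up over several lines with the add option
label define yesno 3 "unknown", add
label list yesno
```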

4.2.2.6 Deleting variables


You can delete variables from the dataset by either specifying the variables to be dropped or to
be kept:
drop varlist
keep varlist
You can delete observations from a dataset by specifying the observations to be dropped (or kept)
either by a logical expression or by specifying the first and last observation:
drop [if expression] [in first/last]
keep [if expression] [in first/last]

4.2.2.7 Sorting Variables


Arrange the observations of the current dataset in ascending order with respect to varlist
sort varlist
Change the order of the variables in the current dataset:
order varlist
by specifying a list of variables to be moved to the front of the dataset. You can convert the data
into a dataset of the means (or other statistics; see help) of varlist. varname specifies the groups
over which the means are calculated.
collapse varlist, by(varname)
A description of the variables in the dataset is produced by describe and codebook [varlist].
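
The sketch below runs these commands on the auto example dataset that ships with Stata. Because collapse replaces the data in memory, it is wrapped in preserve and restore here so that the original data come back afterwards; that wrapping is a suggestion made for this illustration.

```stata
sysuse auto, clear

* Sort observations by price and show the three cheapest cars
sort price
list make price in 1/3

* Move selected variables to the front of the dataset
order make foreign price

* Replace the data by group means, then bring the original data back
preserve
collapse (mean) price mpg, by(foreign)
list
restore

* Describe the (restored) dataset
describe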

4.2.2.8 Commands and Variables


It is possible to scroll through past commands by using the page up and page down buttons on
your keyboard. Alternatively you can double click on a command in the Review window and it
will appear in your Command window. Similarly you can click on any variable that appears in
the Variables window and they will appear in the Command window (or wherever the Target in
the Variables window specifies).

4.2.2.9 Command Interface


There have been some significant changes in STATA. One of the main ones is that it now has a
Statistics menu in the style of SPSS. This enables the user to select an item from a pull-down
menu, which opens a dialogue box in which you can build STATA commands. Working with menus
and dialogue boxes in this way is discussed in the first part of this chapter, i.e. the
introduction to SPSS. Users of STATA are encouraged to learn the commands so that they can
write do-files and programs.
One point may be useful, however: the command issued by the dialogue box is submitted as
if you had typed it by hand. Therefore, if you cannot remember the syntax of a command, using
the dialogue box and then checking the command in the Review window is a good way to get a
reminder.

4.2.2.10 Files Extensions


Data file          filename.dta
Do file            filename.do      (program file)
Dictionary file    filename.dct
Log file           filename.smcl    (only readable in Stata)
Log file           filename.log     (text file)

4.2.2.11 Opening Files


Most of the commands discussed below can also be run from the toolbar or the menus; however,
in this document the syntax of typed commands is discussed.
To open a file, use the following syntax:
use filename, clear
use varlist using filename, clear [for a subset of the data file]
In some cases you may get the message "no room to add more observations" or "no room to add
more variables". This is because not enough memory has been assigned to STATA.

4.2.3 Descriptive Statistics In Stata


In terms of statistics, Stata provides all of the standard univariate, bivariate and multivariate
statistical tools, from descriptive statistics and t-tests through one-, two- and N-way ANOVA,
regression, principal components, and the like. Stata's regression capabilities are full-featured,
including regression diagnostics, prediction, robust estimation of standard errors, instrumental
variables and two-stage least squares, seemingly unrelated regressions, vector autoregressions
and error correction models, etc. It has a very powerful set of techniques for the analysis of
limited dependent variables: logit, probit, ordered logit and probit, multinomial logit etc.

Just like in SPSS, the basic features of any data in Stata can be presented in the form of:
Graphical displays
Tabular descriptions
Summary statistics
Linear regressions

4.2.3.1 Graphical displays


Stata graphics are excellent tools for exploratory data analysis, and can produce
publication-quality 2-D graphics in several dozen different forms. Every aspect of the graphics may
be programmed and customized, and new graph types and graph schemes are being
continuously developed. The programmability of graphics implies that a number of similar
graphs may be generated without any pointing and clicking to alter aspects of the graphs. Stata
does not have 3-D graphics capabilities, but those are under development in the new graphics
system.
To draw a scatter plot of the variables yvar1 yvar2 ... (y-axis) against xvar (x-axis): the syntax is
scatter yvar1 yvar2 ... xvar
To draw a line graph, i.e. scatter with connected points
line yvar1 yvar2 ... xvar
To draw a histogram of the variable var
histogram var
To draw a scatter plot with regression line:
scatter yvar xvar || lfit yvar xvar
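
As a concrete illustration of the graph commands above, the sketch below uses the auto example dataset that ships with Stata, with price, weight and mpg standing in for yvar, xvar and var.

```stata
sysuse auto, clear

* Scatter plot of price (y-axis) against weight (x-axis)
scatter price weight

* Line graph; sort on the x variable first so points connect left to right
sort weight
line price weight

* Histogram of a single variable
histogram mpg

* Scatter plot with a fitted regression line overlaid
scatter price weight || lfit price weight
```

Each new graph command replaces the contents of the Graph window, so run the lines one at a time to view each plot.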
4.2.3.2 Summary Statistics
To display univariate summary statistics of the variables in varlist: type
summarize varlist
at the command prompt
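
A short sketch of summarize follows, again on the auto example dataset that ships with Stata. The tabstat line is added here as one possible way to request the same statistics that were chosen in the SPSS Descriptives example earlier in this unit.

```stata
sysuse auto, clear

* Number of observations, mean, standard deviation, minimum and maximum
summarize price mpg

* Adds percentiles (including the median), variance, skewness and kurtosis
summarize price, detail

* Selected statistics, reported separately for each group of foreign
tabstat price mpg, statistics(mean sd variance min max range) by(foreign)
```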

4.2.3.3 Tabular descriptions


Report the frequency counts of varname:
tabulate varname [if expression] [, missing]
The missing option requests that missing values are reported.
To display the correlation or covariance matrix for varlist, use the syntax
correlate varlist
To produce a two-way table of absolute and relative frequency counts along with Pearson's
chi-square statistic:
tabulate var1 var2, col chi2
To perform a two-sample t-test of the hypothesis that varname has the same mean within the two
groups defined by the dummy variable groupvar
ttest varname [if exp], by(groupvar) [unequal]
where the option unequal indicates that the two-sample data are not to be assumed to have equal
variances.
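
The commands in this section might be used as follows on the auto example dataset that ships with Stata, where foreign plays the role of the dummy variable groupvar.

```stata
sysuse auto, clear

* One-way frequency table, with missing values reported
tabulate rep78, missing

* Correlation matrix of three variables
correlate price mpg weight

* Two-way table with column percentages and Pearson's chi-square test
tabulate rep78 foreign, col chi2

* Two-sample t-test of equal mean mpg, not assuming equal variances
ttest mpg, by(foreign) unequal
```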

4.2.3.4 Regression
To regress a dependent variable depvar on a constant and one or more independent variables in
varlist use
regress depvar [varlist] [if exp] [, level(#) noconstant]
the if option limits the estimation to a subsample specified by the logical expression exp. The
noconstant option suppresses the constant term.
level(#) specifies the confidence level, in percent, for confidence intervals of the coefficients. See
help regress for more options.
You can access the estimated parameters and their standard errors from the most recently
estimated model:
_coef[varname] contains the value of the coefficient on varname
_se[varname] contains the standard error of the coefficient

Stata calculates predictions from the previously estimated regression by


predict newvarname [, stdp]
The stdp option provides the standard error of the prediction.
[post-estimation commands: predict, cve, ...]
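
A minimal regression sketch follows, using the auto example dataset that ships with Stata. The choice of variables and the 90% confidence level are arbitrary and serve only to show the options described above. _b[varname] is used to read the coefficient; it is the commonly used equivalent of _coef[varname].

```stata
sysuse auto, clear

* Regress price on mpg and weight for domestic cars, with 90% intervals
regress price mpg weight if foreign == 0, level(90)

* Read a coefficient and its standard error from the last model
display _b[mpg]
display _se[mpg]

* Fitted values and the standard error of the prediction
predict pricehat
predict pricese, stdp
list make price pricehat pricese in 1/5
```

Note that predict fills in values for all observations, including those excluded from the estimation by the if qualifier.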

4.2.3.4 Log Files


A log file keeps a record of the commands you have issued and their results during your Stata
session. You can create a log file with;
log using filename [, append replace text]
where filename is any name you wish to give the file. The append option simply adds more
information to an existing file, whereas the replace option erases anything that was already in the
file. Full logs are recorded in one of two formats: SMCL (Stata Markup and Control Language)
or text (meaning ASCII). The default is SMCL, but the option text can change that.
A command log contains only your commands
cmdlog using filename
Both types of log file can be viewed in the Viewer:
view filename
You can temporarily suspend, resume or stop the logging with the commands:
log {on | off | close}
cmdlog {on | off | close}
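
The logging commands can be sketched as a short session. The log file name session1 is hypothetical, and the auto example dataset that ships with Stata stands in for your own data.

```stata
* Close any log that is still open (no error if none is open)
capture log close

* Start a text-format log, overwriting any earlier file of the same name
log using "session1", replace text

sysuse auto, clear
summarize price

* Suspend logging; commands issued now are not recorded
log off
describe

* Resume logging, then close the log when finished
log on
tabulate foreign
log close

* Inspect the result in the Viewer
view "session1.log"
```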

4.2.3.5 Do-Files
A do-file is a set of commands, written just as you would type them in one by one during a regular
Stata session. Any command you use in Stata can be part of a do-file. The default extension of
do-files is .do, which explains its name. Do-files allow you to run a long series of commands several
times with minor or no changes. Furthermore, do-files keep a record of the commands you used
to produce your results.
To edit a do-file, just click on the new do file icon in the toolbar. To run this file, save it in the
do-file editor and issue the command:
do mydofile
You can also click on the Do current file icon in the do-file editor to run the do-file you are
currently editing.
Comments are indicated by a * at the beginning of a line. Alternatively, what appears inside /* */
is ignored. The /* and */ comment delimiter has the advantage that it may be used in the middle
of a line. Appendix A shows some typical do files.
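
By way of illustration, a small do-file might look like the sketch below. It is not one of the do-files from Appendix A; the file name, folder path and log name are hypothetical, and the auto example dataset that ships with Stata stands in for your own data.

```stata
* mydofile.do: a minimal sketch of a do-file
* Lines starting with * are comments

capture log close
clear

* Set the working directory first (hypothetical path, replace with your own)
cd "C:\projects\glss5"

log using "mydofile", replace text

sysuse auto, clear
summarize price mpg        /* this style of comment can sit mid-line */
tabulate foreign

log close
```

Saved as mydofile.do in the working directory, it is run with do mydofile.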

Worked Example 1
You are required to use Stata to analyze data from the 5th Ghana Living Standards Survey
(GLSS5) on the use of electricity for lighting and traditional fuels for cooking.

1. Generate a table showing total household income, main source of lighting and main fuel
used for cooking for all households covered in the GLSS5.
Solution
Command:
tabstat category1 totalincome s7dq13 s7dq11,by(hhid) by(loc2) columns(variables)
where totalincome is generated from the data by using
gen totalincome=totemp+agric1c+agric2c+nsfey1+nsfey2+nsfey3+import+remitinc+otherinc
and category1 is the income quintile. To obtain the income quintiles, use the commands
below:
pctile category= totalincome,nq(5)
xtile category1= totalincome,cut(category)

A table like that in figure 4.21 should appear in your result window.
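
As a possible shortcut, xtile can also assign quintile groups in a single step through its nquantiles() option. The sketch below shows this on the auto example dataset that ships with Stata, with price standing in for totalincome; the variable name pricegrp is made up for this illustration.

```stata
sysuse auto, clear

* Assign each observation to one of five quantile groups of price
xtile pricegrp = price, nquantiles(5)

* Number of observations in each group
tabulate pricegrp

* Price range covered by each group
tabstat price, statistics(min max count) by(pricegrp)
```

The same pattern applied to the GLSS5 data would use totalincome in place of price.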


Figure 4.21: A table of total household income, main source of lighting and main fuel used for
cooking, for all households in Ghana.

Worked Example 2
Generate a bar chart to show the % distribution by income quintile for households in Ghana who
use electricity as their main source of lighting.
Solution
Command:
graph bar (count) totalincome if s7dq11==1,over( category1) asyvars percentages


Figure 4.22 shows the content of the output.

[Bar chart titled "DISTRIBUTION OF ELECTRICITY AS COOKING FUEL FOR THE VARIOUS INCOME
QUINTILES IN GHANA". Vertical axis: Number of households. Bars: Group One, Group Two, Group
Three, Group Four, Group Five. Source: Fifth Ghana Living Standards Survey.]

Figure 4.22: A bar chart of distribution of electricity as fuel for cooking, for the various income
quintiles in Ghana.

Self Assessment 4.2

Use Stata to analyze the data from the 5th Ghana Living Standards Survey (GLSS5),
accompanying this manual.

4.2.1 Generate a bar chart to show the % distribution by income quintile for households in
Ghana which use traditional energy sources (Wood, Charcoal, Crop Residue/Sawdust,
Animal Waste and Other) as their main fuel for cooking.

4.2.2 With the aid of tables and bar charts, show how access to LPG varies between rural
and urban areas for the Greater Accra region (GAR).
4.2.3 Still using tables and bar charts, show how access to LPG in rural and urban areas for
GAR compares with those for two other regions of your choice, both of which should
not be in the same ecological zone.

Learning Track Activities

Unit Summary

Statistical analysis software increases the accuracy and speed of data analysis, especially
for sophisticated data. Planning and good policy-making can be done accurately only if the
requisite data analysis is done, and done correctly. SPSS and STATA are among the
common statistical analysis packages that can be used for the statistical analysis of data
such as census data.

SPSS for Windows is a menu-driven program, i.e. most functions are performed by selecting
an option from one of the menus. Users have less control over statistical output than, for
example, Stata or Gauss users.

Even though Stata can also be used as a menu-driven program, it is traditionally a
command-line-driven package that operates in a graphical windowed environment. In Stata,
researchers have greater control over the equations or the output.


Key terms/ New Words in Unit

1. Toolbar
2. SPSS/STATA
3. Data editor
4. GLSS5

Unit Assignments 4
Use Stata to analyze the data accompanying this manual, from the 5th Ghana Living
Standards Survey (GLSS5).
If a question involves drawing table(s), submit your results and commands in a log
format.
For problems involving drawing graphs, write a do file to draw those graphs.

1. With the aid of tables and pie charts, show the variation between rural and
urban usage of LPG for the whole country and compare with those for
Greater Accra Region and any two regions of your choice.
2. With the aid of tables and bar charts, show how access to main source of
lighting varies between rural and urban areas of Ghana as a whole.
3. Again using tables and bar charts, show how the main source of lighting in
rural and urban areas of Ghana compare with those for any one region of
your choice.
4. Still with the aid of tables and bar charts, show how access to electricity as
main source of lighting varies between the region of your choice and Ghana
as a whole for each total household income quintile in Ghana.


Unit 5
INTRODUCTION TO JOURNAL ARTICLES,
CONFERENCE PAPERS AND THESES WRITING
Introduction
Research findings are meant to be published so as to add to the body of knowledge in that
particular field of study. Research reports/papers, theses, journal articles and conference papers
are the widely used means of publishing research findings for the benefit of all interested parties.
This unit will guide students through the preparation of research reports/papers, theses, journal
articles and conference papers with proper referencing.

Learning Objectives
After reading this unit you should be able to:
1. Write journal articles and conference papers
for publications.
2. Prepare a full research report or thesis with
proper referencing.

UNIT CONTENT
SESSION 5.1: RESEARCH AND THESIS REPORTS
5.1.1 Thesis Report Writing
5.1.2 Research Proposal Writing
SESSION 5.2: JOURNAL ARTICLES AND CONFERENCE PAPER PREPARATION
SESSION 5.3: ABSTRACTS AND SUMMARIES AND REFERENCING
5.3.1 Abstracts and Summaries
5.3.2 Referencing
5.3.3 Referencing Formats
5.3.4 Introduction to referencing software packages

SESSION 5.1: RESEARCH AND THESIS REPORTS


5.1.1 Thesis Report Writing
A thesis is a document submitted in support of candidature for an academic degree or
professional qualification in which the author's research and findings are presented. It is a
demonstration of a graduate student's ability to explore, develop, and organize materials
relating to a certain topic or problem in a field of study. The main aim of a thesis or project
is not only to pursue research and investigation, but also to write an extended scholarly
statement clearly, effectively and directly addressing the research problem.
Title page
This is made up of the full title of the thesis, the name and previous qualification of the
author, the Department to which it is being submitted, in partial fulfillment of requirement
for what degree and in which Faculty and month or year of presentation.
Abstract
An abstract is a brief summary of the thesis and the most likely part of the thesis to be
widely published and read. It should have a concise description of the problem addressed,
the methodology used, the results as well as conclusions. The abstract should usually be
composed as a single paragraph not exceeding 500 words.
Table of contents
This clearly outlines the chapters and subchapters, as well as the content of the materials
within the thesis and the pages where they are located.
List of figures, List of tables, List of Acronyms/Abbreviations
Where figures and tables are used, a list of tables and a list of figures must be provided,
showing the pages where the various figures and tables are located.
A list of acronyms/abbreviations, with full explanations of the various
acronyms/abbreviations used, must also be provided in this section.

Prefatory matter
Materials pertaining to the preface, foreword, acknowledgement, etc. may be presented
in this section. The acknowledgement page is, however, mandatory.
Introduction
The introduction provides background information as well as the rationale for the
research work. It also provides information related to the need for the research and in the
process builds an argument for the research and presents research question(s) and aims.
The introduction should also give a detailed description of the various chapters as well as
their contents.

Literature Review
The literature review should provide a detailed account of research works done by other
researchers in the selected area of study, highlighting the merits as well as limitations.
Referencing is particularly important in this section because it consists mostly of work by
other researchers. This is where plagiarism becomes an issue. It is also important to discuss,
in this section, theory which is directly relevant to your research.

Methodology
This section of the thesis presents an understanding of the philosophical framework within
which the research will be carried out and gives the methodological approach as well as a
justification of the chosen methodology. This section should also clearly define the
boundaries of the research in terms of methodological approach and describe steps taken to
ensure ethical research practice.

Results and discussion


Research findings should be clearly reported in this section. It describes the observations made
during the research and the interpretations given to them. Results may be presented in tables,
figures or both, and clear explanations should be given for them. It is important to note that
information presented in tables should not be repeated in figures and vice versa. The discussion
could also include references to contemporary literature in the area of the subject being studied.
Conclusions and recommendations
This section draws all the important arguments and findings together and, in the process,
provides the reader with a strong sense that the work has been done satisfactorily and that it
was worthwhile. It provides summaries of the major findings and presents their limitations as well
as their implications. It is important to end this section on a strong note by suggesting
directions for future research in the respective field.

References
This comprises a list of the major works (publications and authorities) consulted in the
course of writing the thesis. See the reference sections of these notes for more details of the
various referencing styles.

Appendices
An appendix provides a place for important information which, if placed in the main text,
would distract the reader from the flow of the argument. It may include raw data examples and
reorganised data (e.g. a table of interview quotes organised around themes). Appendices
may be named, lettered or numbered (decide early).

5.1.2 Research Report Writing


Title
The title should be concise, attract attention, and highlight the main point of your paper.
It should be clear about the subject matter and devoid of abbreviations.

Abstract
The abstract is a concise summary of the paper and should be able to tell the reader
whether the paper is worth reading or not. It should therefore be as informative as
possible with respect to the objectives, methodology, results and conclusions. It
should generally not exceed 300 words.
Introduction
The introduction to a research paper should be as brief as possible and should touch on
the background of the research problem, a clear justification of why the research is being
undertaken and also the underlying theory and hypothesis. It should contain a short
review of literature in the field of study and should be limited to a maximum of two
pages.
Materials and Methods
This section of the paper should describe concisely the procedure used to undertake the
research, such that anyone wishing to replicate the study can do so and obtain
comparable results. It should be as detailed as possible in order to clear all forms of
ambiguity with regard to the design of the research as well as the analysis of the results.
Where known methodologies are used, however, the details can be omitted and the source cited
in the references, but modifications to known methodologies should be clearly explained.
Results and Discussions
This section describes the observations made during the research and the interpretations
given to them. Results may be presented in tables, figures or both, and clear
explanations should be given for them. It is important to note that information presented in
tables should not be repeated in figures and vice versa. The discussion could also
include references to contemporary literature in the area of the subject being studied.
Conclusions and Recommendations
The conclusions drawn from the results of the research should be briefly and clearly
outlined and the importance of these conclusions should also be stated. All conclusions
should be supported by data presented in the research findings. This section should also
contain recommendations for future research in the respective field of study.

References
The report should include a bibliography or list of literature cited, consisting of
references to original literature relevant to the area of inquiry. It must include, but is not
limited to, all works cited in the text. Students should follow the approved departmental
style manual for the format of the reference.

Self Assessment 5.1

SESSION 5.2: JOURNAL ARTICLES AND CONFERENCE PAPER PREPARATION
A Journal Article, sometimes referred to as a Scientific Article, a Peer-Reviewed Article, or a
Scholarly Research Article, is the means by which a scholar puts forth the results of academic
research or information to add to the body of knowledge in their field of study, and is usually
published in journals. Conference papers, on the other hand, are similar to journal articles except
that they are delivered at conferences.
Guidelines for journal article/conference paper preparation vary from journal to journal and
from conference to conference, but there is a basic format that cuts across most
journals/conferences. This includes the title, name(s) of researcher(s) and affiliation(s), an abstract,
and an introduction which is made up of background information, a problem statement, objectives
and a justification of the research topic. The introduction should also give a general overview of the
whole paper.
5.2.1 Title
The title should be concise, attract attention, and highlight the main point of your paper. It should
be clear about the subject matter and devoid of abbreviations.
5.2.2 Authors
The list of authors with their institutional affiliations should be presented immediately after the
title. It should be ordered according to the level of contribution to the paper, with the lead
contributor/principal author's name listed first.
5.2.3 Abstract
It is important to provide an abstract of about 350 words which should summarise the entire paper,
highlighting the most important information such as the purpose of the research, methodology
used, results and conclusions.
5.2.4 Introduction
The introduction should provide a background to the research, state the problem briefly and
clearly outline the objectives of the research. It should
5.2.5 Methodology
The methodology tells how the research was conducted. It is important to describe in detail the
various processes involved in carrying out the research, with illustrations if possible.
5.2.6 Results and discussions


The results of the research should be presented in this section in the clearest form
possible, whether as text, figures or tables. It is also important to use text to provide essential
information on figures and tables and to define all terms in the text, figures and tables.
5.2.7 Conclusions and recommendations
State directly and briefly your conclusions and the utility of these conclusions. All conclusions
should be supported by data presented in the paper. Present your recommendations also in this
section of the paper.
5.2.8 References
References should be listed in alphabetical order at the end of the text in this section.

Self Assessment 5.2

SESSION 5.3: ABSTRACTS AND SUMMARIES AND REFERENCING
5.3.1 Abstracts and Summaries
An abstract is a brief summary of a research article, thesis, review, conference proceeding
or any in-depth analysis of a particular subject and is mostly used to help the reader
quickly grasp the purpose of the paper. An abstract always appears at the beginning of a
manuscript, acting as the first port of call for any given academic paper. It usually contains
between 300 and 500 words.
A summary is an abbreviated version of the most significant points in a book, article,
report or meeting. It is usually about 5% to 15% of the length of the original. It is useful
because it condenses material, informing the reader of the original's most important points.
The commonest type of summary is the executive summary, which is mostly used for business and
management purposes. It differs from an abstract in that an abstract is usually shorter,
providing a neutral overview or orientation rather than a condensed version of the
full document. Abstracts are extensively used in academic research, where the concept of
the executive summary would be meaningless.
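
The length guidelines above can be turned into a quick check. The sketch below is only illustrative: the function names are hypothetical, and counting words by splitting on whitespace is an assumption rather than a rule stated in these notes.

```python
def summary_word_range(original_words, low=0.05, high=0.15):
    """Return the (min, max) word count for a summary of a document.

    The 5% to 15% band follows the guideline given in the text.
    """
    return round(original_words * low), round(original_words * high)


def abstract_length_ok(text, min_words=300, max_words=500):
    """Check an abstract against the 300 to 500 word guideline.

    Words are counted by splitting on whitespace (an assumption).
    """
    n_words = len(text.split())
    return min_words <= n_words <= max_words


# A 20,000-word report would call for a summary of roughly 1,000 to 3,000 words.
print(summary_word_range(20000))
```

The limits are passed as parameters because individual journals, conferences and departments may set their own word limits.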

5.3.2 Tables and Figures


Research data and results are mostly presented in tables and figures. Tables present lists of
numbers or text in columns, each column having a title or label, whereas figures are visual
presentations of results, including graphs, diagrams, photos, drawings, schematics, maps,
etc. Graphs are the most common type of figure. When
figures and tables are used in a manuscript, they must be referred to from the text. It is
important to use sentences that draw the reader's attention to the major issues to be
highlighted by referring to the appropriate figure or table. Figures and tables must also be properly
captioned for clarity.

5.3.2 Referencing
A reference, as defined by De Montfort University, is the detailed bibliographic
description of an item from which information is gained. The basic idea behind
referencing is to support and identify the evidence you use in your research work. It helps
to direct readers of your work to the source of evidence. References can be presented in
two ways: in-text, where the source is briefly cited within the text, and/or in the reference list,
where it is given in full at the end of the work. Items read for background information
but not referred to in the text are usually also given in full at the end of the work, in a
list sometimes referred to as the bibliography. In short, references should:
Enable the reader to locate the sources used for a research work
Help support arguments and add credibility to research work
Show the scope and breadth of a research work
Acknowledge the source of an argument or idea by naming the various authors so
as to avoid plagiarism
5.3.3 Referencing Formats
There are many referencing styles, which can be grouped under two main headings: the in-text
name style and the numeric referencing system.
In-Text Name Style
In-text name style involves citing the name(s) of author(s) or organization(s) in the text with
the year of publication. All the sources are then listed in alphabetical order at the end of the
work under any of these headings: References, Reference list, Works cited, Works
consulted or Bibliography, depending on the style used. Examples of this style of
referencing include the Author-date (Harvard) style, American Psychological Association
(APA) style, Modern Language Association of America (MLA) style, Chicago style, Modern
Humanities Research Association (MHRA) style and the Council of Science Editors (CSE)
style (Neville, 2010). The Author-date (Harvard) style will be discussed in detail, but
students can read more on the other referencing styles.
Author-date (Harvard) Style
This style cites the surname (family name) of the author(s), or the
organizational name, and the year of publication in the text of the document being worked on.
A full list of all references in alphabetical order must be given at the end of the text. It is
important to ensure that the name used in the in-text citation matches the name
used to start the full reference entry at the end of the text.
E.g. In-Text Citation style
1. There would appear to have emerged by the end of the twentieth century two broad
approaches to the management of people within organizations (Handy 1996).
2. Handy (1996) argues that by the end of the twentieth century two broad approaches to
the management of people within organizations had emerged.
3. Some commentators, for example, Handy (1996), have argued that by the end of the
twentieth century two broad approaches to the management of people within
organizations had emerged.
4. It has been argued, (Handy 1996; see also Brown 1999 and Clark 2000), that two
approaches to the management of people within organizations had emerged by the
end of the twentieth century.
5. Charles Handy, amongst others, has argued that by the end of the twentieth century
two broad approaches to the management of people within organizations could be
observed (Handy 1996).
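
The two basic citation forms used in the examples above, the parenthetical "(Handy 1996)" and the narrative "Handy (1996)", can be generated with a small helper. This is a minimal sketch: the function name is hypothetical, and shortening three or more authors to "et al." is an assumption about house style rather than a rule stated in this section.

```python
def in_text_citation(surnames, year, narrative=False):
    """Build a Harvard-style in-text citation.

    surnames:  list of author surnames, or a single organizational name
    narrative: True gives "Handy (1996)", False gives "(Handy 1996)"
    """
    if len(surnames) == 1:
        names = surnames[0]
    elif len(surnames) == 2:
        names = f"{surnames[0]} and {surnames[1]}"
    else:
        # Assumption: three or more authors are shortened to "et al."
        names = f"{surnames[0]} et al."
    if narrative:
        return f"{names} ({year})"
    return f"({names} {year})"


print(in_text_citation(["Handy"], 1996))                  # (Handy 1996)
print(in_text_citation(["Handy"], 1996, narrative=True))  # Handy (1996)
```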
Full Reference Citation
1. Book Reference
AUTHOR(S) (Year) Title. Edition (if not the 1st). Place of publication: Publisher.
E.g.
o WILMORE, G.T.D. (2000). Alien plants of Yorkshire. Kendall: Yorkshire
Naturalists Union.
o LI, X. and CRANE, N.B. (1993) Electronic style: a guide to citing electronic
information. London: Meckler.
2. Books with one or more editor(s)
EDITOR(S) (ed./eds.) (Year) Title. Edition. Place of Publication: Publisher
E.g. SAUNDERS, M. (ed.) (1998). Advances in food science. Waterford: Nore Press.
3. Chapters in books
AUTHOR(S) (Year) Title of chapter. In: AUTHOR(S)/EDITOR(S), ed(s). Book title.
Edition. Place of publication: Publisher, Pages (use p. or pp.)
e.g. TUCKMAN, A. (1999) Labour, skills and training. In: LEVITT, R. et al, (eds.)
The reorganised National Health Service. 6th ed. Cheltenham: Stanley Thornes, pp.
135-155.
4. Publications from a corporate body (e.g. Government publications)
NAME OF ISSUING BODY (Year) Title. Place of publication: Publisher, Report no.
(where relevant), Pages, use p. or pp.
e.g. ENERGY COMMISSION OF GHANA (2006). Strategic National Energy Plan
2006-2020 and Ghana Energy Policy Main version. Ghana: Energy Commission.
5. Journal articles
AUTHOR(S) (Year) Title of article. Title of journal, Volume number (Part
no./Issue/Month), Pages, use p. or pp.
e.g. RYAN, J. (2006) Management accounting for developers. Journal of advanced
accounting, 1 (5), pp. 21-24.
6. Papers in conference proceedings


AUTHOR(S) (Year) Title. In: EDITOR(S) Title of conference proceedings. Place and
date of conference (unless included in title). Place of publication: Publisher, Pages,
use p. or pp.
e.g. GIBSON, E.J. (1977) The performance concept in building. In: Proceedings of
the 7th CIB Triennial Congress, Edinburgh, September 1977. London: Construction
Research International, pp. 129-136.
7. Electronic sources
AUTHOR(S) (Year) Title of document [Type of resource, e.g. CD-ROM, e-mail,
www] Organization responsible (optional). Available from: web address [Date
accessed].
e.g. UNIVERSITY OF SHEFFIELD LIBRARY (2001) Citing electronic sources of
information [WWW] University of Sheffield. Available from:
http://www.shef.ac.uk/library/libdocs/hsl-dvc1.pdf [Accessed 23/02/07].
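
The templates above are regular enough to be filled in by a short program, which is essentially what referencing software does. The sketch below covers only the book and journal article templates. All function names are hypothetical, and the punctuation choices are one reading of the templates, not an official Harvard specification.

```python
def _join_authors(authors):
    """Join author names in capitals, e.g. 'LI, X. and CRANE, N.B.'."""
    names = [a.upper() for a in authors]
    if len(names) <= 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + " and " + names[-1]


def format_book(authors, year, title, place, publisher, edition=None):
    """AUTHOR(S) (Year) Title. Edition (if not the 1st). Place: Publisher."""
    edition_text = f" {edition}." if edition else ""
    return (f"{_join_authors(authors)} ({year}) {title}."
            f"{edition_text} {place}: {publisher}.")


def format_journal_article(authors, year, title, journal, volume, issue, pages):
    """AUTHOR(S) (Year) Title of article. Title of journal, Volume (Issue), Pages."""
    return (f"{_join_authors(authors)} ({year}) {title}. "
            f"{journal}, {volume} ({issue}), {pages}.")


def reference_list(entries):
    """Sort formatted entries alphabetically for the end-of-text list."""
    return sorted(entries, key=str.lower)


refs = reference_list([
    format_book(["Wilmore, G.T.D."], 2000, "Alien plants of Yorkshire",
                "Kendall", "Yorkshire Naturalists Union"),
    format_book(["Li, X.", "Crane, N.B."], 1993,
                "Electronic style: a guide to citing electronic information",
                "London", "Meckler"),
])
for ref in refs:
    print(ref)
```

Sorting the formatted strings works here because every entry begins with the author's surname, which is also the name used in the in-text citation.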
Numeric Referencing System
This system comprises mainly two referencing styles, namely the consecutive numbering
style and the recurrent numbering style.
Consecutive numbering uses superscript numbers in the text that connect with references in
either footnotes or chapter endnotes (but usually the former). This system uses a different,
consecutive number for each reference in the text. A list of sources is included at the end of the
document, which lists all the works referred to in the notes (References, Works cited)
(Neville, 2010).
Recurrent numbering uses bracketed (or superscript) numbers in the text that connect
with a list of references at the end of the chapter/assignment. In this case, the same number
can recur if a source is mentioned more than once in the text (Neville, 2010).
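
The difference between the two numbering styles is easiest to see when the same source is cited twice. The following sketch uses hypothetical function names and made-up citation keys purely for illustration.

```python
def recurrent_numbers(citations):
    """Number sources in order of first appearance; repeats reuse their number."""
    numbers = {}
    in_text = []
    for source in citations:
        if source not in numbers:
            numbers[source] = len(numbers) + 1
        in_text.append(numbers[source])
    return in_text, numbers


def consecutive_numbers(citations):
    """Give every citation a new number, even when the source repeats."""
    return list(range(1, len(citations) + 1))


cited = ["Handy 1996", "Brown 1999", "Handy 1996", "Clark 2000"]
print(recurrent_numbers(cited)[0])   # [1, 2, 1, 3]
print(consecutive_numbers(cited))    # [1, 2, 3, 4]
```

With recurrent numbering the reference list has one entry per source (three here), whereas consecutive numbering produces one note per citation (four here).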

5.3.4 Introduction to referencing software packages


Referencing software, also referred to as bibliographic management software, is designed
to help you store the references which you have located, and then cite those references in
an essay, paper, thesis or book which you are writing. It helps one to:

Create a bibliography for a thesis, assignment or journal article in a preferred citation style
Download and store references
Include abstracts, keywords and notes with the references, and also full texts
Produce lists of references for yourself or others
Automatically insert citations of references while typing (Cite While You Write)
Create a bibliography while typing (Cite While You Write)

Examples of Referencing Software Packages


There are numerous referencing software packages, but the commonest are EndNote, EndNote Web
and Reference Manager.
EndNote
EndNote is a commercial reference management software package, used to manage
bibliographies and references when writing essays and articles. EndNote is probably the most
sophisticated referencing product available today and can perform a wide range of referencing
tasks. There are extensive possibilities for the advanced user to customize the software to
individual needs. It can also be used to:

Create a personal database or library of reference information
Download and organise references and associated images and PDF files
Insert citations and figures in documents and create bibliographies
Import bibliographic data from external databases and library catalogues

EndNote is compatible with recent versions of Microsoft Word (Windows and Macintosh) and
installs an add-in for easy integration with your word processing software. It is used most
effectively from the start of a project, when information is being resourced, rather than when
writing up begins.
EndNote Web
EndNote Web is a simplified version of the full desktop EndNote product. It has only recently
been released and is still under development, but it can perform many common referencing tasks.
EndNote Web is compatible with recent versions of Microsoft Word (Windows and Macintosh).
One must download and install a plug-in to enable EndNote Web to work with Word. Once
registered for EndNote Web one can:

Format citations and footnotes or a bibliography
Use Cite While You Write in Microsoft Word to easily cite references in your paper.
Download the Cite While You Write plug-in for Word from the EndNote Web site.
Transfer references to and from EndNote on your desktop
Share references with others who have EndNote Web

Reference Manager
Reference Manager is most commonly used by people who want to share a central database of
references and need to have multiple users adding and editing records at the same time. You can
specify whether users are allowed to make edits to the database. Reference Manager offers
different in-text citation templates for each Reference Type. It is however limited to Windows
operating systems only. Use Reference Manager to:

Create a personal database of reference information
Insert citations and create bibliographies
Import bibliographic data from external databases and library catalogues

Reference Manager is used most effectively from the start of a project, when information is
being gathered, rather than when writing up begins.
Further details about the features of Reference Manager are available on the Reference
Manager website, along with an online overview of the new features of Reference Manager 12.

Self Assessment 5.3

Learning Track Activities

Unit Summary
Communicating research findings to interested stakeholders is very important
since research work is usually carried out to address a specific issue in
society. Journal articles and conference papers are among the commonest ways of
communicating research findings to stakeholders.
Theses/research papers are a more detailed way of disseminating research findings,
with the former directed more towards academics or scholars. It is also very
important for proper referencing to be done when putting together these
documents in order to avoid plagiarism.

Key terms/ New Words in Unit


Abstracts
Bibliography
Endnote
Journal articles
Referencing
Summaries

Unit Assignments 5

COURSE SUMMARY
The course is organised under five units. Introduction to research proposal writing and thesis
synopsis development is treated in unit 1, while engineering research design and data analysis is
treated in unit 2. Unit 3 looks at social science research design and data analysis, with unit 4
concentrating on statistical analysis using SPSS and STATA. Finally, unit 5 introduces the
concept of journal article/conference paper writing and thesis report preparation.
Unit 1 sought to introduce students to the preliminary stages of research, which involve the
preparation of concept notes that give a brief idea about the nature of the research. It also
tackled the preparation of a full research proposal, where it looked at the logical framework
analysis as well as detailed budget preparation. The unit ended with an introduction to thesis
synopsis writing.
Unit 2 dealt with the rudiments of engineering research design and data analysis, covering issues
such as the various contexts in engineering practice which necessitate research, the classification of
experiments that may be undertaken as part of the research and procedures for the design of
experiments. It went on to treat error theory and the various sources of research errors. The unit
also treated the concept of probability theory.
Unit 3 talked about social science research design and data analysis, where it looked at the
various research methodologies including survey research as well as case study research. The
unit also treated some basic research ethics, including balancing costs and benefits in research.

Unit 4 introduced statistical analysis software packages and their importance in increasing the
accuracy and speed of analysing data, especially sophisticated data. It went on to indicate that
planning and good policy making can only be done accurately if the requisite data analysis is done,
and done correctly. SPSS and STATA are some of the common statistical analysis software
packages that could be used in the statistical analysis of data such as census data.

Unit 5 dealt with putting together all the work done during the research into a document for
dissemination. It introduced the concept of journal article/conference paper writing, research
report/thesis writing and abstracts/summaries. The unit ended with a brief discussion of the various
referencing styles and a more elaborate explanation of the Harvard way of referencing.

APPENDIX A1
KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY

THESIS SYNOPSIS

NAME: EMMANUEL YEBOAH OSEI
INDEX NO: PG2678108
PROGRAMME: MSC MECHANICAL ENGINEERING (THERMOFLUIDS)
DEPARTMENT: MECHANICAL ENGINEERING
FACULTY: AGRICULTURAL AND MECHANICAL ENGINEERING
DURATION: TWO YEARS (FULL TIME)
TITLE: FEASIBILITY STUDY FOR A WIND POWER GENERATION PROJECT IN GHANA
SUPERVISOR: PROF. ABEEKU BREW-HAMMOND

E. Y. OSEI
(CANDIDATE)

DR. A. K. SUNNU
(HEAD OF DEPARTMENT)

PROF. BREW-HAMMOND
(SUPERVISOR)

August 2009

BACKGROUND
Modern life would come to a halt without energy and this makes it simply impossible to
live without it. Studies have shown that simply harnessing the power of oxen in ancient times, for
example, increased the power available to the human being by a factor of 10 (World Energy
Council, 2000). The invention of the vertical water wheel increased human productivity by
another factor of 6 (WEC et al., 2000). The use of motor vehicles and airplanes has drastically
reduced journey times and increased the ability of humans to transport goods over wider
distances. Because energy is the foundation of industrial civilization and conventional fossil
sources are being depleted, it has become necessary for the world to seek alternative
sources to meet the increasing demand.
Renewable energy sources are becoming increasingly attractive due to the limited fossil
reserves and the adverse effects associated with their use. They have the potential to provide
energy with zero or almost zero emissions of greenhouse gases and other air pollutants. The
renewable energy sources including solar, wave, wind, hydro, tidal, geothermal and bio-energy
are readily available and can provide complete energy security if their technologies are well
established (REN21, 2008).
Wind energy, first used by the Egyptians around the 4th century BC, is a promising source
of electrical power because it has key advantages such as cleanliness, low cost, sustainability,
popularity, safety and abundance in most parts of the world. Studies in Ghana indicate that the
monthly average wind speed measured at 12 m height above ground level lies in the range of
4.8-5.5 m/s (Akuffo et al., 2003). For wind speed of less than or equal to 4.4 m/s at a height of
10 m, the wind power density is less than or equal to 100 W/m² according to Li and Li (2005).
Despite this potential, the electrification rate in Ghana is 49.2% and 11.3 million people are
without electricity (IEA, 2006). The productivity of this large number of people is seriously
compromised and this constrains their opportunities for economic development and improved
living standards. This project seeks to assess the technical performance and determine the cost of
building a 50 MW wind power plant in Ghana.
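
The link between mean wind speed and wind power density quoted above can be checked with a short calculation. The sketch below is an illustration under stated assumptions and is not taken from Li and Li (2005): it assumes standard sea-level air density (1.225 kg/m³), wind speeds that follow a Rayleigh distribution, and a 1/7 power-law exponent for height extrapolation. The function names and the 80 m hub height are hypothetical.

```python
import math

AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (assumption)


def power_density(speed, air_density=AIR_DENSITY):
    """Power per unit swept area (W/m^2) of a steady wind: 0.5 * rho * v^3."""
    return 0.5 * air_density * speed ** 3


def mean_power_density_rayleigh(mean_speed, air_density=AIR_DENSITY):
    """Mean power density if wind speeds follow a Rayleigh distribution.

    For a Rayleigh distribution the mean of v^3 equals 6/pi times the cube
    of the mean speed, so the steady-wind value is scaled by that factor.
    """
    return (6 / math.pi) * power_density(mean_speed, air_density)


def speed_at_height(speed_ref, height_ref, height, alpha=1 / 7):
    """Extrapolate wind speed with the power law; alpha = 1/7 is an assumption."""
    return speed_ref * (height / height_ref) ** alpha


# Mean speed of 4.4 m/s at 10 m: the result comes out just under 100 W/m^2
print(round(mean_power_density_rayleigh(4.4), 1))

# Hypothetical hub height of 80 m, starting from 5.0 m/s measured at 12 m
print(round(speed_at_height(5.0, 12, 80), 2))
```

Because power density depends on the cube of wind speed, even the modest gain in speed at hub height has a large effect on the energy available.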

JUSTIFICATION
The need to ensure electricity supply security first came to light in the 1980s when
Ghana suffered a major drought resulting in reduced inflows to the Akosombo Dam. This
disrupted electricity supplies and adversely affected the performance of the economy. Today,
Ghana faces the challenge of providing reliable energy for the rapidly growing demand by all
sectors due to the expanding economy and growing population. It has been estimated that grid
electricity demand would grow from about 6,900 GWh in 2000 to about 18,000 GWh by 2015,
reaching about 24,000 GWh by 2020 (Energy Commission, 2006). The existing installed
electricity generating capacity of 1760 MW would have to be doubled by the year 2020 if Ghana
is to be assured of secure, uninterrupted electricity supply (Energy Commission, 2006). To
become wealthy as a country, Ghana needs GDP growth of between 8% and 10%, and such growth
rates require a significant amount of electricity (Brew-Hammond et al., 2007).
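
The demand projections quoted above imply an average yearly growth rate, which can be worked out as a compound annual growth rate. This is a minimal sketch with a hypothetical function name. It assumes smooth compound growth between the quoted years, which the source does not state.

```python
def annual_growth_rate(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Grid electricity demand projections (GWh) quoted from Energy Commission (2006)
rate_2015 = annual_growth_rate(6900, 18000, 2015 - 2000)
rate_2020 = annual_growth_rate(6900, 24000, 2020 - 2000)

print(f"2000 to 2015: {rate_2015:.1%} per year")
print(f"2000 to 2020: {rate_2020:.1%} per year")
```

Both periods work out to between 6% and 7% per year, which puts the projected electricity demand growth on a scale comparable with the GDP growth target mentioned in the text.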
Wind power use and development worldwide is growing rapidly, having doubled in the
three years between 2005 and 2008. The global wind industry installed close to 20,000 MW of
new capacity in 2007. This development, led by Spain, China and the United States, took the
worldwide total to 93,864 MW, which was an increase of 31% compared with the 2006 market
and represented an overall increase in global installed capacity of about 27% (GWEC et al.,
2008). In 2008, wind power accounted for 19% of the electricity production in Denmark, 10% in
Spain and Portugal and 7% in Germany and the Republic of Ireland. At the end of that same year, the
worldwide nameplate capacity of wind-powered generators was 120.8 GW (Wikipedia, 2009).
These success stories attest to the efficacy of wind power technology as a viable option in
providing energy and reducing environmental pollution.
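
The 27% figure can be cross-checked from the other numbers in the paragraph. The short sketch below treats "close to 20,000 MW" as exactly 20,000 MW, which is an approximation, and the function name is hypothetical.

```python
def percent_increase(previous, current):
    """Percentage increase from a previous value to a current value."""
    return (current - previous) / previous * 100


total_2007 = 93864   # MW, worldwide total quoted in the text
added_2007 = 20000   # MW, "close to 20,000 MW" treated as exact (approximation)
total_2006 = total_2007 - added_2007

print(round(percent_increase(total_2006, total_2007)))  # 27
```

A check of this kind is a useful habit when quoting statistics from secondary sources in a synopsis.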
The installation of a 50 MW wind power plant in Ghana is to augment the existing sources
of electricity in the country, which are mainly thermal and hydro. This will to some
extent help to ease the worsening energy situation in the country. Wind energy,
being a renewable source, has the ability to provide energy in a sustainable manner and with
virtually zero emission of pollutants and greenhouse gases.
The Energy Commission of Ghana in 2003 conducted a study to gather and analyze wind
energy data in some areas of the country (Akuffo et al., 2003). This data would help determine
the wind turbine technology to use and the estimate of the cost required for installation.
OBJECTIVES
The main objective of this thesis is to conduct a feasibility study of generating 50 MW from
wind energy in the coastal areas of Ghana.
The specific objectives are listed as follows:
1. To collate up-to-date wind measurements for Ghana's coastal belt.
2. To select the area best suited for wind power development along the coastal belt of
Ghana based on the collated data.
3. To select wind turbine technology in the 50 MW range best suited for the selected area.
4. To undertake the technical performance assessment and greenhouse gas emissions
analysis for a 50 MW wind power plant using the selected technology at the selected
area.
5. To do a financial analysis of building the selected 50 MW wind power plant.

METHODOLOGY
Literature would be sought in order to get acquainted with the relevant works that have
been done in the field of wind power. The areas of interest would include various wind flow
velocities in the world and particularly in Ghana, the energy situation in the country, standard
relationships between wind speed and the estimated power that can be generated per square metre,
the relevance of wind power in the country and wind turbine design technologies. Sources of
information will include the KNUST library, the internet, etc.
A prefeasibility study of a 50 MW wind power plant would be done using RETScreen with
in-built data and turbine specifications. The total initial cost will be determined as well as the
simple payback period. Greenhouse gas analysis will also be done.
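
To show what a simple payback estimate involves, the sketch below uses a simplified definition: initial cost divided by annual net income. It does not reproduce RETScreen's internal calculations. Every input value (capacity factor, cost per kW, tariff, operating cost) is a hypothetical figure chosen only for illustration, and the function names are likewise hypothetical.

```python
HOURS_PER_YEAR = 8760


def annual_energy_mwh(capacity_mw, capacity_factor):
    """Expected yearly energy output of a plant in MWh."""
    return capacity_mw * capacity_factor * HOURS_PER_YEAR


def simple_payback_years(initial_cost, annual_revenue, annual_costs):
    """Years needed for net annual income to repay the initial cost."""
    net_income = annual_revenue - annual_costs
    if net_income <= 0:
        raise ValueError("annual net income must be positive")
    return initial_cost / net_income


# Hypothetical inputs for illustration only (not from the synopsis)
capacity_mw = 50
capacity_factor = 0.25
cost_per_kw = 2000            # currency units per kW installed
tariff_per_kwh = 0.10         # currency units per kWh sold
annual_om_cost = 2_000_000    # operation and maintenance per year

energy_mwh = annual_energy_mwh(capacity_mw, capacity_factor)
initial_cost = capacity_mw * 1000 * cost_per_kw
revenue = energy_mwh * 1000 * tariff_per_kwh
payback = simple_payback_years(initial_cost, revenue, annual_om_cost)

print(f"Annual energy: {energy_mwh:,.0f} MWh")
print(f"Simple payback: {payback:.1f} years")
```

Simple payback ignores the time value of money, which is why a fuller financial analysis is usually carried out afterwards.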
Areas of the country best suited for wind power development will be selected based on
the recommendations of Solar and Wind Energy Resource Assessment compiled by the Energy
Commission of Ghana in 2003 and more recent data to be collected from them. The help of the
Ministry of Energy will be sought to approach private companies who have also made their own
measurements for coastal areas with the view to acquiring their data sets to be included with
those of the Energy Commission.
Wind turbine design technologies and their technical performance characteristics plus
their costs would be collected from the manufacturers and reviewed, and the ones best suited for the
country's situation determined. The comparison criteria will include merits and demerits,
technical considerations, applicability to the Ghanaian situation, etc. The technical assessment of
the whole plant will be carried out with the Wind Atlas Analysis and Application Program (WAsP)
designed by Risø National Laboratory.
The cost of building a 50 MW wind power plant in the areas of interest would again be
determined using the Computer Model for Feasibility Analysis and Reporting (COMFAR) software
package designed by UNIDO for feasibility studies.

WORK PLAN
TIMELINES FOR THE COMPLETION OF THESIS (2009)

Months: MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV (weeks 1 to 4 in each month; weeks 1 and 2 in NOV)

Activities:
Synopsis
Literature Review
Thesis Writing
Prefeasibility Study
Technical Assessment
Financial Analysis
Thesis wrap up
Submission of Draft Thesis

[Gantt chart: bars showing the weeks allotted to each activity]

BUDGET FOR COMPLETION OF THESIS

EXPENSES                                      TOTAL COST (GH¢)
Stipend (GH¢ 400 per month for 10 months)     4,000
Printing of Draft Thesis                      150
Printing of Final Thesis                      150
Total                                         4,300

REFERENCES
Akuffo F. O., Brew-Hammond A., Antonio J., Forson F., Edwin I. A., Sunnu A., Akwensivie F.,
Agbeko K. E., Ofori D. D., Appiah F. K. (2003). Solar and Wind Energy Resource Assessment
(SWERA). Department of Mechanical Engineering, KNUST.
Brew-Hammond A., Kemausuor F., Akuffo F. O., Akaba S., Braimah I., Edjekumhene I.,
Essandoh E., King R., Mensah-Kutin R., Momade F., Ofosu-Ahenkorah A. K., Sackey T. (2007).
Energy Crisis in Ghana: Drought, Technology or Policy? Kwame Nkrumah University of
Science and Technology, Kumasi, Ghana. ISBN: 9988-8377-2-0.

Energy Commission of Ghana (2003). Solar and Wind Energy Resource Assessment (SWERA).
Department of Mechanical Engineering, KNUST.

Energy Commission of Ghana (2006). Strategic National Energy Plan 2006-2020 and Ghana
Energy Policy. Main version.

Global Wind Energy Council, Greenpeace, Wind Power Works (2008). Global Wind Energy
Outlook 2008.

International Energy Agency (2006). World Energy Outlook. OECD/IEA, Paris.

Meishen Li, Xianguo Li (2005). Investigation of wind characteristics and assessment of wind
energy potential for Waterloo region, Canada. Department of Mechanical Engineering,
University of Waterloo, 200 University Avenue West, Waterloo, Ont., Canada, N2L 3G1.

REN21 (2008). Renewables 2007 Global Status Report. Paris: REN21 Secretariat and
Washington, DC: Worldwatch Institute.

Resource Center for Energy Economics and Regulation (2005). Guide to Electric Power in
Ghana, First Edition. Institute of Statistical, Social and Economic Research, University of
Ghana, Legon.

Wikipedia (2009). Wind Power. http://en.wikipedia.org/wiki/Wind_energy (accessed: 23 March 2009).

World Energy Council, United Nations Development Programme, United Nations Department of
Economic and Social Affairs (2000). World Energy Assessment: energy and the challenge of
sustainability. New York, NY 10017. ISBN: 92-1-126126-0.

APPENDIX A2
KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY
THESIS SYNOPSIS


NAME: FAISAL WAHIB ADAM
INDEX NO: PG3759209
PROGRAMME: M.Sc. MECHANICAL ENGINEERING
DEPARTMENT: MECHANICAL ENGINEERING
FACULTY: AGRICULTURAL AND MECHANICAL ENGINEERING
DURATION OF PROGRAMME: TWO YEARS (FULL TIME)
TITLE: A STUDY ON THE FAILURE OF THE FEMORAL SHAFT IMPLANTS
(A CASE STUDY OF THE KOMFO ANOKYE TEACHING HOSPITAL, KATH)
SUPERVISORS: DR. JOSHUA AMPOFO (MECHANICAL ENGINEERING, KNUST)
DR. R. KUMA AMATEPEY (TRAUMA AND ORTHOPEDIC DEPARTMENT, KATH)

SIGNATURES:

.........................
FAISAL WAHIB ADAM
(Student)

.........................
DR. A. K. SUNNU
(Head of Department)

.........................
DR. JOSHUA AMPOFO
(Supervisor)

BACKGROUND
The femur, or thigh bone, is the strongest, longest, and heaviest bone in the body and is essential for
normal ambulation. It consists of three parts: the femoral shaft (diaphysis), the proximal metaphysis, and the distal
metaphysis (Douglas et al., 2008).


Figure 1 (Wikipedia, 2010)

A femoral shaft fracture is a severe injury that generally occurs in high-speed motor vehicle collisions
and significant falls. It is often one of several major injuries sustained by the patient
(Jonathan, 2005). This type of fracture, like many other bone fractures, has become more common in
Ghana due to the exponential increase in the number of motor vehicle accidents.
In the United States, femoral shaft fractures are reported to follow a bimodal age distribution,
with peaks at 25 and 65 years of age and an overall incidence of approximately 1 per
10,000 people per year. Motor vehicle accidents are the most common cause, followed by pedestrian
versus automobile collisions, falls from height, and gunshot injuries (Jesse, 2008). A similar study by Hinton
et al. (2000) revealed that the rate of femoral shaft fractures in children in Maryland was 19.5 per 100,000
per year, the same as the overall incidence in Finland. In children aged 6 to 9 years, fractures were most
commonly caused by being struck by a car. Once children reached driving age, the
most frequent cause was a motor vehicle accident. This variation gave rise to a bimodal distribution with
peaks at 2 and 17 years.

In the Department of Orthopaedic Surgery and Traumatology, Obafemi Awolowo University Teaching
Hospital, Ile-Ife, Osun State, Nigeria, a study of reported fractures indicated that the bones involved
were distributed as follows: humerus 10%, femoral shaft 65%, and tibia 25% (Innocent et al., 2006).
Nowadays femoral shaft fractures in adults are usually treated operatively. As more and more
femoral shaft fractures are operated on, the number of complications has increased proportionately.
One such complication is implant failure. An implant is said to have failed if it is found to be inadequate
in performing the function expected of it.
Studying the causes of this failure for engineering purposes requires quantifying many factors.
The surgeon is aware of most of them but cannot assess the requirements of a particular
situation quantitatively as an engineer does. This is why an engineering analysis is needed to find these causes.

JUSTIFICATION
A discussion with a section of the orthopaedic doctors and nurses at the Orthopaedic Department (KATH),
Kumasi, Ghana, revealed an alarming rate of femoral shaft implant failures. This
calls for an objective assessment of the exact circumstances that lead to implant failure, because it is
necessary to prevent this complication in one of the major weight-bearing bones of the body.
Implant failure must be avoided in the human body because of
the devastating complications it can bring. For instance, a bend in the implant gradually removes the
thin film of oxide on its surface and hastens corrosion. If the implant is not removed, the metal sheds continually
so that the surrounding soft tissue slowly becomes saturated with metal particles, which may lead
to aseptic inflammation many years after implantation (Charles et al., 1959). Another complication is
shortening of the femur, which leaves the patient with torsion on the pelvic girdle.
The causes of implant failure are complex because they involve the engineer (designer), the
surgeon, the operating-room personnel, and the patient, all of whom can contribute to the
failure as well as to the success of the implant. From the standpoint of Mechanical Engineering, every
device has points of weakness at which it will fail when the margin of safety is exceeded. It is the
designer's responsibility to provide an adequate minimum margin, and it is the surgeon's not to exceed
that margin (Cohen, 1964).
A lot of work has been done on the failure of femoral shaft implants in many countries, but to my
knowledge the causes of failure of femoral shaft implants in operative orthopaedic practice have not
been reported for the Komfo Anokye Teaching Hospital, Ghana. Against this background, this work will study
the causes of failure of femoral shaft implants from the Mechanical Engineering point of
view. The mechanical properties of the implant will be tested to obtain the allowable stress, which will be
compared with the stresses acting on the implant, so as to suggest guidelines to minimize further
failures.
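
The comparison between the allowable stress and the stresses acting on the implant can be written compactly. The following is only a sketch under stated assumptions: it assumes the allowable stress is obtained by dividing a measured strength (here the yield strength) by a chosen factor of safety. The symbols \sigma_y, n, \sigma_{\text{allow}} and \sigma_{\max} are introduced for illustration and do not appear in the synopsis.

```latex
% Assumed definition: allowable stress = measured yield strength / factor of safety
\[
\sigma_{\text{allow}} = \frac{\sigma_{y}}{n}, \qquad n > 1
\]
% Assumed acceptance check: the largest stress acting on the implant
% should not exceed the allowable stress
\[
\sigma_{\max} \le \sigma_{\text{allow}}
\]
```

Under this reading, an implant whose largest working stress exceeds the allowable stress has used up its margin of safety, which is the condition the designer and the surgeon are each expected to guard against.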

OBJECTIVES
The main objective of this work is to find the causes of failure of the femoral shaft implants at the Komfo
Anokye Teaching Hospital (KATH).
The specific objectives are:
To determine the mechanical properties of the femoral shaft plate implant that is used, namely
-the tensile strength
-the modulus of elasticity
To determine the material composition of this implant
To identify the possible forces that could act on the bone and plate assembly
To find the possible causes of failure so as to suggest guidelines to minimise further failures

METHODOLOGY
Two patients with healed femoral shaft implants and three with failed implants, who presented at the Department of
Orthopaedics, KATH, Kumasi, Ghana, will be studied under the following headings:
Age
Sex
Body weight
Nature of primary injury
Anatomical site of the fracture
Type of primary fixation
Weight bearing
An X-ray of the fracture site will be taken together with the removed implant. The implant will be taken
to the mechanical engineering laboratory for tensile testing. The X-ray will aid the
computer modelling, in which the ANSYS software will be used to carry out a progressive failure analysis
and predict the forces that could have caused that kind of failure.
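
The tensile test yields the two properties named in the objectives. A minimal sketch of the standard engineering definitions follows. The symbols F (applied load), A_0 (original cross-sectional area of the specimen), L_0 (original gauge length) and \Delta L (extension) are introduced here for illustration and are not taken from the synopsis.

```latex
% Engineering stress and strain computed from the recorded load and extension
\[
\sigma = \frac{F}{A_0}, \qquad \varepsilon = \frac{\Delta L}{L_0}
\]
% Modulus of elasticity: slope of the stress-strain curve in its linear elastic region
\[
E = \frac{\sigma}{\varepsilon}
\]
% Tensile strength: the maximum recorded load divided by the original area
\[
\sigma_{\text{UTS}} = \frac{F_{\max}}{A_0}
\]
```

These are the quantities a universal testing machine record is normally reduced to; how the specimens are cut from the removed implants is not specified in the synopsis.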
Exclusion criteria
Infected implant failures and implant failures in pathological fractures of the femur will be excluded.


FACILITIES AVAILABLE
KNUST Library, Kumasi
The Universal Testing Machine (Mech. Dept., KNUST)
A and E Theatre (KATH), Kumasi
World Wide Web (Internet)
The Metallurgy Laboratory (Mech. Dept., KNUST)
ANSYS Software

REFERENCES
1. Jonathan Cohen. Failure in Performance of Surgical Implants. Journal of Bone and Joint Surgery.
http://www.jbjs.org (Accessed 2010 February 6).
2. Charles Orville Bechtol, Albert Barnett Ferguson, Patrick Gowans Laing (1959). Metals and
Engineering in Bone and Joint Surgery.
3. Alfred O. Ogbemudia, Phillip F. A. Umebee (2006). Implant Failure in Osteosynthesis of Fractures of Long
Bones. Journal of Medicine and Biomedical Research (College of Medical Sciences, University of Benin,
Nigeria).
4. C. R. F. Azevedo, E. Hippert Jr. (2002). Failure Analysis of Surgical Implants in Brazil.
5. Jesse T. Torbert. Femoral Shaft Fractures. http://www.orthopaedic.com (Accessed 2010 March 11).
6. Douglas F. Aukerman, John R. Deitch, Janos P. Ertl, William Ertl (2008). Femur Injuries and Fractures.
7. Richard Y. Hinton, Andrew Lincoln, Gordon Smith. Fracture of the Femoral Shaft in
Children: Incidence, Mechanism and Sociodemographic Risk Factors. Journal of Bone and Joint Surgery.
http://www.ejbjs.org (Accessed 2010 March 18).
8. Rozbruch, Roberts S; Müller, Urs; Gautier, Emanuel; Gans, Reinhold. The evolution of femoral shaft
plating technique. http://journals.lww.com (Accessed 2010 March 4).
9. Feres S. Haddad, Clive P. Duncan, Daniel J. Berry, David G. Lewallen, Allan E. Gross, Hugh P. Chandler.
Periprosthetic Femoral Fractures Around Well-fixed Implant; Use of Cortical Onlay Allografts with or
without a plate. Journal of Bone and Joint Surgery. http://www.ejbjs.org
(Accessed 2010 March 4).
10. RJ Brumback, S Uwagie-Ero, RP Lakatos, A Poka, GH Bathon and AR Burgess (1988). Intramedullary
Nailing of Femoral Shaft Fractures. Part II: Fracture Healing with Static Interlocking Fixation. Journal of
Bone and Joint Surgery.
11. RW Bucholz, SE Ross and KL Lawrence (1987). Fatigue Fracture of the Interlocking Nail in the
Treatment of Fractures of the Distal part of the Femoral Shaft. Journal of Bone and Joint Surgery.

DETAILED BUDGET FOR COMPLETION OF THESIS

EXPENSES                          Unit Cost (GH¢)   Period (month)   Total Cost (GH¢)
Visits to the hospital (KATH)     20                                 100
Printing and Binding of Thesis    200                                200
Stipends                          400               8                3,200
Books                             500                                500
Miscellaneous                     100                                100
Total                                                                3,800


WORKPLAN FOR COMPLETION OF PROJECT (2010)

Months: MARCH, APRIL, MAY, JUNE, JULY, AUGUST, SEPTEMBER, OCTOBER

Activities scheduled:
Literature review, Synopsis Writing, Sponsorship
Taking samples from hospital
Design of experiment
Chapters one and two
Testing of samples
Computer modeling
Chapter three
Analysis of results
Chapters four and five
Submission of draft thesis

[Gantt chart: the bars showing the duration of each activity are not reproduced here.]

APPENDIX B

/*#########################################################################*/
/* DO-FILES WRITTEN BY: FAISAL WAHIB ADAM                                  */
/* MECHANICAL ENGINEERING DEPARTMENT                                       */
/* KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY                      */
/*#########################################################################*/
*DATE: [05-03-2011]

*****************************************************************************
//FIRST RESULT (2 tables)
use "C:/Documents and Settings/Administrator/Desktop/Stata 10.0/Faisal/finalgraphV3a.dta"
log using result1, replace text
describe
// cross-tabulate s7dq13 against category1: frequencies first, then column percentages only
tabulate s7dq13 category1
tabulate s7dq13 category1, column nofreq
log close
exit
*****************************************************************************
//SECOND RESULT (1 graph)
use "C:/Documents and Settings/Administrator/Desktop/Stata10.0/Faisal/finalgraphV1.dta"
log using result2, replace
describe
// stacked percentage bars of s7dq13 within each category1 group;
// each line of the command ends in /// so that Stata reads it as one command
graph bar (count) hhid, over(s7dq13) over(category1) asyvars percentages stack ///
    title("DISTRIBUTION OF COOKING FUELS") ///
    subtitle("FOR THE VARIOUS INCOME QUINTILES IN GHANA") ///
    ytitle("Percentage of households") ///
    note("Source: Fifth Ghana Living Standards Survey") ///
    legend(position(3) cols(1) order(8 7 6 5 4 3 2 1))
log close
exit
*****************************************************************************
//THIRD RESULT (1 table)
use "C:/Documents and Settings/Administrator/Desktop/Stata 10.0/Faisal/finalgraphV3b.dta"
log using result3, replace text
describe
// regional distribution of the households whose s7dq13 code is 7
tab region if s7dq13==7
log close
exit
*****************************************************************************
//FOURTH RESULT (1 graph)
use "C:/Documents and Settings/Administrator/Desktop/Stata 10.0/Faisal/finalgraphV3b.dta"
log using result4, replace
// value labels for category1
label define category1 1 "1", modify
label define category1 2 "2", modify
label define category1 3 "3", modify
label define category1 4 "4", modify
label define category1 5 "5", modify
// short value labels for region
label define region 1 "UE", modify
label define region 2 "Nortn", modify
label define region 3 "UW", modify
label define region 4 "BA", modify
label define region 5 "Volta", modify
label define region 6 "Centl", modify
// codes 5, 6 and 7 are each defined twice in this block;
// with modify, the later definition replaces the earlier one
label define region 7 "Eastn", modify
label define region 5 "Westn", modify
label define region 6 "Ashti", modify
label define region 7 "Accra", modify
describe
// stacked percentage bars of s7dq13 by category1 within each region
graph bar (count) hhid, over(s7dq13) over(category1) over(region) asyvars percentages stack ///
    title("DISTRIBUTION OF COOKING FUELS") ///
    subtitle("FOR THE VARIOUS INCOME QUINTILES AND REGIONS IN GHANA") ///
    ytitle("Percentage of households") ///
    note("Source: Fifth Ghana Living Standards Survey") ///
    legend(position(3) cols(1) order(8 7 6 5 4 3 2 1))
log close
exit
*****************************************************************************

Selected Answers to Unit Assignments


[Supply selected answers to Unit Assignments]


Course Quiz/Exams
[Supply course quiz of this course here for the attention of the Institute's examinations officer]


RESEARCH/PROJECT AREAS AND RELATED TOPICS IN THIS COURSE
[Supply research/project areas and related topics in this course for use by students]


SOME CASE STUDIES IN THIS COURSE


MY PAGE
Name: _______________________________________ Learning Centre: _________________
Contact: Tel. ____________ Email: ________________ Emergency Name/Phone: __________
Important numbers: Student number ______________ Examination number ________________
Program: ___________ Year: _______ Course code/title: _______________________________
Course objectives: ______________________________________________________________
_____________________________________________________________________________
Course dates/Semester No ( ): Starts_____________________ Ends ____________________
FFFS schedule/Dates: ___________________________________________________________
Quiz dates: ____________________________________________________________________
Assignments hand in dates: _______________________________________________________
Revision dates: _________________________________________________________________
Group discussion/work members (names and contacts): _________________________________
______________________________________________________________________________
______________________________________________________________________________
End of course Self Evaluation:
I have completed all Units & interactive sessions, mastered all learning objectives, completed
all self assessments, unit summaries, key words and terms, discussion questions, review
questions, reading activities, web activities and unit assignments, and submitted all CA scoring
assignments, my learner feedback on this course, and my comments and course focus
contributory questions to the facilitator for discussion.
Self-grading: self assessment questions score ______ %

Unit Assignments scored ______ %

My course conclusion remarks: ____________________________________________________


______________________________________________________________________________
______________________________________________________________________________

____________________________________________________ (may continue on reverse side)


= = = = = = = = = = = = = = = = = = = = = = = = = detach and return to IDL, KNUST = = = = = = = = = = = = = = = = = = = =

Learner Feedback Form/[insert course code]


Dear Learner,
While studying the units in the course, you may have found certain portions of the text
difficult to comprehend. We wish to know your difficulties and suggestions, in order
to improve the course. Therefore, we request you to fill out and send the following
questionnaire, which pertains to this course. If you find the space provided
insufficient, kindly use a separate sheet.
1. How many hours did you need for studying the units?
Unit no.        1      2      3      4
No. of hours

2. Please give your reactions to the following items based on your reading of the
course.
Items                                         Excellent   Very good   Good   Poor   Give specific examples, if poor
Presentation quality
Language and style
Illustrations used (diagrams, tables, etc.)
Conceptual clarity
Self assessment
Feedback to SA

3. Any other comments (may continue on reverse side)


Unit 1: _______________________________________________________________
Unit 2: _______________________________________________________________
Unit 3: _______________________________________________________________
Unit 4: _______________________________________________________________
Unit 5: _______________________________________________________________
Unit 6: _______________________________________________________________