
COM 5312 RESEARCH METHODS IN COMMUNICATION

Course Notes by Professor C. Edward Wotring The Florida State University College of Communication With compilation assistance from: Kevin Hayworth, Paul Pilger, and Betsy Radtke 1997 C. Edward Wotring

Section I. Philosophy, Knowledge, and the Nature of "Knowing"
Section II. Communication Among Researchers: Professional Associations, Conferences and Journals
Section III. Defining Your Interests: Scientific Definitions
Section IV. Exploring Ways of Understanding Your Interests: Propositions, Theory, and Hypotheses
Section V. Studying Your Interests: Data Gathering
Section VI. Your Results: How To Analyze, Interpret, and Report Your Observations

Section I.
Philosophy, Knowledge, and the Nature of "Knowing"

There are no right or wrong decisions in research, only more or less defensible ones.

Introduction

The purpose of this section is to describe some basic viewpoints concerning the process of inquiry into communication phenomena. It addresses the common scientific/positivistic approach as well as the interpretive and critical philosophies and methodologies -- the so-called "alternative paradigms". These alternative approaches are re-emerging in communication with some vigor, and there have been continuing debates over the utility, validity and suitability of the so-called "mainstream" positivist/objectivist/scientific paradigm vs. these alternative interpretivist and critical/cultural paradigms. My hope is that these notes, textbook readings and class discussion will give you a basic understanding of the various ways of studying communication. We will be addressing these paradigms for the entire semester. It is my contention that there is not, nor will there be, a definitive answer to the question, "Which of these paradigms is the best for the study of communication?" I think that the different approaches are more or less appropriate for different communication questions and settings. Moreover, the pace of change is ever-quickening in the world and in the communication discipline; in these circumstances a scholar should not grasp any paradigm too firmly. Indeed, the mark of scholarship is an understanding of the various paradigms and a willingness to admit the follies of each and the frailties of science in general.

Science/Scientific Method
• A tool used to explain and predict phenomena
• Rather than discovery, science creates, or invents, explanations for observed phenomena
• The explanations that scientists create are called "theories"
• Science is interested in developing theories about ordered, recurring, observable phenomena in nature
• The purpose of science is to create theories which explain and predict patterned and repetitious phenomena through objective and replicable observation

Criteria for Judging Products of the Scientific Method

There are three major criteria used in judging the products of the scientific method:
• Utility: If a product of the scientific method can explain and predict, and is replicable, it has utility
• Symmetry: If a product of the scientific method makes logical sense, it has symmetry
• Parsimony: Also called "efficiency." The theoretic explanation or model that explains the phenomena of interest with the fewest assumptions and simplest structure is the most valued. Explaining the most with the least (Occam's razor).

** If a product of the scientific method has both symmetry and parsimony, it has elegance, or logical efficiency.

No Research is Value-Free: Researchers create their own values through their own experiences and conventions.
• Values cannot be removed from the researchers' observations
• Any observer brings a frame of reference to his or her observations, which has a direct influence on those observations
• Scientists use theory as a frame of reference for their observations
• Theory is a construction that researchers use to interpret the facts they observe

Ways of "Knowing"

There are five basic ways of knowing, or of gaining knowledge:

• Tenacity: We continue to do a certain thing a certain way because it has always been done in that particular way; tradition
• Authority: Authority figures, or experts, make assertions which we accept because we trust them and the statements they make
• A Priori: We believe certain things and accept them as true because they are self-evident; intuition; rationalism
• Magic: Supernatural intervention
• Science: Based on empiricism, science is descriptive, not prescriptive. We must rely on the first four ways of knowing, as well as science, to gain knowledge about the world

Scientific Knowledge Is Not Necessarily Truth

There are four fundamental reasons (among many others) that the empirical approach of science does not equate with the Truth:

1. Sensory Data: We rely on sensory data (sight, smell, touch, hearing, taste) to provide us with an accurate reflection of reality, but we can never be sure that the reflection is accurate because we cannot be sure that our senses are reliable (Descartes' Devil).

2. Logical Positivism: This is (was) the major philosophy (epistemology) underlying the scientific method in physics, chemistry, astronomy, psychology, etc. It was developed in the 1920s by a group of mathematicians, philosophers and scientists called the Vienna Circle. It states that meaningful statements about the world are only those that can be proven true or false through observation (or logic, in the case of mathematical statements, e.g., 2+2=4). However, particularly in the social sciences, but also in the physical sciences, we violate these rules. Three cases in point:
   A. We seek Universal Laws or Truths; these can never be proven true because it would take an infinite number of observations to do so.
   B. We study hypothetical constructs such as attitudes, beliefs, and values, which are not observable.
   C. A mathematician named Gödel showed that all logical systems are flawed -- they cannot be both consistent and complete.
Since we break our own philosophical rules, we cannot claim that our method gets us to the absolute truth, let alone that it is the only method to understand reality.

3. Life Span of Scientific Theory: All scientific theories change over time and sooner or later are replaced by better ones. Scientific theories and the nature of science are constantly changing, and if science is in constant flux, we cannot say with certainty that its answers are absolute. Someone once said that what we know is whatever theories are in vogue at any given time.

4. Observability: Science requires that a theory be tested through observable evidence. Particularly in the case of historical phenomena, evidence may have been destroyed or be otherwise unavailable. Explanations without available evidence are discounted even though they might be otherwise tenable. The debate about the fate of dinosaurs is a case in point.
Humanistic/Interpretivist Approach
• Qualitative
• Interpretivist
• Teleological: humans seek to fulfill their needs
• Humans act of their own volition, choosing their own behavior
• Humans interact with their environment rather than react to it
• Researchers look for rules that govern behavior as opposed to laws that determine behavior
• Instead of seeking to explain what is "out there" in the world, humanistic scholars are interested in how humans interpret what's "out there"
• "Johnny ran down the hill because he wanted to," as opposed to the S-R explanation that he did so because of reward history

Critical Approach
• Conflict theory; continental philosophy; cultural studies; Marxism
• Relies on proactive, value-laden research
• Attempts to persuade society to change for "the better"
• Focuses on class struggles and patterns of domination; haves vs. have-nots
• Gender studies; post-modernism; de-constructionism, re-constructionism

Scientific/Positivistic Approach
• Quantitative
• Relies on behaviorism: humans react to stimuli
• Cause-and-effect relationships between humans and environment (e.g., Pavlov's dog and the "ding-drool" model)
• Researchers seek Laws of behavior
• Observe, predict and explain what is "out there" in the world
• "Johnny ran down the hill because he has been rewarded in the past for running down the hill."

Empiricism
• Empiricism is the gaining of knowledge through experience
• Empiricism can be equated with the idea that "seeing is believing"
• The basis of empiricism is observation
• A phenomenon must be publicly observable, and the research to study it must be replicable, to fit within the realm of empirical knowledge
• Empiricism is the basis of qualitative and quantitative methods, and spans the three approaches: positivistic, interpretivistic and critical

Qualitative vs. Quantitative

Qualitative Analysis:
• Involves the examination and interpretation of observations for the purpose of discovering underlying meanings and patterns of relationships
• Does not allow for the same level of objectivity and generalizability as quantitative approaches, but provides much richer, more in-depth data, which often provide insights into subtle nuances of behavior that quantitative approaches might miss
• Very useful for exploratory research and in the early stages of theory development

Quantitative Analysis:
• Involves the numerical representation and manipulation of observations for the purpose of describing and explaining the phenomena that those observations reflect
• It is argued that quantification allows for more precision in analysis and greater ease in summarizing data and making inferences
• Attempts to be very objective, controlled and value-free
• It can lack the depth of some qualitative approaches

These approaches are not necessarily mutually exclusive! It is possible, and in many cases preferable, to combine qualitative and quantitative methods in the same study. A researcher should not "believe" in any particular methodology. The method(s) selected for a study should be those most appropriate to the questions being asked.

The Scientific, Humanistic, and Critical Paradigms: A Second Look
by Daryl Wiesman and C. Edward Wotring

Three broad paradigms, perspectives or approaches that have been used to study communication phenomena are the scientific, the humanistic, and the critical. These perspectives help the researcher in deciding what questions to study and/or hypotheses to test, how to study these (i.e., the appropriate methods to use), and how to interpret the results. While the paradigms are quite different in some respects, they also share similarities: 1) they all seek an understanding of communication that is helpful and useful, i.e., all share the assumption that knowledge is valuable; 2) they also share the assumption of materialism, which suggests a real world exists outside of our perceptions of it (as you will see below, the scientific scholar is more interested in studying that real world, while humanistic and critical scholars are more interested in our interpretations of that real world); and finally 3) they all rely on the basic assumptions of empiricism, which relies on sensory experience as a means of knowing the world (evidence). There are some radical scholars representing each of these paradigms who see the approaches as conflicting/oppositional. Other scholars, including us, see them as complementary and, more importantly, quite useful depending on which questions you choose to pursue as a researcher. What follows is a brief overview of these three paradigms and the philosophical assumptions that underlie each. What are described are "ideal types"; in practice you will see that theories can fall in between paradigms. The distinctions at times become blurry.

Paradigms

Scientific/Positivistic Paradigm

The scientific/positivistic paradigm in its ideal considers reality as an object to be studied outside of the self. The researcher is separated from that being studied; the researcher remains distant. Creswell (1994) writes, "Thus in surveys and experiments, researchers attempt to control for bias, select a systematic sample, and be 'objective' in assessing a situation." The scientific approach tends to use quantitative methods, although there are many instances in which more qualitative techniques are appropriate and are integrated into the research.

The scientific paradigm studies the world "out there." It attempts to describe, explain and predict phenomena in the "real world" through objective observation. Called the positivist, mainstream, and traditional paradigm, the scientific model seeks to "discover" the laws of nature. The scientific paradigm follows a deductive process, going from generalizations (theory) to prediction (hypotheses), explanation, and understanding. Replications of a true scientific study should yield the same results. If you want to purchase the best laundry detergent, objective tests such as those provided by Consumer Reports are probably better evidence than that provided by Madison Avenue advertisers. You want objective data, not spokesperson opinions.

The Humanistic/Interpretivist Paradigm

The humanistic paradigm relies on human interpretations of reality as the basis of understanding the world. Rather than focusing on the objective reality "out there", humanistic scholars are more interested in how people interpret that reality "in here" (in their conscious minds) and how they act based on those interpretations. Regarding this paradigm, Creswell (1994) maintains that, "Researchers interact with those they study, whether this interaction assumes the form of living with or observing informants over a prolonged period of time, or actual collaboration." The distance between the researcher and that being studied is minimized. It is difficult to separate the observer from the observed. The humanistic/interpretive scholar tends to use qualitative methods, but here, again, sometimes quantitative techniques are appropriate and useful. Instead of seeking laws of nature, humanistic scholars look for 1) rules which human groups develop to govern and guide their behavior, and 2) how these rules are developed and evolve. The research process is inductive: observations are made, and then generalizations are developed based on those observations. Also known as the "alternative" paradigm, the humanistic approach enables researchers to understand how individuals make sense of their own world. Some examples of humanistic research include critical analysis of texts, ethnographies, anthropological research, and historical studies. Because different people interpret the world in different ways, humanistic scholars allow for "multiple realities" and embrace a "constructivist" view of scholarship.

Therefore, to understand the humanistic scholar's findings, you need to know about that scholar's viewpoints and biases (these are usually provided as a part of the research report). If you want to know who gave the "best" presidential nomination speech at the Republican and Democratic conventions, you probably want the opinion of a rhetorical critic; if you want to know about the quality of recent films, you might read reviews by film critics whom you like/trust. Objective/scientific data are less useful here.

The Critical Paradigm

The critical paradigm is similar to the humanistic/interpretive paradigm, but goes a step beyond it in that it is very value-oriented. This paradigm takes a critical look at society and tries to identify inequities as well as ways to remedy them. Some critical scholars are quite proactive in that they use their research and influence in an attempt to change society for what they believe is the better. This paradigm is concerned with society and the hidden power structures that permeate it. The critical researcher makes judgments about what is right and wrong about society and determines what can be done to improve the society. Marxism, feminist studies, cultural studies, and deconstructionism/reconstructionism are examples of the critical paradigm. If you are interested in how portrayals of minorities on television have changed, whether those portrayals are fair and accurate, and whether such actors/actresses are paid the same and get the chance at "good" roles, then critical research may be appropriate. Equal accessibility to new communication technologies is another issue of interest to critical scholars.

Philosophical Assumptions

"Philosophy," maintains Littlejohn (1996), "questions the basic assumptions and methods of proof used in generating knowledge in all walks of life." Philosophical assumptions, though complex, can be grouped into three major categories: ontology, epistemology, and axiology. These assumptions are not mutually exclusive, as you will see. Furthermore, different theories don't cleanly fit on one side or the other, but fall along a continuum. The three paradigms discussed above differ somewhat in how they use each of these three assumptions. Below is a brief discussion of these key assumptions.

Ontology

Ontology deals with the nature of being. In the social sciences, ontology deals largely with the nature of human existence. In discussing ontology, four important issues arise (Littlejohn, 1996): To what extent do humans make real choices? To what extent are humans best understood in terms of states vs. traits? To what extent is human experience individual vs. social? To what extent is communication contextualized?

For our purposes, ontology can be grouped into two opposing assumptions, Actional and Nonactional.

Nonactional assumes that behavior is determined by outside causes and is therefore responsive to biology and the environment. Individuals do not make real choices but rather "respond" to "stimuli" and reward-punishment contingencies. Littlejohn (1996) continues, "Laws are usually viewed as appropriate in this tradition; active interpretation by the individual is downplayed." The mass-media effects research tradition is a good example of this assumption: violent portrayals in movies and television are believed to cause aggressive behavior tendencies in children. Stimulus-Response; Cause-Effect. Behaviorism in psychology and communication are other examples of this ontological view.

Actional assumes that people make real choices. People have intentions and act upon those intentions. Individuals create meaning and exercise free will. Littlejohn (1996) states that, "Theorists of the Actional tradition are reluctant to seek universal laws because they assume that individual behavior is not governed entirely by prior events." Rather, scholars seek the rules humans generate to govern and guide behavior. Uses and Gratifications is a mass media theory that exemplifies the Actional view -- how do people choose to use the media, and what gratifications do they seek and obtain? Other examples include semiotics, symbolic interactionism, rhetorical criticism, and feminist and cultural studies.

The scientific paradigm tends to make the Nonactional assumption, particularly in the hard sciences such as physics and chemistry. The humanistic and critical paradigms adopt the Actional assumption.

Epistemology

Issues of epistemology deal with how people know what they say they know. How should one go about studying the world? What is meaningful evidence? Epistemology is the branch of philosophy that studies knowledge. Basic epistemological questions are: To what extent can knowledge exist before experience? By what process does knowledge arise? What constitutes a meaningful statement about reality? How does one separate fact from fantasy?

Littlejohn (1996) discusses two broad worldviews that encompass differing epistemological positions: Worldview I and Worldview II. Worldview I asserts that reality is distinct from the human being and therefore waits to be "discovered". Objective methods that involve value-free verifiability are the means necessary to generate meaningful knowledge about the world "out there." Worldview I is scientific, deductive, Nonactional, and strives to be value-free. Objectivity is the key word.

Worldview II views the world in process. People take an active role in creating knowledge. Littlejohn (1996) says that, "Worldview II attempts not to uncover universal laws but to describe the rich context in which individuals operate." Worldview II utilizes inductive inquiry; it is necessarily subjective, and it studies human actions as opposed to reactions. The epistemological assumptions of the humanistic and critical paradigms are best categorized as Worldview II. Subjectivity is the key word.

Axiology

Axiology is the branch of philosophy that examines the values of the researcher and the extent to which such values enter into the research process. Regarding axiology, Littlejohn (1996) suggests three issues that are important to the communication scholar: 1. Can theory be value-free? 2. To what extent does the practice of inquiry influence that which is studied? 3. To what extent should scholarship attempt to achieve social change?

The scientific paradigm strives to be value-free. The humanistic and critical paradigms are clearly value-laden, with some critical scholars being value-driven. But realistically, no research is value-free.
Researchers create their own values through their own experiences and conventions (Wotring, 1996). Wotring lists four reasons why any research cannot be value-free:

1. There is a philosophical issue as to whether sensory experience is an accurate reflection of reality.

2. Any observer brings a frame of reference to his or her observations -- sensory experience -- which has a direct influence on those observations. Everyone must select what to observe, how to observe it, how to interpret what has been observed, what to remember about it (short and long term), and how to act on the observations. All of these selections are not random, but deliberate, taught, socialized, and in some cases perhaps instinctive. Psychologists suggest that humans develop "frames of reference" to help them organize their perceptions into meaningful patterns.

3. Researchers use theory as a frame of reference for their observations. They too must decide what to observe, etc.

4. As society changes over time, so do the questions, theories and methods employed by researchers. The needs and values of society directly impact the focus of research.

****************************************************************

A Handy Chart
Adapted from Creswell, 1994 (This is an oversimplification and depicts ideal types)

                 Scientific      Humanistic      Critical
Ontology         Nonactional     Actional        Actional
Epistemology     Objective       Subjective      Subjective
                 Worldview I     Worldview II    Worldview II
Axiology         Value-Free      Value-Laden     Value-Driven

****************************************************************

References

Creswell, J. W. (1994). Research Design: Qualitative and Quantitative Approaches. Thousand Oaks, CA: Sage Publications.

Littlejohn, S. W. (1996). Theories of Human Communication. Belmont, CA: Wadsworth Publishing Company.

Wotring, C. E. (1997). Class Notes: Research Methods in Communication Research. Tallahassee, FL: Florida State University, College of Communication.

Copyright 1997 Daryl Wiesman and C. Edward Wotring

Section II.
Communication Among Researchers: Professional Associations, Conferences and Journals

Major Academic Associations and Journals in Communication

Association for Education in Journalism and Mass Communication (AEJMC)
• Journalism and Mass Communication Quarterly
• Journalism Monographs

Broadcast Education Association (BEA)
• Journal of Broadcasting and Electronic Media

International Communication Association (ICA)
• Journal of Communication
• Human Communication Research
• Communication Theory
• Journal of Applied Communication

National Communication Association (NCA)
• Spectra [the NCA newsletter]
• Quarterly Journal of Speech [deals primarily with issues of speech and rhetoric]
• Communication Monographs [empirical studies in human communication]
• Critical Studies in Mass Communication [critical/cultural studies]
• Journal of Applied Communication Research
• Communication Education
• Text & Performance Quarterly

Conducting a Literature Search


• Decide on a topic of interest
• Find recent articles on that topic; check suggestions for future research in the conclusions sections; use the references to generate a bibliography
• Generate a list of "key words" for use with various sources
• Determine which journals, databases, or reference books would be most useful for your subject of interest by asking reference librarians, professors, or fellow students, or by consulting published descriptions of available sources:
  - Academic journals in communication (see above list)
  - Computer services: for example, LUIS, Sociofile, PsycLIT, ComAbstracts, ComIndex, Dialog, Lexis, Nexis
  - World-Wide Web sites: there are lots! A very relevant one is CIOS (Communication Institute for On-Line Scholarship). Also, most of our professional associations have web pages now.

Many of these services and sources are free to students and accessible through the University library system and our own computer lab, or from your home computer with a modem. Using these and other services, a thorough review of all pertinent literature on any given subject may be completed. Step-by-step instructions for each of these services are available to students.

Contents of Journal Articles

Two Vitally Important Terms -- Know the Distinction:

Rationale: The purpose of the rationale is to show that the hypothesis to be tested is a reasonable expectation based on the literature in the field, a logical extension of what is known from other research. Its purpose is to show the theory being tested by the hypothesis.

Justification: The purpose of the justification is to show why the study is important to do, that it studies an important issue. A study is important for one of two reasons: either it attempts to solve some social or practical problem (applied research) or it generates, tests and/or extends theory (theoretic, basic or pure research). Many studies do both.
Major Sections of a Journal Article, Thesis, or Dissertation

Title
• Very important that it contains the correct key words that accurately describe the variables being studied, particularly if a separate list of key words isn't provided by the journal
• This is important for electronic literature searches
• Be careful with your own titles -- think them through carefully

Introduction
• State and justify the problem or research question
• Justification is of two types: either the study is important for theoretic reasons (pure/basic research) and/or it is important for practical reasons (social/applied research)
• Historical background of the problem
• Central thesis, hypotheses
• Relevance to communication
• Methods sketch
• Outline of other sections/chapters

Literature Review/Rationale
• Builds the theoretic basis for the study. Reviews literature supporting the propositions/theory from which the hypotheses are derived or research questions founded
• Specifies hypotheses or research questions, i.e., specific expectations/predictions
• Rationale: the reason for your hypothesis or expectations, which should be developed from the previously cited literature; links the hypothesis(es) to previous literature

Methods
• Subjects, sample, participants: How were these units (people, animals, magazines, content, organizations, etc.) selected? What sampling method was used? What/who was observed? What generalizations can be made?
• Operational Definitions: What manipulations and measures/observations were performed? What was the reliability and validity of those manipulations/measures? How were the what/who observed?
• Research Design: What kind of research was performed -- true experiment, quasi-experiment, descriptive research, etc.? How were extraneous variables controlled?
• General Procedures: Step by step, how the study was carried out (for replication purposes)
• Analytic Procedures: How will data/observations be analyzed? This subsection can be here or at the beginning of the Results

Results
• Describes the data/observations that have been collected
• Application of various statistical procedures, descriptive and inferential, to the data to test the hypotheses and/or answer research questions
• Data-based answers and results of hypothesis tests; little interpretation

Conclusions
• Summary of the study to this point -- purpose, expectations, how data were gathered, what the results were
• Unoperationalize the data
• Implications -- the "so what" section; what is meaningful about the results; what are their practical and/or theoretic implications? Based on the literature review and expectations, what do we now know that we didn't before?
• Limitations of this study
• Future research

Overall: Propose and defend a conceptual problem; translate/operationalize it into observable data; analyze the data; translate/unoperationalize/interpret the data back to the conceptual problem.

Research Proposal

The research proposal (your final paper) contains:
I. Introduction
II. Literature Review and Rationale, including hypotheses and/or research questions
III. Proposed Methods
IV. References, and any endnotes or appendices


Writing Guidelines

Here is my version of the relevant sections of a journal article, doctoral dissertation/master's thesis, research report or proposal. With some modification, the sections are similar across document types. Since space is a limiting factor in journal articles, sections tend to be abbreviated, with the introductory chapter from the dissertation usually reduced to an abstract in the journal article. The exact sections, sub-sections and their order depend a great deal on one's major professor and the research design being employed (broadly: empirical, interpretivist, or critical). In the case of grant-related research proposals, sections depend on instructions given in the granting agency's request for proposals (RFP). In the case of journal articles, it depends on the style required by that journal (usually this is specified within the journal, or a style sheet is made available). At any rate, here are the normal sections of an empirical journal article:

1. Title (see above)
2. Abstract
3. Introduction (combines the first two chapters of a thesis or dissertation): statement of problem and justification, literature review, rationale and statement of hypotheses and/or research questions
4. Methods: subjects/participants/sample, operational definitions/questionnaire construction (including reliability and validity estimates), research design, general procedures
5. Results: description of data (descriptive statistics); application of descriptive and/or inferential statistical procedures to test hypotheses and/or answer research questions
6. Conclusions: theoretical and/or practical implications of findings, limitations, future research
7. References, footnotes, appendices, etc.

Here are the sections of a doctoral dissertation or Master's thesis:

(Title, committee signatures, abstract, acknowledgements, table of contents, list of tables, etc.)
Chapter I, Introduction: Purpose/statement of problem, justification, background/context, central thesis, relevance to communication, methods/design sketch, outline of the rest of the dissertation.


Chapter II, Literature Review and Rationale (sometimes these are two separate chapters): Introduction (what literatures/topics will be reviewed, in what order, making what points), literature review(s) with internal summaries, rationale leading to the statement of hypotheses and/or research questions.
Chapter III, Methods and Procedures: Introduction; subjects/participants/sample (who or what was studied, and how these people, objects, institutions, texts, etc. were selected); operational definitions/instrument construction, including reliability and validity checks; research design; general procedures followed in data collection; and (sometimes) analytical procedures.
Chapter IV, Results: Introduction; (sometimes) analytical procedures; data description (application of descriptive statistics); hypothesis testing and answers to research questions; exploratory analyses (application of inferential statistics).
Chapter V,

Conclusions: Summary of the study's purpose, thesis, methods, and results; implications (theoretical and/or practical implications of the study's findings for the state of the literature/theory being tested or extended, or for the social problem being investigated); limitations of the study in its focus, sample, operational procedures and design; recommendations for future research. (Footnotes, endnotes, references/bibliography, appendices, etc.)

As I will emphasize in class, redundancy is appropriate and terribly important for theses and dissertations. Follow the old adage ("Tell 'em what you're going to tell 'em, tell 'em, then tell 'em what you just told 'em"). Use introductions and summaries for this purpose. Also, what may seem redundant to you doesn't appear that way to a first-time reader of your paper. Use the examples of papers in this reader; go to the library and read dissertations and theses of recent communication graduates.

A dissertation/thesis proposal should contain the first three chapters, with the Methods chapter written in the future tense. Your dissertation or thesis committee signs this document when it is approved -- the proposal is, in effect, a contract between you and the committee, thereby delimiting the scope of your study. It is placed in your department file.

For your final paper -- the research proposal -- you will submit these three chapters plus references and any additional supporting materials. The document will be the same one you are working on in the Thesis Helper course, and hopefully it will serve as a first draft toward your actual thesis or dissertation proposal. Once you choose a major professor, you should discuss your research ideas with her/him. In addition to the proposal, I would like you to append an internal/external validity critique of your research design. I will discuss this in class.
Examples of Research Proposals


Section III.
Defining Your Interests: Scientific Definitions

Concepts

o A concept is an abstraction
o An abstraction is a grouping of otherwise dissimilar objects
o A scientific term is a label of a concept

Constructs
o A construct is a concept that is not directly observable
o A construct does not have a direct referent in the real world
o A construct must be measured through indicators

Types of Constructs

There are three types of constructs:


o Hypothetical: concepts that are hypothesized to exist but cannot be directly observed; examples are "internal mediators" such as attitudes, values, and beliefs
o Ideal Types: an ideal state or type; e.g., liberal or conservative, extrovert or introvert, perfect vacuum
o Constructions: a combination of concepts/constructs created by the scientist for the purpose of explanation/prediction; e.g., socioeconomic status (SES). Such constructions do not exist in observable reality

Explication
o The process of defining scientific terms to make them acceptable for a scientific dictionary
o The explication process has two parts: a conceptual definition and an operational definition

Conceptual/Constitutive Definitions
o Defines the attributes, the structural and functional characteristics, of a term
o Creates an abstraction
o Nominal definition: specifies what characteristics constitute a concept/term
o Renders a term meaningful in a scientific dictionary; links the term to other terms of a given language/dictionary; defines the term using other terms out of the same dictionary (psychological, sociological, chemical, etc.); these other terms specify the structural/functional attributes an object must have to be a member of the concept/class being defined

Evaluation of Conceptual Definitions


Conceptual definitions are evaluated using three criteria:

1. Clarity: The extent to which a term is understood and the ease with which we can clearly separate what is defined from what is not. Three aspects of clarity:
   1. Specificity or determinacy: How clear, how specific, are the attributes defining the term? (inversely related to scope/breadth)
   2. Uniformity of usage: Within the specific field, or science, does everyone define the term in the same way? How consistent is the usage?
   3. Avoidance of "lay" language: Common English terms should be avoided in order to be precise; this is difficult in the behavioral sciences, particularly communication

2. Scope/Breadth: How broad is this concept? What is included and excluded? At what level of abstraction is this concept? (inversely related to specificity or determinacy)

3. Systematic Import: Does the term relate systematically to other terms in propositions/hypotheses? Does the term relate to other terms in the field? Does the term help explain and predict the phenomena of interest? Does the term work in the hypothesis? Is the term useful to the theory? If it helps us understand what we want to understand, it is systematically important, i.e., it has systematic import; if not, it does not have systematic import
Operational Definitions

o A set of procedures which, when performed, produces an instance of the concept in the real world
o Necessary to render the term observable and make the concept empirical
o Two types of operational definitions:
  1. experimental/treatment/manipulation
  2. measured (nominal, ordinal, interval or ratio levels; qualitative or quantitative)
Evaluation of Operational Definitions

Based on four criteria:

1. Formal Clarity: Can the operationalization be replicated? Are the steps used to manipulate or measure the concept described in enough detail that you could perform them? (important for replicability and reliability)
2. Correspondence: Given whatever is being manipulated or measured, do the procedures in fact produce an instance of the concept being studied? Do the procedures measure what is intended to be measured? (validity)
3. Significance (or typicality): Is this a typical example of the concept? Do the procedures produce an important instance or a trivial instance of the concept? (validity)
4. Concept/Construct Independence: Are the operational procedures independent from those used to operationalize other (similar) concepts? Could the procedures be used to measure another concept? Do the procedures measure/manipulate this and only this concept? To the extent the procedures can measure another concept, the two concepts are not independent of one another. That makes the language vague.

Section IV.
Exploring Ways of Understanding Your Interests: Propositions, Theory, and Hypotheses
Proposition
o A statement of relationship between or among two or more concepts
o Propositions are tested with data
o All propositions are hypotheses; none can ever be "proven" absolutely true
o Propositions vary in level of abstraction
o Propositions hold under certain conditions

Typologies to evaluate the nature of the relationship

Typology 1

Causal or Associated

o Is there merely an association between two concepts, or is there a causal relationship?
o If the relationship is determined to be causal, it may be controllable
o Causality is difficult to determine

Directionality
o Can be positive or negative
o Positive relationship: as a change occurs in one variable, a similar change occurs in the other (as x goes up, y goes up; as x goes down, y goes down)
o Negative/inverse relationship: the variables change in an opposite manner (as x goes up, y goes down)

Shape
o The curve of the relationship between the graphed variables
o Used to describe the relationship mathematically
o The relationship may be linear or nonlinear, i.e., it can take any shape; we try to describe that shape mathematically

Strength
o The percent of variance in Y explained by X (r-squared -- the square of the correlation coefficient between the two variables); this does not indicate causality, only the strength of the relationship (more on this later)
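The idea of strength as explained variance can be sketched in a few lines of code. This is only an illustration -- the data, the variables, and the numbers are all made up:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: weekly TV viewing hours (x) and an aggression score (y)
x = [1, 2, 3, 4, 5, 6]
y = [2, 3, 5, 4, 7, 8]

r = pearson_r(x, y)
# r-squared is the percent of variance in y explained by x
print(f"r = {r:.3f}, r-squared = {r ** 2:.3f}")
```

Note that a large r-squared says nothing about which variable causes which; it only quantifies the strength of the association.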

Conditions under which the relationship holds true


o Most, if not all, relationships among variables found in the social sciences are conditional, i.e., the relationship is contingent on the presence of other variables or conditions

Typology 2 (Hans Zetterberg, On Theory and Verification in Sociology)

Reversible or Irreversible

o Can the statement of relationship be reversed and retain its meaning?
o i.e., A leads to B and B leads to A vs. A leads to B but B never leads to A

Deterministic or Stochastic
o Deterministic: the relationship always occurs
o Stochastic: the relationship occurs a certain percentage of the time
o i.e., A always leads to B vs. A sometimes leads to B

Sequential or Coextensive
o Sequential: if x, then later y will occur
o Coextensive: if x, then immediately y will occur

Sufficient or Contingent
o Will the relationship occur by itself, or is it contingent on another variable?
o i.e., if A then B regardless of anything else vs. if A then B only if C is present

Necessary or Substitutable

o Is one variable necessary for the relationship, or will other variables have the same effect?
o i.e., is TV violence necessary for children to be aggressive, or are there other variables that will make them aggressive? If A and only if A, then B vs. if A then B, and if C then B

Theory
o A set of interrelated propositions from which testable hypotheses can be deduced
o A general proposition
o Can explain past and present reality and predict future realities
o Most important criteria for a theory: utility in prediction and explanation, parsimony, symmetry

Hypothesis
o A specific prediction based on the theory
o Two logical methods:

Deduction: theory is developed from general to specific; i.e., develop hypotheses from theories; theory testing (the hypothetico-deductive model, standard scientific paradigm)

Induction: hypotheses are developed from specific to general; i.e., develop generalizations from specific observations; theory creation; grounded theory; ethnographic research
Verification and falsification of theory
o Test the hypothesis by operationalizing it and collecting data which supports or fails to support the hypothesis
o Never directly test a theory; support it through the testing of logical hypotheses
o A theory can never be PROVEN true in all possible cases (verified); it can only be supported in those cases tested; we can, however, falsify a theory through hypothesis testing
o If a theory says that something should occur and it does not, the theory is flawed as constructed (or the methodology of the study was flawed)
o Replication is important
o Supporting a hypothesis is necessary to verify the theory but not sufficient to verify the theory
o Researchers place a high value on falsifiability
o A theory must be able to be falsified. If a theory is not falsifiable in principle, it is not really testable and (from an empirical viewpoint) it is worthless

Criteria to judge a theory


NOT truth: in place of judging a theory on truthfulness, we use these criteria:

1. Utility: Does it work when tested? (predictability, explanatory power, replicability)
2. Symmetry: Does the theory make logical sense? Is it logically valid?
3. Parsimony: logical efficiency, simplicity

These criteria are weighed differently in the positivistic, interpretivist and critical paradigms. E.g., qualitative researchers may not look for the most parsimonious explanation because they are looking for a rich understanding of phenomena.

Some More Notes on Propositions and Theory

Propositions are statements of relationship between or among two or more concepts. The aim of science is to establish such general relationships among properties in nature. The more general, and the more universal in application, the better. The properties are concepts or constructs, and the relationship will be discussed below. The test of propositions is in data, i.e., operationalization/observation. Propositions vary in level of abstraction and accordingly, names. (Presumptive hypotheses, theoretic statements, general principles, predictions, hypotheses, theorems, axioms, etc.). However, all propositions are hypotheses in that none can ever be proven absolutely true. Most propositions hold under certain specific conditions. Do not confuse propositions with definitions.
The Nature of Relationships.

There are several characteristics of relationships that we wish to know:

1. Causal vs. associational. We tend to be more interested in causal relationships because if we can determine the causes of a given variable, we can then have some degree of control over it.
2. Direction of the relationship (+, -, 0).
3. Shape of the curve of relationship between various values of x and of y (a graph). From this we can determine if the relationship is linear or curvilinear, and, depending on other factors, we can describe the relationship mathematically.
4. Strength of the relationship, usually described as the % of variance explained in y by x (and vice versa).
5. Level of generality or abstraction of the concepts/constructs that are being related.
6. The conditions under which the relationship holds. All relationships only work under certain conditions.

As if these weren't enough, we can further our understanding of relationships by considering these additional characteristics developed by Zetterberg (referenced in the recommended readings at the beginning of the syllabus). These are not completely independent from those listed above. Relationships can be described as one or the other of each of the following characteristics:

1. Reversible or irreversible;
2. Deterministic or stochastic (probabilistic);
3. Sequential or coextensive;
4. Sufficient or contingent; and
5. Necessary or substitutable.

For example, the hypothesis Exposure to televised violence will cause an increase in post-viewing aggressive behavior is most likely irreversible, stochastic, sequential, contingent and substitutable.
Theory, Explanation, and Rationale

Simply stated, a theory is a general proposition. A hypothesis is a specific prediction based on the theory. We then test the hypothesis by operationalizing it and collecting data which supports or fails to support the hypothesis, which in turn supports or fails to support the theory. In other words, we never directly test a theory; rather, we deduce a specific consequence of the theory and then test to see if that consequence occurs as expected. If it does, we have supported the theory at least in this one instance. After many such tests

we become more and more comfortable with the theory, but can never prove the theory true. That would take an infinite number of tests.

A rationale is that part of a research proposal, thesis, dissertation, research article, etc. in which a hypothesis, to be tested by the study, is shown to be a reasonable expectation based on previous research; i.e., the hypothesis is shown to follow from (is deduced from) a more general principle (theory). There is a review of literature which should provide the basis for making the hypothesis. The literature review is normally structured around a theory or general principle or set of principles. Studies that are reviewed are organized by those supporting or failing to support the theory. The hypothesis should be consistent with, and an extension of, what is already known. It should follow from the theory.

A theory is a general proposition or set of logically interrelated general propositions from which testable hypotheses can be deduced. The testable hypotheses are predictions based on the theory. The simpler the theory (parsimony) the better. Also, the number of hypotheses that can be generated by the theory is that theory's predictive power.

The hypothesis is the prediction made by the theory. By the same token, the theory provides the explanation for the hypothesis. The theory explains the hypothesis; the theory provides the explanation for all other hypotheses that can be deduced from the theory. Rationale and explanation mean the same thing (as do thesis and central argument). The theory is the rationale for the hypothesis; the theory is the explanation for the hypothesis.
While the theory is the explanation for this specific hypothesis, its explanatory power refers to how much of a phenomenon in nature it purports to explain, and how well it seems to explain the phenomenon in hindsight. Predictive power refers to how well the theory works when put to the test: does it make predictions (hypotheses) that are supported by empirical data?

Example: Hyp.: Teenagers exposed to television programs in which teens wear peculiar clothing and dance in funny ways will themselves wish to behave in these manners. (This is stated imprecisely, but you get the idea.)

Theory: The key theoretic principle here is the relationship between symbolic

reinforcement and behavior (social learning theory). People don't have to be rewarded themselves -- if they see another person rewarded for a particular behavior, they will be more likely to perform that or similar behaviors. This effect is enhanced if the observed persons receiving the reward are models (people with whom the observer identifies).

Rationale: Here we deduce the hypothesis from the theory by defining the televised teens as models, and the clothing and dancing as behaviors that are being reinforced on the program (acceptance by peers, particularly acceptance by people of the opposite sex). The presence of these subtle reinforcers is important here. To the extent that the viewers themselves want these same rewards (and they should by definition), the viewers will want to perform these same or similar behaviors. Therefore: Teenagers exposed to television programs in which.......

The rationale links the hypothesis to the theory by showing that the hypothesis is a particular instance of the theory. It is an example of the more general theory. If the theory is true, then the hypothesis should be true, and specific observations should support the hypothesis. The particular terms in the hypothesis are examples of the more general terms in the theory (e.g., the televised teens are examples of models, dancing together and smiling/hugging are types of symbolic reinforcement, and dancing and dress are the specific behaviors being rewarded).
Relationship of Theory, Hypothesis and Data

There are two stages of deduction taking place. These are referred to as higher-order explanation and lower-order explanation. The hypothesis is deduced from (explained by) the theory. This is called higher-order explanation. The specific data for a study are deduced from (an instance of) the hypothesis. This is called lower-order explanation. The outcome of the study is explained by the hypothesis (the data are what you predicted), and the hypothesis is, in turn, explained by the theory.

    THEORY (Social Learning)
       |
       |  } Higher-Order Explanation/Rationale
       |
    HYPOTHESIS (Exposure to teen dancing programs and subsequent behaviors of the viewers)
       |
       |  } Lower-Order Explanation, Operationalization
       |
    DATA (Measures of specific behaviors of a particular sample of teens following exposure to a particular televised teen dance program like that stupid one on channel 33)

NOTE: The data (study outcome) are an instance of the more general hypothesis. The hypothesis is an instance of the more general theory.

ANOTHER DEFINITION OF THEORY: A theory must have 1) Presumptive Hypotheses (theoretic propositions or higher-order statements -- really the theory proper), 2) a Dictionary (containing conceptual and operational definitions which ensure that the theory is empirically based), and 3) a Calculus or Syntax (in our case the system of deductive logic with which to deduce testable hypotheses; in the physical sciences this system would be mathematics). Some references include 4) a Model, which is a symbolic representation of the theory.
Verification vs. Falsification

I have said that it is impossible to truly verify a theory, i.e., prove it true. It would take an infinite number of observations -- past, present and future -- to prove a general principle true. However, falsification is possible. If the theory says something should occur, and it doesn't in even one case, then the theory is false as it is presently configured. Supporting a hypothesis in any particular study is necessary to verify the theory from which the hypothesis is deduced, but not sufficient to verify the theory. Falsification can take place in one hypothesis test, assuming the data clearly does not support the hypothesis and that there are no fatal flaws in the study itself. In practice, replication is important. It would take several studies all demonstrating non-support to modify or scrap the theory. But in principle, falsification is possible where verification is impossible. For this reason, the notion of falsifiability is very important to researchers. Theories must be falsifiable and all tests are done in

an attempt to do just this. If a theory isn't falsifiable, then it really isn't testable and is worthless -- from the empirical viewpoint.
Types of Theories

Reynolds (referenced in the syllabus) discusses three types of theories (so does Littlejohn, the text in COM 5401):

A. Set of Laws. This type of "theory" is a laundry list of low-level relational statements that stay pretty close to the data. The "set" means that they are similar in that they all refer to the same general phenomena. However, they are not hierarchically related. A good example of a set of laws is Skinner's learning theory. Some people would argue that a set of laws isn't a theory at all because there is no higher-order explanation for the laws. The laws are no more than well-supported hypotheses without any explanation. As such, laws aren't theories in the proper sense of the term. This doesn't make them unimportant, however.

B. Axiomatic Theory. The foundation of this type of theory is the axiom, a higher-order assumption that always remains unproved and is never directly tested. All other propositions in the theory are derived from these axioms or assumptions. Geometry is an example, along with some of the social science theories. This type of theory is particularly popular in cognitive psychology, where hypothetical constructs abound. These constructs are assumed to exist and are placed in higher-order theoretic propositions. Testable hypotheses are deduced which contain only observable variables that can be directly measured.

C. Causal Processes. Here the theory is a specification of causal relationships among a set of concepts/constructs. In a sense it is a model of a phenomenon. Usually it is tested using some of the newer multivariate statistical techniques such as path analysis or log-linear modeling. What is interesting about this approach is that it is not piecemeal in the testing process. It tries to test the whole model at once.
Where do theories come from?

This sounds like the beginning of a bad joke. However, I believe that theories are creative explanations that people dream up to make sense of the happenings that surround them. In the scientific arena, these theories must be clearly explicated and rigorously tested. Theory creation begins with observation. Then

the observations are organized into concepts (a taxonomy), then the concepts into relational statements. At all stages there is human creativity.
How do theories evolve?

Kuhn (The Structure of Scientific Revolutions, referenced in the syllabus) suggests that theories develop and change through:

1. Extension. A theory expands into a more general area, taking in more and more phenomena.

2. Intension. The theory becomes more precise and refined in its definitions of concepts and description of relationships. This is depth vs. breadth (above).

3. Revolution. This is the idea Kuhn is famous for. Here an anomaly occurs, which is an event that cannot be explained by the theory but should be explained. In effect, the theory/paradigm is disproved. A whole new paradigm takes over, which explains everything the old paradigm explained and explains the anomaly. Rather than evolution, this is revolution. Examples of such revolutions involve Copernicus/Galileo vs. Ptolemy, and Freud vs. traditional psychological theory. Revolutions are rare.

Robert Merton, a sociologist and early communication scholar, introduced three levels of theory: Grand, Middle-Range, and Set of Laws. Grand would be similar to axiomatic theory (above), and Middle-Range is just that. Merton then suggests that Middle-Range theories evolve to explain Sets of Laws, and Grand theories are developed to envelop Middle-Range theories. Most communication theories are at the Middle-Range. This brings us to the last type of evolution -- reduction.

4. Reduction. This means "dumping" the specifics into more general categories/theories, and is the process of subsuming Merton talked about. Reductionism or the "reductionist" argument refers to trying to find one overarching theory that explains everything.

Section V.

Studying Your Interests: Data Gathering


1. Selecting Units to Observe: Subjects and Sampling

Survey research is very concerned about representativeness (external validity), so sampling procedures are very important. Experimental research is more concerned about demonstrating causal relationships, so control of extraneous variables (internal validity) is of primary importance; usually subjects are obtained using non-probability techniques.
A. Probabilistic Sampling
o Involves random sampling techniques
o Allows generalizations from a sample to a population with a known amount of error at a certain level of certainty (e.g., we are 95% certain that the sample estimate is off by no more than 3 percentage points)
o The most effective means for selecting representative subjects
o Avoids researcher biases: population units are selected by chance and not by the researcher
o Permits estimates of sampling error
o Requires a sampling frame -- a listing of everyone (every object) in the population. Sample units are selected from this list such that all members of the list (sampling frame) have an equal and independent chance of being selected, i.e., units are selected randomly -- chance determines which population units end up in the sample
o Non-response bias (non-sampling error) is always a problem. If only a portion of the random sample responds, is the non-response random, or do the non-respondents constitute a particular segment of the intended population? Non-response can bias the results. Always compare sample characteristics (demographics, etc.) to known population characteristics to check for non-response bias

Probabilistic sampling techniques:

Simple random sampling


o Assign a single number to each element, not skipping any number in the process
o Use a random number table to select elements for the sample

Systematic sampling
o Every nth element in the total population list is selected for inclusion in the sample, with a random starting point
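A sketch of the procedure (the population list is hypothetical; the interval k is simply the frame size divided by the desired sample size):

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth element of the list from a random starting point."""
    k = len(frame) // n                       # sampling interval
    start = random.Random(seed).randrange(k)  # random start in [0, k)
    return frame[start::k][:n]

frame = list(range(1000))  # hypothetical population list
sample = systematic_sample(frame, 50, seed=1)
print(len(sample), sample[1] - sample[0])  # 50 elements, interval of 20
```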

Stratified sampling
o Grouping members of a population into homogeneous strata before sampling
o Reduces the degree of sampling error
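A proportionate stratified sample can be sketched like this; the strata and their sizes are invented, and allocation here is proportional to stratum size:

```python
import random

def stratified_sample(strata, n, seed=None):
    """strata: dict mapping stratum name -> list of members.
    Samples each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(m) for m in strata.values())
    sample = []
    for members in strata.values():
        share = round(n * len(members) / total)  # proportional allocation
        sample.extend(rng.sample(members, share))
    return sample

# Hypothetical strata: 400 freshmen and 100 seniors
strata = {
    "freshmen": [f"fr_{i}" for i in range(400)],
    "seniors":  [f"sr_{i}" for i in range(100)],
}
sample = stratified_sample(strata, 50, seed=7)
print(len(sample))  # 40 freshmen + 10 seniors = 50
```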

Multistage cluster sampling


o Used when a population list does not exist
o A sample of clusters is selected
o Members of the selected clusters are listed
o The list of members within each selected cluster is subsampled
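The stages can be sketched as follows; the cluster names and sizes are hypothetical:

```python
import random

def multistage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """clusters: dict mapping cluster name -> list of members.
    Stage 1 samples clusters; stage 2 subsamples members within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: clusters
    sample = []
    for name in chosen:                                # stage 2: members
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return sample

# Hypothetical: 10 dorms of 60 residents each; no campus-wide list needed
clusters = {f"dorm_{d}": [f"dorm_{d}_res_{i}" for i in range(60)]
            for d in range(10)}
sample = multistage_cluster_sample(clusters, 3, 10, seed=3)
print(len(sample))  # 3 clusters x 10 members = 30
```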

B. Nonprobabilistic Sampling
o Does not rely on random sampling techniques
o Should not be used to make generalizations to a larger population; while a purposive sample may be generalizable, you cannot be as sure as with random sampling
o Used when it is either impossible or unfeasible to select a probabilistic sample, i.e., when there is no sampling frame of the population (e.g., drug abusers)
o Less valid than probabilistic sampling
o Practical

Nonprobabilistic sampling techniques:


o Purposive Sampling: researcher uses own judgment in the selection of members
o Chunk Sampling: a convenient group of people (a class of students, etc.) is used as a sample
o Judgment/Expert Sample: go to an expert to select the sample
o Quota Sampling: subjects are selected on the basis of pre-specified characteristics so that the total sample will have the same distribution of characteristics as are assumed to exist in the population; the interviewer is told to find people with those characteristics
o Volunteer Sample: ask for volunteers to participate, usually through newspaper ads, radio ads, etc.
o Snowball Sampling: subjects volunteer others to participate

C. Sample Reliability or Reliability of Estimate


To determine sample reliability, compute a standard error of estimate. For a sample proportion the formula is:

    SE = sqrt[ p(1 - p) / n ]

For a sample mean it is:

    SE = s / sqrt(n)

where p is the sample proportion, s is the sample standard deviation, and n is the sample size. Theoretically, 95% of all random samples of size n drawn from a given population should fall within 1.96 standard errors of the center of the population (the true mean or proportion you are attempting to estimate).

Sample reliability is directly related to sample size. Larger samples have less random sampling error than smaller samples. When the sample size reaches the size of the population, i.e., becomes a census, random sampling error is reduced to zero. This has no effect on validity, or systematic error, however. A very large sample drawn from the wrong population will be very reliably wrong. And 100 samples drawn from the wrong population, or incorrectly drawn, will all be consistently wrong. Reliability does not equal validity.
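The standard error of estimate can be computed directly in code. A minimal sketch; the survey figures here are invented:

```python
import math

def se_proportion(p, n):
    """Standard error of a sample proportion p from a sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def se_mean(s, n):
    """Standard error of a sample mean (s = sample standard deviation)."""
    return s / math.sqrt(n)

# Hypothetical survey: 55% of a sample of 400 students support a position
p, n = 0.55, 400
se = se_proportion(p, n)
low, high = p - 1.96 * se, p + 1.96 * se  # 95% confidence interval
print(f"SE = {se:.4f}; 95% CI = {low:.3f} to {high:.3f}")
```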

D. Sample Validity or Representativeness


o Sampling frame and sampling procedures are the critical factors in sample validity or representativeness. Does the sampling frame in fact contain all elements of the correct population? Was the sample correctly drawn from the right population? Is there any non-response bias?
o Sample size does not affect representativeness. A very large sample drawn from the wrong population will be inaccurate; it will be reliably wrong.

E. More Detailed Notes on Sampling Theory Reliability and Validity

Reliability generally refers to freedom from random error -- precision in measurement or sampling; consistency; replicability; the extent to which you are measuring or sampling something consistently.

Validity generally refers to freedom from systematic error or bias -- accuracy of measurement or sampling; the extent to which you are getting at the "true" measure, or the "true" population parameter in the case of sampling; correctness; are you measuring what you intended; is the sample representative of the intended population.

Research design can be viewed the same way. Reliability would refer to replicability -- if you repeat the design, do you get consistent results? Validity refers to internal and external validity -- did you isolate the "true" cause of an observed effect, and what is the generalizability of the findings? For descriptive research, does the study give you an accurate reflection of the phenomena of interest?

If a study lacks reliability it cannot be valid. If you aren't measuring or sampling anything consistently, you cannot be measuring or sampling the right thing. On the other hand, you can have reliability without validity -- you can very precisely measure the wrong thing, or very reliably sample the wrong population. The first issue, then, is whether you are measuring or sampling something consistently; the second issue involves whether the "something" that you have isolated is what you intended -- is it the correct variable, or is the sample representative of the right population?
Random and Systematic Measurement Error

If I administered a paper-and-pencil measure of extroversion to you, the measure would result in a score. If I administered the same measure to you several times, unless the measure were perfect I would likely get a range of scores. That range (assuming you haven't changed personality) is the amount of unreliability/instability/inconsistency in the measurement instrument, also referred to as random measurement error. (Reliability is estimated in a number of ways which we will discuss later, but the most basic measure is a test-retest correlation coefficient, ranging from 0 to 1; the closer to 1 the better.)

How should I decide among the several scores you achieved on my extroversion scale? Most researchers would pick the average as the "best" measure, with the + and - scores above and below it indicative of random measurement error, or unreliability. The validity issue is: does this average score accurately reflect your actual level of extroversion? Is the score "correct"? (There are a number of ways to check this -- content, expert, comparisons with other measures of similar variables, etc. We will discuss them later.) If it isn't accurate, it's wrong, and even if the measuring instrument is reliable, it's still wrong -- reliably wrong -- every time we use the measure. It is systematically wrong, hence "systematic" error. Whatever it is measuring, it isn't extroversion.
Random and Systematic Sampling Error

The same logic applies with sampling. Assume we want to know how many hours students at FSU watch television per week. We draw a random sample of 100 students and ask the question in a survey. The mean response for the sample is x hours. If we drew several samples, unless something weird happens, we will get a range of means. This range of values is indicative of random sampling error, or unreliability.

How should we decide among the several sample means (estimates)? Most researchers would take the average of the several estimates (the mean of means here), with the means above and below the average indicating random sampling error, otherwise termed "error of estimate". The validity issue is: does the average number of hours reflect the actual number of hours FSU students watch TV per week? Is the estimate "correct"? Again, if it isn't, it's wrong, and even if the estimate is reliable -- i.e., across the several samples we are getting similar results -- we are getting reliably wrong estimates: systematic error. We somehow have gotten a biased sample, which can happen for any number of reasons.
Inferences and Survey Research

Some definitions:

Statistic -- a summary measure of a sample (mean, proportion, standard deviation, etc.)

Parameter -- a summary measure of an entire population (mean, proportion, standard deviation, etc.)

The purpose of survey research (and inferential statistics) is to estimate a population parameter based on a sample statistic, with a known amount of error, at a specified level of confidence.

We want to estimate the mean number of hours FSU students watch television per week (the population parameter) based on the mean number of hours reported by a representative sample of students (the sample statistic), with a known amount of error (random sampling error/error of estimate, expressed as the number of hours the sample estimate is likely off), at a specified level of confidence (usually 95%, meaning we are 95% sure we aren't off by any more than that number of hours).

The same goes for estimating a proportion. We could estimate the percent of FSU students supporting the U.S. military activities in Haiti (the population parameter) based on the percent of support among a representative sample of students (the sample statistic), with a known amount of error (random sampling error/error of estimate, expressed as the percentage points the sample estimate is likely off), at a specified level of confidence (usually 95%, meaning we are 95% sure we aren't off by any more than that number of percentage points).

Connie Chung on the CBS evening news might report that "a recent survey of FSU students showed that 55% supported U.S. military activities in Haiti, and that figure is in error no more than 3%" (she normally won't add "and the researchers are 95% confident the estimate is off by no more than 3%" because the 95% confidence is assumed -- it is the convention). The 3% is the amount of random sampling error, also called the "error of estimate," which I will show you how to calculate below, along with the level of confidence.
Normal Distributions

More definitions:

µ = a population mean
Ppop = a population proportion
x̄ = a sample mean
variance = the variation of scores around the sample mean, calculated as Σ(x - x̄)²/(n - 1)
s.d. = a sample standard deviation, calculated as √variance
n = the sample size, the number of cases in the sample
normal curve = the so-called bell-shaped curve, the shape of the frequency distribution of scores which are normally distributed
random sampling = selection of cases from a list (a sampling frame) of all possible cases such that each case has an equal and independent chance of selection

The normal distribution has certain mathematical properties. 1) The mean, median and mode coincide and sit at the center of the distribution. 2) There is a fixed relationship between ranges of standard deviations and the proportion of cases in the distribution:

x̄ ± 1 s.d. includes 68% of all scores in the distribution
x̄ ± 1.96 s.d. includes 95% of all scores in the distribution
x̄ ± 3 s.d. includes 99% of all scores in the distribution

For example, suppose the mean number of hours FSU students report watching TV per week is 15 hours, with a standard deviation of 2 hours. In a normal distribution of scores (hours reported by individual students):

15 hrs ± 1 s.d. (± 2 hrs) includes 68% of all scores in the distribution
15 hrs ± 1.96 s.d. (± 3.9 hrs) includes 95% of all scores in the distribution
15 hrs ± 3 s.d. (± 6 hrs) includes 99% of all scores in the distribution

We can then say that 95% of FSU students in this distribution watch between 11.1 and 18.9 hours. More importantly, if you were to select one score from this distribution, how sure are you that it will fall within 11.1-18.9 hrs? You can be 95% sure, provided the score was selected randomly from this normal distribution. If you don't randomly sample, all bets are off, so to speak.
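These fixed proportions can be checked directly with Python's standard library. A minimal sketch using `statistics.NormalDist` with the hypothetical 15-hour mean and 2-hour standard deviation from the example:

```python
from statistics import NormalDist

# Hypothetical distribution from the notes: mean 15 hrs, s.d. 2 hrs.
viewing = NormalDist(mu=15, sigma=2)

# Proportion of scores falling within +/- 1, 1.96, and 3 standard deviations.
for z in (1, 1.96, 3):
    share = viewing.cdf(15 + z * 2) - viewing.cdf(15 - z * 2)
    print(f"within {z} s.d.: {share:.1%}")
```

One caveat: ±3 s.d. actually captures about 99.7% of a normal distribution; the 99% figure quoted in many texts corresponds more precisely to ±2.58 s.d.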
The Logic of Sampling Theory

Here is where all of this leads, and the logic of it is both simple and inescapable provided you understand the above. Let us (said Tom, crisply) propose to actually do a survey of FSU students to determine hours of TV viewing per week, or computer use, or you name it. We can obtain a random sample (list of names and addresses) from the registrar.

The big decision we have to make is sample size. This involves a number of factors such as expected variance in viewing behavior and what's known in the business as a power analysis. Basically, you need to decide how much error (the ± figure) you can tolerate. The bigger the sample, the less the sampling error. In fact, as the sample becomes the entire population (a census), there is no random sampling error at all (the ± figure becomes zero). However, sample size also directly relates to time and money, so we want the smallest sample we can get away with. We decide on a sample size of 100.

Before we collect any data, we know the following information is true: There are 1.5 zillion non-redundant samples of size 100 we could possibly draw at random from the population of 29,000+ FSU students. Each of those samples could produce a mean hours of TV viewing per week. All those 1.5 zillion means produce, guess what, a normal distribution. It has a special name -- a sampling distribution of the mean. At the center of this distribution is the mean of these means, x̄, and provided all the samples are theoretically drawn randomly from the list of all FSU students, the mean of means, x̄, equals µ, the population parameter we are trying to estimate (x̄ = µ). The distribution shows us how far off all possible estimates can be from the true population mean -- it is a distribution of sampling error. Furthermore, it has a standard deviation which has a special name -- standard error. Since this is a normal distribution, we already know the relationship between the percent of all estimates and the s.e.:

x̄ ± 1 s.e. includes 68% of all estimates in the distribution
x̄ ± 1.96 s.e. includes 95% of all estimates in the distribution
x̄ ± 3 s.e. includes 99% of all estimates in the distribution

We are going to draw one sample from all possible samples of size 100, 95% of which fall no more than 1.96 s.e. from the true population parameter. That says that before we collect any data, we know we have a 95% chance of being within this defined error range. Can we calculate the s.e. from our one sample? Yes. Here are the formulas:

s.e. of the mean = sample s.d. / √n
s.e. of the proportion = √(pq/n)

So we draw our one sample of 100 FSU students and complete our survey.
The mean hours of reported TV viewing per week is 15, and the s.d. = 2 hrs. The s.e. then is 2 hrs/√100, which equals 2 hrs/10, or 0.2 hrs, which equals 12 minutes. So, the s.e. = 12 minutes. To be 95% certain, we need to add and subtract 1.96 s.e. to and from our mean of 15 hours, which is 23.5 minutes.

Dave Brokaw reports: "A recent survey of FSU students estimates that they view an average of 15 hours of television per week; the researchers are 95% certain that this estimate is not off by more than 23.5 minutes." He could also say that the average FSU student watches between 14 hrs 36.5 min. and 15 hrs 23.5 min. Or he could say between 14 1/2 and 15 1/2 hrs.

If we want a higher level of confidence, we could use 3 s.e., or 36 minutes. We can now be 99% confident, but we have a wider error band. The only way we can reduce the error band but remain at the same level of confidence is to increase the sample size. Quadrupling the sample size cuts random error by 1/2.

All of this works provided we have a random sample. If we didn't use random sampling procedures, we can't be sure these probabilities hold. And the only way to be 100 percent sure is to sample everybody, i.e., do a census. Also, for finite populations, there is a "finite population correction factor" that adjusts estimates accordingly. If the population is 200 car dealers and you sample 100, or 1/2 of them, the correction factor greatly reduces the error band. Also, for small samples (under 120), the number of standard errors necessary to include various proportions of the distribution changes (we use a t distribution instead of a z distribution).
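The arithmetic above is easy to reproduce. A quick sketch using the example's numbers (sample of 100, mean of 15 hours, s.d. of 2 hours):

```python
import math

n = 100          # sample size
mean_hours = 15  # sample mean, hours of TV per week
sd_hours = 2     # sample standard deviation

# Standard error of the mean: s.d. / sqrt(n) -> 0.2 hrs, i.e. 12 minutes
se = sd_hours / math.sqrt(n)

# 95% confidence interval: mean +/- 1.96 s.e.
margin = 1.96 * se
low, high = mean_hours - margin, mean_hours + margin

print(f"s.e. = {se * 60:.0f} minutes")
print(f"95% CI: {low:.3f} to {high:.3f} hours "
      f"(margin = {margin * 60:.1f} minutes)")
```

Changing `n` to 400 shows the "quadruple the sample, halve the error" rule: the standard error drops from 12 to 6 minutes.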
An Example Using Proportions

Again we sample 100 FSU students and ask whether they support U.S. military involvement in Haiti. They can agree, disagree, or be neutral/don't know. p = proportion who agree; q = 1 - p (everybody else). We could also ask whom they support for governor, with p being the proportion supporting Chiles and q being everyone who doesn't; or p is the proportion supporting Bush, and q everyone else.

Of our 100 respondents, 55% or .55 support the U.S. military involvement in Haiti. p = .55, and q therefore is .45 (1 - .55). The standard error of a proportion is √(pq/n), which works out to 0.05 or 5%. To be 95% certain we need to add and subtract 1.96 s.e., or 1.96 × 5%, or 9.8%, to our 55% estimate.

Hulk Hogan reports: "In a recent survey of FSU students, 55% say they support U.S. military involvement in Haiti; the researchers are 95% certain that this estimate is not off by more than 9.8 percentage points."

Note that this is a lot of error, with the true percentage falling anywhere between 45% and 65%. A clear majority might be in support, or it might not be. If we were polling voters for a candidate, this is way too much error. What can we do? We could use 5% (one s.e.), but then we would only be 68% sure. To maintain the 95% level of confidence and reduce error, we need a larger sample. With a sample of 400, √(pq/n) becomes 2.5%, so we cut the error band in half.
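The proportion calculations can be sketched the same way; the figures below reproduce the 5% and 2.5% standard errors from the example:

```python
import math

def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p*q/n)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.55  # 55% support in the sample
for n in (100, 400):
    se = se_proportion(p, n)
    print(f"n={n}: s.e. = {se:.3f}, 95% margin = {1.96 * se:.3f}")
```

Because n sits under a square root, quadrupling the sample size from 100 to 400 exactly halves the standard error.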
Non-Sampling Error

There are, unfortunately, some problems with all of the above. Everything holds provided:

1. We have an accurate sampling frame. Usually there are errors -- the registrar always misses some students, etc. We must make sure we draw the sample from the right population.
2. We have a reliable and valid questionnaire.
3. Students answer honestly.
4. ALL MEMBERS OF THE SAMPLE RESPOND. And they don't. They never do, unless it's a prison population.

Phone surveys typically are lucky to get a 1/3 response rate (so to end up with 100 responses we would have to sample over 300 students). Mail is usually around 5-10 percent. While we can be 95% sure that the sample of 100 is within 1.96 s.e. of the true mean, what about the 30 who actually complete the survey? Or the 100 of a sample of 300? There is a major validity problem here, and it is called non-response bias. When only a proportion of a total sample responds, are they peculiar respondents? Has a significant segment of people of one type or another refused to respond? Does the non-response significantly bias the results of the survey?

Researchers have to defend the claim that the drop-out was random. You have to compare your results to census or other survey data. This is why demographics should always be included in a survey -- you can compare the characteristics of those responding to known population data in an attempt to show that non-respondents didn't bias the nature of the sample. Replicability helps here as well.
2. Observing the units: Operational Definitions, Quantification and Measurement

Quantification

o Quantification is the process of assigning numbers to variables so that the two are "isomorphic" -- as you put more gas in the car, the measure in gallons increases proportionally
o Constant -- an attribute of a concept that takes on only one value. In a laboratory experiment we try to hold many variables constant across treatment conditions (e.g., noise, heat, light)
o Variable -- an attribute of a concept that takes on more than one value. Quantification is assigning numbers to these various values

Purposes of quantification:
0. precision: numbers are more precise than words (I put some gas in the car vs. I put 10 gallons in the car; I need a medium-size carpet vs. I need a carpet 15 x 20 ft.)
1. numbers allow the application of descriptive statistics -- you can summarize large amounts of data into averages and variances provided the data are in meaningful numbers; a main function of statistics
2. numbers allow the application of inferential statistics -- you can make inferences from a sample to a population with a known amount of error at a specified level of certainty if the data are in meaningful numbers; a main function of statistics

Process of quantification/measurement:
0. Operationalize the concept into a reliable and valid measure
1. Collect data using the measure
2. Process the data using appropriate descriptive and inferential statistical tools
3. Report numerical results
4. Unquantify the data back to the original concept; interpret the results; compare the results to other data, historical data, trends, etc.

Quantification is a tool -- it allows you to process data in ways you could not unless it is in numeric form; however, the tool has its limitations, and the results must be translated back to the concept of interest. The translations and interpretations are dangerous territory, where people get into trouble for misinterpreting and over-interpreting, etc. (three out of four doctors recommend.....)

Levels of measurement
o Determines the appropriate procedures and statistical tests which can be used to analyze data

Four "levels" of measurement:

1. Nominal
o Numbers are assigned as labels to unordered mutually exclusive categories. The numbers are used as names, hence the name "nominal."
o Lowest level of measurement; not measurement in the true sense of the word. Distinctions among categories are qualitative, not quantitative; in kind rather than amount
o Gender, country of origin, religious denomination, etc., are examples
o Numbers can be assigned, but assignment does not indicate a hierarchical relationship (i.e., female = 1 and male = 2, or vice versa)
o Appropriate statistics include mode, frequencies and percentages, and correlations and tests designed for nominal-only data; nonparametric nominal statistics
o Formal properties: exclusivity and exhaustiveness (there is one and only one category for every response)

2. Ordinal
o Numbers are assigned to ordered mutually exclusive categories
o Categories, and the assigned numbers, stand in relation to each other; higher-ordered categories should be assigned higher numbers. Still not true measurement. Distinctions among categories are qualitative in nature
o Can be expressed in terms of the algebra of inequalities: a is less than b (a < b)
o Military rank, size of names in film credits, shirt sizes (small, medium, large, petite, junior, etc.), and rank-ordering objects by height from shortest to tallest are examples
o Appropriate statistics include those for nominal scales plus median, range, semi-interquartile range, and correlations and tests designed for nominal- and ordinal-level data; nonparametric nominal and ordinal statistics
o Formal properties: nominal properties plus asymmetry (if a > b, then no member of category b can be > any member of a) and transitivity (if a > b and b > c, then a > c)

3. Interval
o Numbers are assigned to a number of units, all units being of the same size
o Holds the characteristics of nominal and ordinal, plus exact differences between categories can be specified; an arbitrary zero point is assumed. True measurement. Distinctions are quantitative rather than qualitative; differences are in amount, not kind
o Examples include the Fahrenheit and centigrade temperature scales, Likert-type attitudinal scales, summated attitudinal scales, and personality inventories
o Appropriate statistics include all those for nominal and ordinal, plus mean, standard deviation, variance, and correlations and tests designed for nominal, ordinal, and interval levels; parametric and nonparametric statistics depending on other assumptions
o Formal properties include those for nominal and ordinal levels plus addition and subtraction

4. Ratio
o Numbers are assigned to equal-size units with a real, non-arbitrary zero point (no units)
o All of the characteristics of interval measurement, plus the zero point
o Examples include most physical measures (gallons, yards, feet, liters), the Kelvin temperature scale, age, multiple-choice tests (under certain assumptions), word counts and ratios, and income
o Appropriate statistics are the same as for interval-level measurement
o Formal properties include those for nominal, ordinal and interval levels, plus multiplication and division (ratios -- Nancy is twice as old as Bill)

Higher vs. lower order scales


o Ratio/interval can be converted to lower-order scales (ordinal/nominal), while the reverse is NOT true
o Nominal and ordinal use nonparametric statistics, while interval and ratio use parametric statistics if other assumptions are met

Reliability and Validity

Reliability


o Consistency; freedom from random error
o Replicability -- could similar results be produced? If more than one sample were drawn, what is the range of estimates?
o Random sampling error is a measure of reliability; i.e., the error of estimate (how far off the sample is likely to be) is a measure of consistency
o Reliability does not equal accuracy
o A measure can be reliable and not valid; it can be consistently invalid. The same is true for samples
o Ways to check the reliability of a measurement scale/procedure:
 - test-retest: over-time reliability, coefficient of consistency
 - for summated scales: internal consistency, coefficient of homogeneity (split halves, odd-even, item-total (item analysis), Cronbach's coefficient alpha)
 - for alternative forms: coefficient of equivalency
 - if using raters/coders, etc.: interrater/intercoder reliability, coefficient of agreement, Scott's pi
 - for mechanisms, e.g., weight scales, heart rate monitors, tape recorders, etc.: tolerance; machine specifications
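Cronbach's coefficient alpha, mentioned above for summated scales, can be computed from the item variances and the variance of the total scores: alpha = (k/(k-1))(1 - Σ item variances / total-score variance). A minimal sketch with invented Likert data (the items and responses are hypothetical):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha. items: one list of scores per scale item."""
    k = len(items)
    # Total summated-scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(pvariance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))

# Hypothetical 5-point Likert responses: 3 items, 6 respondents.
item1 = [4, 5, 3, 2, 4, 5]
item2 = [4, 4, 3, 2, 5, 5]
item3 = [5, 4, 2, 3, 4, 4]

print(round(cronbach_alpha([item1, item2, item3]), 2))
```

Alpha rises as items co-vary (internal consistency) and, like the test-retest coefficient, says nothing by itself about validity.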

Validity
o Accuracy; freedom from systematic error; freedom from bias; correctness
o Validity is very difficult to demonstrate; it is always a problem
o Sampling frame and sampling procedures are the critical factors with sample validity. Was the sample correctly drawn from the right population? Is the sample representative of that population? Is there any non-response bias?
o With measurement validity, does the measure get at what it was intended to measure? With a manipulation, did it manipulate the correct variable?
o Ways to check the validity of a measurement scale:
 - content/face/expert/judgmental validity -- a logical validity check; does the range of content cover the domain of the concept being measured? does an expert agree that the content is valid? does it look like it measures what is intended?
 - concurrent validity -- a criterion-related validity check, and empirical; how well does this measure correlate with similar measures of the same or related concepts?
 - predictive validity (for measures that purport to predict future behavior, such as the SAT, GRE, entrance exams, job skills tests, etc.) -- another criterion-related validity check, and empirical; how well does the measure correlate with actual future performance?
 - construct validity -- a logical and empirical validity check. Within a theory, certain concepts should logically relate to others in predictable ways. So, does a measure of concept A correlate with a measure of concept B the way it is supposed to according to the theory? If it doesn't, then either the theory is wrong or the measure of A or B or both is invalid.

To determine the validity of an experimental manipulation, do a manipulation check; e.g., show the television violence and nonviolence treatments to subjects and ask what they think about the relative amount of violence. Or show the treatments to experts, etc. Make sure you are manipulating what you think you are.


Three Questions to ask about any Operational Definition or Number before trusting conclusions or applying statistics

1. What is the level of measurement of the number?
2. What is the reported reliability of the operational procedures?
3. What is the reported validity of the operational procedures?
Types of scaling: Three major measurement traditions

1. Consensual location scaling/Thurstone scaling:


o Purpose is to determine how people perceive/differentiate a set of objects on some defined dimension/scale; e.g., how children rate cereals on a sweetness scale, how voters rate political candidates on a liberalism/conservatism dimension
o Rather than putting the respondents on a dimension, the interest is in where the respondents locate the objects (vegetables, candidates) on the dimension. What is the consensual location of the objects according to the sample of respondents?
o Respondents/judges are given the set of objects and the dimension; they are asked to rank-order the objects, rate the objects, or sort the objects along the dimension (the method used depends on the number of objects and difficulty of the task, and on whether comparative or absolute judgments are required)
o The average of the ratings/rankings/sortings of all respondents determines the location of each object on the scale relative to the others. Standard deviations (dispersions of scores) indicate relative agreement among respondents for any given object. Both indicators are useful
o Reliability/consistency/stability of the locations of objects is directly related to the number of judges/respondents. Generalizability depends on how the sample of judges was selected
o The roots of this type of scaling are in psychophysics -- the scientific study of human perceptions (psychology and physiology). Weber's ratio comes from this tradition. Thurstone extended this scaling procedure to the measurement of attitudes. He conceived the Law of Categorical Judgments and the Law of Comparative Judgments
o The semantic differential, and Osgood et al.'s Measurement of Meaning, are consensual location procedures
o Based on Thurstone's laws, consensual location scaling is interval measurement, and in some cases ratio (with inclusion of a real zero point)
o Q-sort is a derivative of this tradition

2. Psychometric/IQ/Ability Scaling:
o Purpose is to place respondents along a dimension/scale; we want to differentiate respondents according to their abilities, attitudes, intelligence, learning, etc.
o Subjects are asked to respond to a series of items all measuring the same dimension (e.g., multiple-choice examination items). Items are scored either dichotomously (right/wrong, agree/disagree) or using Likert-type response categories indicating intensity (strongly agree, agree, neutral, disagree, strongly disagree)
o The responses to all items are summed to produce a score which places the respondent along a continuum representing the dimension being measured (number of items correctly answered on the multiple-choice test)
o Reliability of this measurement is directly related to the number of items presented to the respondent. The more items, the more reliable the total score of each respondent
o Level of measurement is a matter of debate. Some consider it interval, even ratio; others say it is ordinal. The decision should be made according to assumptions about the response categories and precedent -- what level other researchers have ascribed to the set of scales

3. Guttman scaling (scalogram analysis/cumulative scaling)


o The purpose is to place respondents along an attitudinal dimension
o Items are written that are themselves rank-ordered in intensity of expressed attitude within each question/statement (scalogram analysis)
o Respondents are instructed to indicate which items they agree with and at which point along the attitudinal dimension (at which item) they no longer agree. Where they stop agreeing indicates the attitudinal position of each respondent
o Resulting scores/positions of respondents are only ordinal. Cumulative scaling as developed by Guttman is at the ordinal level of measurement
o With the advent of powerful parametric statistical techniques and accessible computers to analyze parametric data, Guttman scaling has lost popularity
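A Guttman scale is conventionally evaluated with the coefficient of reproducibility -- the proportion of responses that fit a perfect cumulative pattern. A sketch with invented agree/disagree data (a coefficient of .90 or higher is conventionally taken to indicate a scalable set of items):

```python
def reproducibility(responses):
    """Guttman coefficient of reproducibility.

    responses: one answer list per respondent, 1 = agree, 0 = disagree,
    with items ordered from least to most intense.
    """
    errors = 0
    cells = 0
    for answers in responses:
        score = sum(answers)  # scale position implied by total agreements
        # Ideal cumulative pattern: agree with the `score` mildest items.
        ideal = [1] * score + [0] * (len(answers) - score)
        errors += sum(a != b for a, b in zip(answers, ideal))
        cells += len(answers)
    return 1 - errors / cells

# Hypothetical data: 4 items, 5 respondents; the last respondent breaks
# the cumulative pattern.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 0],  # non-cumulative pattern -> counted as errors
]
print(reproducibility(data))
```

The last respondent's pattern produces two deviations out of 20 responses, giving a coefficient of .90, right at the conventional threshold.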

Designing interview and survey questions


Several considerations when designing interview/survey questions:
0. The types of questions to be asked
1. The way questions are constructed
2. The question format that will be used
3. The population of interest (children, adults, felons, etc.)
4. The order of questions
5. Clarity of instructions
6. The data collection procedures: self-administered with or without a proctor (e.g., mail), vs. telephone, vs. face to face

o The choice of what questions to ask determines the responses given; i.e., knowledge questions (What do you know about Candidate X?) will elicit different answers than feeling questions (Did you feel more or less positive about Candidate X after you saw the commercial?). The types of questions posed depend on the researcher's intent
o Questions need to be appropriate, meaningful and nonbiasing, and written for the appropriate population
o Use simple, straightforward, clear language
o Avoid double-barreled questions: double-barreled questions ask about several issues at once (I am in favor of better education and a strong military)
o Avoid leading questions: a question that begins with "Don't you think that..." is leading
o Avoid emotionally charged terms that can bias responses. People often respond as they feel they are supposed to (social desirability)
o Respondents must be competent to answer the questions
o Short questions are best
o Avoid negative items; they may be leading and confusing
o Researchers can use directive or nondirective questions:
 - directive questions: closed questions that limit the kinds of answers (yes-no)
 - nondirective questions: open-ended questions (How do you feel about...?)
o If possible, use contingency questions -- allow respondents to answer questions which are relevant to them by providing directed questions (if yes, skip to question 9)
o Construct questions with similar answer responses into clusters
o Pay attention to the order in which questions are asked
o Place demographics and other sensitive questions at the end

3. The conditions of data collection: Selecting a Research Design

Purpose of Research Design:

o To guide the process of scientific inquiry
o To provide a structured plan or strategy for the collection of data needed to answer specific questions
o To clearly detail what is to be observed and analyzed, as well as why and how that observation and analysis will take place
o To control extraneous variables and test causal inferences
o To help ensure objectivity

Causality
o Implies that one variable has a forcing quality over another
o Cannot be demonstrated directly
o Before it can be established, three criteria must be met:
 0. Must show that there is a relationship between the variables beyond chance (concomitant relationship), or that the variables co-vary;
 1. Must demonstrate a consistent time order (independent variable occurs before dependent variable);
 2. Must control for extraneous variables by physically or statistically ruling out other possible explanations for the relationship

Internal and External Validity of Research Designs

Internal Validity

Internal validity is a factor important for research that attempts to draw causal inferences and test causal hypotheses. It is not a factor with purely descriptive research that does not make such inferences. Internal validity is the extent to which the research design rules out all extraneous variables as possible explanations for observed differences, thereby reducing the chance of drawing inaccurate conclusions from experimental results. It is a validity issue in that you are trying to determine whether you are attributing the effect to the right cause, i.e., the treatment and not something else. The possible "something elses" are listed below. They are also known as extraneous variables, or alternative hypotheses.
Sources of Internal Invalidity:
o History
o Maturation
o Testing
o Instrumentation
o Regression
o Selection
o Mortality
o Causal time-order
o Diffusion or imitation of treatments
o Compensation
o Compensatory rivalry
o Demoralization

External Validity

External validity is the extent to which the research findings reflect real life, and it determines how generalizable the results are. The results might be an accurate reflection of what took place in the experiment (internal validity) but still not have any real-world meaning. Here again, the importance of generalizability depends on the purpose of any particular study. There are four main factors which affect external validity:
o Sampling Procedures: The sampling procedures used determine to whom or what your findings can be generalized. Designs using random sampling procedures will have greater generalizability.
o Pre-Test Sensitization: Subjects may become sensitized to the treatment due to the pre-test. To counter this effect, the pre-test should be masked or eliminated. The extent to which this has occurred can be measured using the Solomon 4-Group design.
o Reactive Arrangements: Subjects may act and react differently to a treatment simply because they know they are being observed. These can include demand effects, where the subjects try to determine what is wanted of them and actively comply; the Harvard effect, where subjects intentionally try to ruin test results; and experimenter effects, where the researcher influences subject responses through conscious or unconscious bias. Reactive arrangements are frequent in experimental designs.
o Multiple Treatment Interference: This involves order effects, where subjects' response to a treatment is due to exposure to a previous treatment, and carry-over effects, where psychological or physiological changes from one test are carried over into the next treatment. It is also a problem when a treatment is really a package of treatments, e.g., comparing television violence (an episode of Miami Vice) to nonviolent television (an episode of Lassie). The two treatments differ on many dimensions, not just amount of violence, so any effects cannot be attributed to just the violence dimension.

Types of Research Designs:


o Non-experimental, or descriptive, research designs
o Quasi-experimental research designs
o True experimental research designs

True Experimental Research Designs:

True experimental designs are the best way to determine causality. Generally, they have high internal validity and low external validity. It is the randomization and the manipulation of the independent variable that give the true experiment its power. Isolation under laboratory conditions adds to this power by reducing extraneous variables and equalizing other experimental conditions for all subjects. One advantage is that true experiments can be repeated several times with multiple subject groups at relatively little time and expense. Types of true experimental designs:
o Pre-Test/Post-Test Randomized
o Solomon 4-Group
o Post-Test Only Randomized

Pre-Test/Post-Test Randomized Design


o o

Subjects are selected using precise scientific procedures Subjects are measured on a dependent variable (pre-test), the experimental group is exposed to some stimulus representing an independent variable, and then both experimental and control groups remeasured on the dependent variable (post-test)

o Observed differences between the two measurements on the dependent variable for the experimental vs. control group are then attributed to the influence of the independent variable, which is the purpose of the design
o Random selection of samples allows the researcher to make relatively few observations and generalize the results to a larger population
o Exposure to the pre-test can affect respondents' answers to the post-test and, worse, can heighten (or diminish) the effects of the treatment (pre-test sensitization). Generalizing effects to an unpretested population is problematic, so external validity is usually low

Post-Test Only Randomized design


o One group receives the stimulus and a post-test, while another group receives the post-test only (the second half of the Solomon 4-Group design)
o If proper randomization is used, there is no need for a pre-test, though this design is not as sensitive to small treatment effects
o The lack of a pre-test avoids pre-test sensitization

Control Groups
o Used to control for the effects of all other factors occurring during the experiment
o One group receives the stimulus while the control group does not
o Both groups are given a post-test
o Randomization assures the groups are equal (within chance expectations) on all individual-difference variables; a statistical test looks for differences between groups greater than chance
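The random assignment that gives these designs their power is simple to sketch. The following is a minimal illustration, not part of the original notes; the subject IDs and group sizes are hypothetical.

```python
import random

def randomly_assign(subjects, seed=None):
    """Randomly split a subject pool into experimental and control groups,
    so that group membership is decided by chance alone."""
    rng = random.Random(seed)
    pool = list(subjects)
    rng.shuffle(pool)
    half = len(pool) // 2
    return pool[:half], pool[half:]   # (experimental, control)

# Hypothetical pool of 20 subject IDs
experimental, control = randomly_assign(range(1, 21), seed=42)
print(sorted(experimental))
print(sorted(control))
```

Within chance expectations, any individual-difference variable (age, prior attitudes, etc.) should now be distributed equally across the two groups.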

Solomon 4-Group
o Is a combination of the above designs
o Tests for the amount of pre-test sensitization, allowing a researcher to decide whether treatments (such as educational programs) should include a pre-test to maximize effectiveness
o Allows testing of prior studies that used simpler designs to determine how much interaction took place

Field Experiments

o One of the true experimental designs is conducted "in the field," outside of laboratory conditions, in a real-world setting
o External validity is enhanced, but internal validity (especially history) is usually weakened

Within-Subjects Research Design


o Involves one group of subjects receiving two treatments (or both treatment and control conditions)
o Also known as the "repeated measures" or "correlated measures" design
o Is subject to order effects (one treatment affects the results of the other due to the order of treatments) and carry-over effects

Between-Subjects Research Design


o Involves two different groups, each receiving one treatment
o Involves comparing the two groups after they have received different treatments
o Randomization of subjects to groups must be used to ensure comparable group make-up

Quasi-Experimental Research Designs:

In a quasi-experimental design, there is no random assignment of subjects to an experimental and control group as in true experimental designs, but there is a manipulation of the independent variable.
Types of Quasi-Experimental Designs:
o Non-Equivalent Control Groups
o Time Series Analysis
o Panel Study
o Evaluation Research (Formative/Summative)

Non-Equivalent Control Groups


o Uses an existing "control" group with characteristics similar to the experimental group when it is impossible or unethical to assign subjects randomly from a common subject pool
o Disadvantage: without randomization, it is impossible to rule out all group differences; history may also be a problem if the groups are in different settings

Panel Study

o Involves studying the same group of subjects before and after some treatment or event, or over time
o Two major difficulties: panel attrition (panel members no longer able to participate for any reason) and accounting for any outside forces (history) other than the treatment affecting panel members between tests

Time Series
o Involves observation of one group, or of randomly sampled different groups, over time at equal intervals before, during and after the occurrence of a manipulation. The before measures establish a baseline and the fluctuations due to normally occurring extraneous variables; the during measures establish the effect of the manipulation; and the after measures detect the length of the effect (short term, long term)
o Weaknesses: the possibility of some extraneous variable occurring simultaneously with the treatment, and the expense of multiple measures. If one group is used, testing and sensitization become potential problems

Evaluation Research
o Purpose of this design: to determine whether an intended result was produced; sometimes called program evaluation
o Example: to see if a political campaign is having the intended effect, random samples of voters could be drawn at different intervals before and during the campaign to examine observed differences
o Can have quasi- or true experimental design characteristics and is a good example of applied research in social science

Non-Experimental, or Descriptive Research Designs:

Non-experimental designs generally have higher external validity and lower internal validity than other designs. Types of Non-Experimental Designs:
o Historical/Comparative
o Case Studies
o Ethnography
o Natural Group Comparison
o Survey/Correlational
o Secondary Data Analysis
o Focus Groups
o Naturalistic Observation
o Field Research
o Developmental/Longitudinal
o Ex Post Facto/Retrospective
o Content Analysis

Historical/Comparative
o Used to discover patterns in cultures that recur in different times and places
o Primary resource for observation and analysis is historical records
o A qualitative method which allows examination of subtle details over time
o Overlaps somewhat with field research, content analysis and secondary analysis
o Primary strength is its unobtrusiveness: since the objects of interest do not know they are being studied, they cannot react and alter their behavior
o Primary weakness: the researcher must rely on information generated by others and on his or her own perception of that information, resulting in purely subjective conclusions

Focus Groups
o Used to explore the depth of an event, attitude, view or belief
o Requires a group of individuals chosen by certain characteristics, not by probability sampling procedures
o Advantages: captures real-life data in a social environment; is flexible; has high face validity; allows quick results; can be lower in cost than other techniques
o Disadvantages: the researcher has less control of the situation than when dealing with an individual; data can be difficult to analyze; effective moderators require special skills; differences between groups can be troublesome for comparison; assembling groups can be difficult; and a conducive environment must be available for the discussion

Case Studies
o Comprehensive examinations of a particular group or individual
o Lack generalizability
o Involve non-probability sampling

Naturalistic Observations, Ethnography


o The study of what takes place in the real world, e.g., the behavior of people after a hurricane has hit
o No probability sample is used

Field Research/Field Studies (ethnographies, participant observation)

o Involve the direct observation of social phenomena in their natural settings: the researcher physically puts him/herself in or at the actual point of interest and attempts to make sense of some ongoing process
o Seldom entered into with specific, previously defined hypotheses to test; field research can be a theory-generating activity
o Most appropriate in situations where attitudes, behaviors and social processes are best understood in their natural setting and over time; effective for studying subtle nuances
o Allows the researcher a more comprehensive perspective, and therefore more validity, than lab experiments
o More flexible than other methods of research in that the research design can be modified at any time
o Can be very inexpensive
o Typically yields qualitative data, which disallows precise descriptive statements about a large population; due to a myriad of validity and reliability problems, results are often seen as suggestive rather than definitive
o Has less reliability and less generalizable results than lab experiments
o The purpose is an in-depth description of a phenomenon at a point in time, or over a period of time, from the perspective of a particular observer

Natural Group Comparison


o Compare two or more naturally occurring groups on some variable(s) of interest, e.g., males vs. females on salary, the U.S. vs. Germany on violent crime rate, high, middle and low SES groups on voting behavior
o It is difficult to make valid causal inferences concerning the grouping variable, e.g., that gender is the cause of salary inequity. While the two groups are certainly different on gender, they may also differ on any number of other variables that may be the real cause (e.g., level of education, years of experience, etc.). If one is trying to make such an inference, other important variables must be controlled by selection (matching), statistical controls, or some other method
Developmental/Longitudinal
o Designed to allow observations over an extended period of time to measure gradual effects and effects of maturation
o Types:
  - longitudinal (study the language development of 3-yr.-olds until they are ten: a seven-year study)
  - cross-sectional (measure the language development of a sample of 3-, 4-, 5-, ... and 10-yr.-olds: a relatively quick study involving the combination of a survey and a natural group comparison)
  - semi-longitudinal (a combination of the above: study 3-yr.-olds until they are 5, 4-yr.-olds until they are 6, etc.; a two-year study again involving a natural group comparison but over time)

Survey/Correlational
o Used to describe characteristics of populations by generalizing from a sample
o Measurement tool is the standardized questionnaire, which has strong reliability and flexibility
o Strength is generalizability (external validity)
o Weakness is low internal validity: it is difficult to control extraneous variables and therefore to make causal inferences

Ex Post Facto/Retrospective
o Refers to the development of hypotheses predicting relationships that have already occurred
o May result in hypotheses that had not been previously examined
o Involves analyzing data previously collected by someone else for other purposes, e.g., school records of student academic achievement

Secondary Data Analysis


o Involves the examination of data generated by others
o Is faster and cheaper than other methods
o Allows the examination of data generated by experts

o Validity problems: the original study may have been done for totally different reasons, and the data therefore might not be appropriate

Content Analysis
o Involves the examination of a class or sub-class of social artifacts (newspapers, magazines, books, television or radio programs, paintings, photographs, etc.)
o Can be relatively inexpensive
o Can be easier to replicate if done correctly
o Allows the study of events and processes occurring over a long period of time
o Seldom has any effect on the object of study, and therefore has strong reliability
o Is limited to the examination of recorded communications only
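At its simplest, quantitative content analysis reduces to counting category occurrences across a set of artifacts. The sketch below assumes a hypothetical keyword-based coding scheme and sample texts; a real study would also require coder training and intercoder reliability checks.

```python
import re
from collections import Counter

def code_artifacts(artifacts, scheme):
    """Tally keyword hits for each coding category across a set of texts.
    `scheme` maps a category name to a list of lowercase keywords."""
    counts = Counter()
    for text in artifacts:
        words = re.findall(r"[a-z']+", text.lower())
        for category, keywords in scheme.items():
            counts[category] += sum(words.count(k) for k in keywords)
    return counts

# Hypothetical coding scheme and artifacts
scheme = {"violence": ["fight", "shoot", "attack"], "affection": ["hug", "kiss"]}
sample = ["The characters fight, then fight again.", "A hug ends the episode."]
print(dict(code_artifacts(sample, scheme)))
```

The counts themselves are only as meaningful as the coding scheme: the keyword lists operationalize the categories, so their defensibility is the key methodological question.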

Reliability and Validity of Research Designs


o Reliability of a research design is a matter of replicability: if someone else follows your procedures using the same research design, do they achieve comparable results? Consistency is the issue.
o The validity of a research design involves two validity issues:
  - Internal validity: if the study is testing causal hypotheses or making causal inferences, how well does the design isolate the correct cause of any observed effect? Internal validity is not a factor for purely descriptive research designs, which are not attempting to show causality.
  - External validity: how generalizable are the findings from this research design? To what populations, treatments, settings, etc. can the results be projected? Again, depending on the purpose of the study, this may or may not be an important issue.

Section VI.
Your Results: How To Analyze, Interpret, and Report Your Observations

Results
o Provide a thorough description of the data (in terms of level of measurement, quality, etc.)
o Apply statistical procedures to answer research questions and/or test hypotheses

Statistics

Basic uses of statistics:


o To Describe: statistics can be used to describe the results of tests on a sample in terms of central tendency (mean, median, mode), dispersion (standard deviation, variance), relationships among or between variables (correlation coefficients), predictions (regression, LISREL analysis), or differences between treatment groups (chi-square, t-tests, ANOVAs, etc.)
o To Infer: these same statistics can be used to make inferences from the sample to a larger population, the difference being that inference introduces a random sampling error band around all estimates (means, proportions, standard deviations, correlations, beta weights, etc.)

Which statistical test should be used depends on the purpose (to describe or to infer) and the assumptions that can be met (the level of data and the normality of the distribution). Application of the wrong test will produce uninterpretable and incorrect results. There are also ethical considerations.
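The describe/infer distinction can be seen in a few lines of Python (standard library only). The scores below are hypothetical, and the 95% interval uses the normal z of 1.96 as a rough sketch; a sample this small would properly use a t critical value.

```python
import math
import statistics

scores = [72, 85, 90, 66, 78, 88, 95, 70, 82, 74]   # hypothetical sample of 10

# To describe: summarize the sample itself
mean = statistics.mean(scores)        # central tendency
sd = statistics.stdev(scores)         # dispersion (n - 1 denominator)

# To infer: the same mean, now wrapped in a random sampling error band
se = sd / math.sqrt(len(scores))      # standard error of the mean
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"mean={mean:.1f} sd={sd:.2f} 95% CI=({low:.1f}, {high:.1f})")
```

The description applies only to these ten scores; the interval is the inferential step, an estimate of where the population mean plausibly lies given random sampling error.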
Parametric statistics:
o Have greater statistical power than non-parametric statistics at a fixed sample size, i.e., they are more sensitive to differences
o Three assumptions must be met before applying parametric statistics to data: the data must be interval or ratio level, the population distributions must be normal, and the variances in the populations must be homogeneous. In other words, assumptions are made about population parameters; hence the term parametric statistics

Non-parametric statistics:

Non-parametric statistics do not make assumptions about population parameters. They are also called "distribution-free" statistics, since no assumption of normality is made. They should be used with nominal or ordinal data, and with interval data that have non-normal distributions and/or large differences in variances. (There are tests to determine normality -- the chi-square Goodness of Fit test -- and homogeneity of variance -- the F max test.)
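The chi-square Goodness-of-Fit test mentioned above is just the sum of (observed - expected)^2 / expected across categories. A minimal sketch follows; the data are hypothetical, and the hardcoded .05 critical value for df = 3 (7.815) comes from a standard chi-square table.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 80 viewers spread over four program types,
# tested against a uniform expectation of 20 per category
observed = [30, 14, 22, 14]
expected = [20, 20, 20, 20]

chi2 = chi_square_gof(observed, expected)   # 8.8 for these data
CRITICAL_05_DF3 = 7.815                     # alpha = .05, df = k - 1 = 3
print(chi2, "reject" if chi2 > CRITICAL_05_DF3 else "fail to reject")
```

Here the statistic exceeds the critical value, so the uniform-distribution null would be rejected at the .05 level.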

Statistical Testing: Deciding between a null and alternative hypothesis (also refer to other notes in reader on statistics)
p value:

The p value is the probability that the observed difference between treatment groups in an experiment or an observed correlation is due to chance. It is the result of the statistical analysis of your data.
power analysis:

The purpose of power analysis is to determine the sample size needed. It answers the question, "How large should my sample be?" The answer is determined by the type I and type II error rates and the size of the difference or correlation you are expecting to find. To reduce sampling error, use a larger sample.
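For the common two-group mean comparison, a normal-approximation formula makes the pieces concrete: n per group = 2((z_alpha + z_power) / d)^2, where d is the expected standardized effect size. The z values below assume a two-tailed alpha of .05 (1.96) and power of .80 (0.84); this is a back-of-the-envelope sketch, not a substitute for a full power analysis.

```python
import math

def n_per_group(effect_size, z_alpha=1.96, z_power=0.84):
    """Approximate n per group for detecting a standardized mean
    difference `effect_size` (two-tailed alpha .05, power .80)."""
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(round(n, 6))   # round away float fuzz before ceiling

# Cohen's conventional small / medium / large effects for mean differences
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label} (d={d}): n = {n_per_group(d)} per group")
```

Note how the numbers echo the text: the smaller the effect you are trying to detect, the larger the sample you will need to find it.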
Null hypothesis:

The assumption that there is no relationship between the two variables in the total population -- that is, that any observed differences are due to chance. In English, it states that two things are equivalent: there is no difference between conditions, groups, etc. beyond random sampling error. A statistical test is used to decide whether to reject or accept the null.
Alternative hypothesis (or research hypothesis):

The assumption that there is a relationship between the variables being studied, or that the observed differences or correlations are indeed real, and not just a chance finding (bigger than due to chance alone). The null and alternative hypotheses are either accepted or rejected based on the p value statistically arrived at.

Type I Error (alpha): The rejection of the null hypothesis when it is actually true. That is, saying something does exist when in reality it does not. This is considered the worse of the two errors to make. The probability of a type I error is called alpha. It is normally preset at .05 -- we don't want to make this error any more than 5 times out of 100.

Type II Error (beta): The acceptance of, or failure to reject, the null hypothesis when it is actually false. That is, saying something does not exist when in reality it does. The probability of a type II error is called beta. 1 - beta is the probability of correctly rejecting the null, or power. Power is directly related to sample size: the bigger the sample, the greater the power. Sample size also directly relates to research expense, so it is important to do a power analysis to determine the minimum sample necessary. Type I error can be minimized by setting a low alpha, but this increases the likelihood of making a Type II error. You can reduce Type II error without affecting Type I error by increasing the sample size.
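What alpha = .05 means can be checked by simulation: if the null is true and you test at alpha = .05, you should falsely reject about 5 times in 100. Below is a sketch using a two-tailed z-test on two groups drawn from the same population; the population values (mean 100, sd 15) are hypothetical.

```python
import random

def false_alarm_rate(n_experiments=4000, n_per_group=30, seed=1):
    """Simulate experiments in which the null is TRUE (both groups drawn
    from the same population) and count how often a two-tailed z-test
    at alpha = .05 rejects anyway -- the Type I error rate."""
    rng = random.Random(seed)
    rejections = 0
    se = 15 * (2 / n_per_group) ** 0.5        # sd known to be 15
    for _ in range(n_experiments):
        a = [rng.gauss(100, 15) for _ in range(n_per_group)]
        b = [rng.gauss(100, 15) for _ in range(n_per_group)]
        z = (sum(a) / n_per_group - sum(b) / n_per_group) / se
        if abs(z) > 1.96:                     # critical z, two-tailed .05
            rejections += 1
    return rejections / n_experiments

print(false_alarm_rate())   # hovers near 0.05, as alpha promises
```

Every one of those rejections is a Type I error: a "discovered" difference that is really just random sampling error.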
Effect Size (ES):

How big an effect you are expecting to find -- e.g., the size of the difference between the means of an experimental and control group, or the size of a correlation coefficient. Cohen, in his book Statistical Power Analysis for the Behavioral Sciences, provides three handy effect sizes: small, medium and large. Once effect size and the type I error rate are set, power and n are directly related: the larger the sample, the greater the power (of detecting that effect size). Another way of saying this: the smaller the effect you are trying to detect, the larger the sample you will need to find it (the needle in the haystack, so to speak). On the other hand, if you expect a huge effect, you don't need much power to find it. Alpha, beta, ES, and n fit together into an equation such that once three are set, the fourth is determined.
n:

n is the sample size. The larger the sample size, the greater the statistical power. Sample size can be determined using power analysis.
p:

The p value is the probability that the observed difference between treatment groups in an experiment is due to chance, and it is produced by the statistical test selected to analyze the data. It is determined by comparing the observed difference with the difference expected by chance alone. The p value can range from 0.00 to 1.00; a p of 0.50 would mean that your results are 50% likely due to chance.

How the decision is made: after doing a power analysis, you collect the data and apply an appropriate statistical test. If the resulting p value is equal to or less than the preset alpha level (alpha = .05; p = .05, .04, .001, etc.), you reject the null: the probability the results are due to chance is sufficiently low (at or below alpha). If the resulting p value is greater than alpha (alpha = .05; p = .055, .06, .10, .50, etc.), you fail to reject the null; your experiment failed; the probability your results are due to chance is too high. (Here you hope you did the power analysis correctly -- that you had enough sample size to detect a real difference if it were there.)
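The decision rule described above reduces to a single comparison; a minimal sketch:

```python
def decide(p_value, alpha=0.05):
    """Reject the null when p is at or below the preset alpha;
    otherwise fail to reject it."""
    return "reject the null" if p_value <= alpha else "fail to reject the null"

print(decide(0.04))    # reject the null
print(decide(0.055))   # fail to reject the null
```

The point of the sketch is that alpha must be set before the data are analyzed; the statistical test supplies only the p value.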
Conclusions Section of an Article
o Provide a summary or recap of the study, through the results
o Translate the findings and discuss their practical and theoretical implications (significance)
o Delineate the limitations of the study, including problems with method, design, etc., and reminders regarding interpretation of the data
o Provide useful suggestions and directions for future research based on the study

Misuse of the Term "Significance"

In the conclusions, authors say they achieved "significant" results, or that the study was "significant", or that the findings were "significant". Sounds important, but what might they really mean? The term "significance" can take on three distinct meanings. Be careful to pinpoint which one an author is using, and be clear in your own use of this term:
o Statistical significance: used to denote an observed difference found beyond chance alone. Given sufficient sample sizes, even minor differences and small correlations will be statistically significant -- very unlikely due to chance. Statistical significance, even at very low p values (.0001), does not denote big differences or big correlations; it just means it is very unlikely the results are due to chance, nothing more and nothing less
o Theoretical significance: if the results, statistically significant or not, add significantly to our understanding of a theory, supporting, refuting or delineating it in a new, provocative or meaningful way. The failure to reject a null hypothesis, provided the study was correctly carried out, may be very important in the falsification of a theory
o Practical significance: if the results of a study have important real-world, social, practical implications. Here again, there may be practical implications without statistical and/or theoretical significance, or in addition to either or both

***All researchers think that their findings are important. Don't trust their judgment. Read the entire study and decide for yourself if their conclusions are warranted. No scientific decisions are right or wrong -- they are only more or less defensible.
Last, but Not Least...Ethics
o Ethics involve professionalism in the decisions you make concerning a number of issues. I will focus on three: treatment of research subjects, animal and human; analysis and reporting of data; and professional conduct with students and the public
o Associations and organizations such as the American Psychological Association (APA) have their own guidelines for ethics
o As with other professions such as medicine, much of what you do as a professor is not supervised, at least on any regular basis. We rely on ethics and professionalism to ensure you always behave according to the highest standards. Periodically some professor gets caught lying about data, earned degrees, publications, etc., but fortunately these cases are rare
o Data: with regard to data, it would certainly be unethical to make up data, or to tamper with data or fabricate results
o Subjects: it would certainly be unethical to harm or alter subjects emotionally or physically, or to violate their confidentiality and anonymity. Any researcher should debrief subjects, correct any misinformation given for study purposes, and attempt to leave the subjects unchanged as a result of the study. There are exceptions for medical studies, de-sensitization studies, etc., and in these cases subjects must know the risks exactly and sign waivers in quadruplicate. Most universities have human subjects committees that must approve any research involving humans. See the Thesis and Dissertation Guidelines available from the graduate school for more information
o General Professionalism: researchers and/or instructors should not use their status for unprofessional purposes; e.g., do not intimidate students or subjects, do not swap favors for grades, do not use one's title to gain political influence, etc.
