Statistics - A branch of mathematics dealing with the collection, analysis, interpretation, and
presentation of masses of numerical data, especially for the purpose of inferring proportions in
a whole from those in a representative sample.
Descriptive statistics are a set of brief descriptive coefficients that summarize a given data
set, which can either be a representation of the entire population or a sample. The measures
used to describe the data set are measures of central tendency and measures of variability or
dispersion.
Inferential statistics - Mathematical methods that employ probability theory for deducing (inferring) the
properties of a population from the analysis of the properties of a data sample drawn from it. It is
also concerned with the precision and reliability of the inferences it helps to draw.
Sample - A selection taken from a larger group (the "population") so that you can examine it to
find out something about the larger group.
Categorical. Categorical variables take on values that are names or labels. The color of a ball (e.g., red,
green, blue) or the breed of a dog (e.g., collie, shepherd, terrier) would be examples of categorical
variables.
Quantitative. Quantitative variables are numerical. They represent a measurable quantity. For example,
when we speak of the population of a city, we are talking about the number of people in the city - a
measurable attribute of the city. Therefore, population would be a quantitative variable.
Nominal data, or categorical data, assign names or labels (sometimes written as numerical codes)
as an attribute of an object, animal, person, or any other non-numerical item.
Ordinal data are data that can be ordered and ranked, but not measured, such as levels of
achievement, prizes, rankings, and placements. Like nominal data, ordinal data cannot be
meaningfully multiplied, divided, added, or subtracted.
Nominal: Nominal data have no order and thus only give names or labels to various categories.
Ordinal: Ordinal data have order, but the interval between measurements is not meaningful.
Interval: Interval data have meaningful intervals between measurements, but there is no true starting
point (zero).
Ratio: Ratio data have the highest level of measurement. Ratios between measurements as well as
intervals are meaningful because there is a starting point (zero).
Continuous data can take any value within a range, so they have infinitely many possible values.
Discrete data are numeric data that can only take certain distinct values, such as whole-number
counts.
A probability sampling method is any method of sampling that utilizes some form of random
selection. In order to have a random selection method, you must set up some process or
procedure that assures that the different units in your population have equal probabilities of
being chosen. Humans have long practiced various forms of random selection, such as picking a
name out of a hat, or choosing the short straw. These days, we tend to use computers as the
mechanism for generating random numbers as the basis for random selection. Common probability
sampling methods are simple random, systematic, and stratified sampling.
In non-probability sampling, subjects are chosen to be part of the sample in non-random ways, so
the probability that a given unit is selected is not known. Three non-probability sampling
methods are convenience, quota, and judgmental sampling; voluntary response surveys are another
example.
Simple random sampling is the basic sampling technique where we select a group of subjects
(a sample) for study from a larger group (a population). Each individual is chosen entirely by
chance and each member of the population has an equal chance of being included in the sample.
Every possible sample of a given size has the same chance of selection.
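As a minimal sketch of simple random sampling, the Python snippet below draws a sample without replacement using the standard library. The population of ID numbers and the function name simple_random_sample are made up for illustration and do not come from these notes.

```python
import random

# Hypothetical population: ID numbers 1..50 (illustrative only).
population = list(range(1, 51))

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(units, n)

sample = simple_random_sample(population, 5, seed=1)
print(sample)
```

random.sample selects without replacement, so no member can appear twice in the sample.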
Slovin's formula is used to calculate an appropriate sample size from a population, given the
population size and a chosen margin of error.
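These notes do not write the formula out. The form usually cited for Slovin's formula is n = N / (1 + N * e^2), where N is the population size and e is the margin of error. The sketch below assumes that form; the function name and the choice to round up are illustrative assumptions.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Sample size from the commonly cited form n = N / (1 + N * e^2)."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    # Rounding up is an assumption made here so the sample is never too small.
    return math.ceil(n)

# Example: population of 1,000 with a 5% margin of error.
print(slovin_sample_size(1000, 0.05))  # 1000 / 3.5 = 285.7..., rounded up to 286
```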
Systematic sampling is a method of choosing a random sample from among a larger population. The process of systematic sampling
typically involves first selecting a fixed starting point in the larger population and then obtaining subsequent
observations by using a constant interval between samples taken. Hence, if the total population was 1,000, a
random systematic sampling of 100 data points within that population would involve observing every 10th data
point.
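The 1,000-to-100 example above can be sketched as follows. The random starting point within the first interval is an assumption (the notes only say the starting point is fixed), and the function name is hypothetical.

```python
import random

def systematic_sample(units, n, seed=None):
    """Take every k-th unit after a starting point, where k = len(units) // n."""
    k = len(units) // n          # sampling interval (10 for 1,000 -> 100)
    rng = random.Random(seed)
    start = rng.randrange(k)     # assumed: start chosen at random in the first interval
    return units[start::k][:n]

population = list(range(1, 1001))   # hypothetical population of 1,000 units
sample = systematic_sample(population, 100, seed=0)
print(len(sample), sample[:5])
```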
Stratified sampling is a probability sampling technique wherein the researcher divides the entire
population into different subgroups or strata, then randomly selects the final subjects
proportionally from the different strata.
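A minimal sketch of proportional stratified sampling, assuming each stratum is given as a list of its members. The strata names, sizes, and the function name are invented for illustration.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Randomly sample from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding can make the total differ slightly from total_n.
        n_stratum = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n_stratum)
    return sample

# Hypothetical strata of sizes 60, 30, and 10.
strata = {
    "first_year": [f"F{i}" for i in range(60)],
    "second_year": [f"S{i}" for i in range(30)],
    "third_year": [f"T{i}" for i in range(10)],
}
sample = stratified_sample(strata, 10, seed=42)
print({name: len(members) for name, members in sample.items()})
```

With these sizes, a sample of 10 takes 6, 3, and 1 members from the three strata.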
Cluster sampling is the sampling method where entire groups (clusters) within a population are
selected as the sample. This is different from stratified sampling in that you use every member
of each selected cluster, rather than randomly selected members from all groups.
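The contrast with stratified sampling can be sketched as below. The snippet assumes the clusters themselves are chosen at random, which the notes do not state explicitly; the cluster names and function name are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters and keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # assumed: clusters picked at random
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical clusters, e.g. classrooms in a school.
clusters = {
    "room_a": ["a1", "a2", "a3"],
    "room_b": ["b1", "b2"],
    "room_c": ["c1", "c2", "c3", "c4"],
    "room_d": ["d1", "d2", "d3"],
}
selected = cluster_sample(clusters, 2, seed=7)
print(selected)
```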
Census. A census is a study that obtains data from every member of a population. In most studies, a
census is not practical, because of the cost and/or time required.
Sample survey. A sample survey is a study that obtains data from a subset of a population, in order to
estimate population attributes.
Experiment. An experiment is a controlled study in which the researcher attempts to understand cause-and-effect relationships. The study is "controlled" in the sense that the researcher controls (1) how
subjects are assigned to groups and (2) which treatments each group receives.
In the analysis phase, the researcher compares group scores on some dependent variable. Based on the
analysis, the researcher draws a conclusion about whether the treatment (independent variable) had a
causal effect on the dependent variable.
Data can be presented in various forms depending on the type of data collected. A frequency distribution is a
table showing how often each value (or set of values) of the variable in question occurs in a data set. A
frequency table is used to summarize categorical or numerical data. Frequencies are also presented as relative
frequencies, that is, the percentage of the total number in the sample.
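As a small illustration of a frequency table with relative frequencies, the sketch below counts a made-up list of categorical values. The data and the function name frequency_table are hypothetical.

```python
from collections import Counter

# Hypothetical categorical data (ball colors).
colors = ["red", "blue", "red", "green", "blue", "red", "red", "green"]

def frequency_table(values):
    """Map each value to (frequency, relative frequency as a percentage)."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, 100 * count / total) for value, count in counts.items()}

table = frequency_table(colors)
for value, (count, percent) in sorted(table.items()):
    print(f"{value:<6} {count:>3} {percent:6.1f}%")
```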
GRAPHICAL METHODS:
Frequency distributions are usually illustrated graphically by plotting various types of graphs:
Bar graph - A bar graph is a way of summarizing a set of categorical data. It displays the data using a
number of rectangles, of the same width, each of which represents a particular category. Bar graphs can
be displayed horizontally or vertically and they are usually drawn with a gap between the bars
(rectangles).
Histogram - A histogram is a way of summarizing data that are measured on an interval scale (either
discrete or continuous). It is often used in exploratory data analysis to illustrate the features of the
distribution of the data in a convenient form.
Pie chart - A pie chart is used to display a set of categorical data. It is a circle, which is divided into
segments. Each segment represents a particular category. The area of each segment is proportional to the
number of cases in that category.
Line graph - A line graph is particularly useful when we want to show the trend of a variable over time.
Time is displayed on the horizontal axis (x-axis) and the variable is displayed on the vertical axis (y-axis).
1. Measures of central tendency:
Measures of central tendency are measures of the location of the middle or the center of a distribution.
The most frequently used measures of central tendency are the mean, median and mode.
A measure of central tendency (also referred to as a measure of centre or central location) is a summary measure that
attempts to describe a whole set of data with a single value that represents the middle or centre of its distribution.
There are three main measures of central tendency: the mode, the median and the mean. Each of these measures gives a
different indication of the typical or central value in the distribution.
[Frequency table of ages in years (54, 55, 56, 57, 58, 60); the frequency column was not recovered.]
The most commonly occurring value is 54, therefore the mode of this distribution is 54 years.
The population mean is indicated by the Greek symbol μ (pronounced mu). When the mean is calculated on a distribution
from a sample it is indicated by the symbol x̄ (pronounced x-bar).
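The three measures of central tendency can be computed with Python's standard statistics module. The ages below are hypothetical values chosen for illustration; they are not the frequencies from the original table.

```python
import statistics

# Hypothetical sample of ages in years (illustrative only).
ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60]

sample_mean = statistics.mean(ages)      # x-bar: sum of values divided by their count
sample_median = statistics.median(ages)  # middle value; average of the two middle values here
sample_mode = statistics.mode(ages)      # most commonly occurring value

print(sample_mean, sample_median, sample_mode)
```

Here the mode is 54 because it occurs three times, more often than any other age in this made-up sample.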