
Stats Review Midterm

1. Scales of Measurement
   a. Continuous (entities get a distinct score)
      i. Ratio: the same as interval, but the ratios of scores on the scale must also make sense (an anxiety score of 16 represents twice as much anxiety as a score of 8); statistics supported: all types
      ii. Interval: equal intervals on the variable represent equal differences in the property being measured (the difference between 6 and 8 is equivalent to the difference between 13 and 15); statistics supported: all
   b. Categorical (entities are divided into distinct categories)
      i. Ordinal: the same as a nominal variable, but the categories have a logical order (whether people got a fail, a pass, a merit, or a distinction in their exam); statistics supported: frequency and order (median)
      ii. Nominal: there are more than two categories (whether someone is an omnivore, vegetarian, vegan, or fruitarian); statistics supported: counts (frequency)
      iii. Binary: there are only two categories (dead or alive)

2. Graphs

   a.
   b.
   c.
   d.
   e.
   f. histogram
   g. scatter plot
   h. boxplot
   i. error bar graph

   j. line graph

3. Identify the shape of a distribution using a graph, the relationship between the mean and the median, or the relationship between the standard deviation and the interquartile range
   a. Mean / Median
      i. Median: the middle score when scores are ranked in order
      ii. Mean: a measure of central tendency (the sum of scores divided by the number of scores)
         1. Central tendency: where the center of a frequency distribution lies. One disadvantage of the mean is that it can be influenced by extreme scores; compare this with the median, which hardly changes if we include or exclude extreme scores. The mean is also affected by skewed distributions and can be used only with interval and ratio data. The mean does, however, tend to be stable in different samples.
   b. Standard Deviation / Interquartile Range
      i. Interquartile range: the limits within which the middle 50% of an ordered set of observations falls. It is the difference between the value of the upper quartile and the value of the lower quartile.
         1. Upper quartile: cuts off the highest 25% of scores
         2. Lower quartile: cuts off the lowest 25% of scores
         3. Median: the second quartile
      ii. Standard deviation: an estimate of the average variability (spread) of a set of data, measured in the same units of measurement as the original data. It is the square root of the variance.
         1. Variance: an estimate of the average variability (spread) of a set of data. It is the sum of squares divided by one less than the number of values on which the sum of squares is based (N - 1).
         2. Sum of squares (SS): an estimate of the total variability (spread) of a set of data. First the deviance for each score is calculated, and then this value is squared; the SS is the sum of these squared deviances.
         3. Deviance: the difference between the observed value of a variable and the value of that variable predicted by a statistical model.
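These measures are easy to verify by hand. Below is a minimal Python sketch (numpy only; the scores are invented for illustration) that computes the mean, median, interquartile range, and the deviance -> sum of squares -> variance -> standard deviation chain, with and without an extreme score, to show which measures outliers pull around:

```python
import numpy as np

def summarize(label, x):
    q1, q3 = np.percentile(x, [25, 75])   # lower and upper quartiles
    deviances = x - x.mean()              # observed value minus the mean (the model)
    ss = np.sum(deviances ** 2)           # sum of squared deviances
    variance = ss / (len(x) - 1)          # SS divided by N - 1
    sd = np.sqrt(variance)                # square root of the variance
    print(f"{label}: mean={x.mean():.2f} median={np.median(x):.2f} "
          f"IQR={q3 - q1:.2f} SD={sd:.2f}")

scores = np.array([8.0, 9.0, 10.0, 10.0, 11.0, 12.0, 13.0])  # invented scores
summarize("original    ", scores)
summarize("with outlier", np.append(scores, 95.0))
# The mean and SD jump when the extreme score is added;
# the median and IQR barely move.
```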

Two main ways a distribution can deviate from normal:
   o Skew: lack of symmetry; scores cluster at one end. Positive values of skewness indicate a pile-up of low scores; negative values indicate a build-up of high scores.
   o Kurtosis: pointiness. Leptokurtic means positive kurtosis (heavy-tailed); platykurtic means negative kurtosis (flatter, thin-tailed).
The further these values are from zero, the more likely it is that the data are not normally distributed. Significance tests of skew and kurtosis should not be used in large samples, because they are likely to be significant even when the skew and kurtosis are not very different from normal (a sketch of these tests follows).
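To see these shape statistics in practice, here is a minimal Python sketch using scipy (the positively skewed sample is generated, not real data) that computes skew and kurtosis and runs their significance tests; per the caveat above, the tests are best reserved for small-to-moderate samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.lognormal(size=60)  # generated, positively skewed sample

# Values far from zero suggest deviation from normality.
print(f"skew = {stats.skew(x):.2f}, excess kurtosis = {stats.kurtosis(x):.2f}")

# z-tests for skew and kurtosis (avoid in large samples, where
# trivial deviations come out significant).
z_skew, p_skew = stats.skewtest(x)
z_kurt, p_kurt = stats.kurtosistest(x)
print(f"skew test: z = {z_skew:.2f}, p = {p_skew:.4f}")
print(f"kurtosis test: z = {z_kurt:.2f}, p = {p_kurt:.4f}")
```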

4. Assumptions for analysis
   a. Normality: the data are normally distributed. We cannot simply look at the shape of the data to determine whether they are normally distributed. The assumption of normality is also important in research using regression. Essentially, we can look for normality visually, look at values that quantify aspects of a distribution (skew, kurtosis), and compare the distribution we have to a normal distribution to see if it is different.
      i. P-P plot (probability-probability plot): this graph plots the cumulative probability of a variable against the cumulative probability of a particular distribution. The data are ranked and sorted; for each rank the corresponding z-score is calculated, which is the value the score would be expected to have in a normal distribution.
      ii. Converting scores to z-scores is useful because we can then compare skew and kurtosis values across different samples, and we can see how likely our values of skew and kurtosis are to occur.
      iii. Kolmogorov-Smirnov test and Shapiro-Wilk test
      iv. What is important is not the overall distribution but the distribution in each group.
      v. Q-Q plot: plots the values you would expect to get if the distribution were normal (expected values) against the values actually seen in the data set (observed values); see the sketch below.
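A minimal sketch of the Q-Q check from item v, using scipy and matplotlib (the sample is generated for illustration; a P-P plot as in item i would compare cumulative probabilities instead):

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x = rng.normal(loc=50, scale=10, size=100)  # generated sample

# Q-Q plot: observed quantiles vs. those expected under normality.
# Points hugging the reference line suggest the data are roughly normal.
stats.probplot(x, dist="norm", plot=plt)
plt.title("Q-Q plot against a normal distribution")
plt.show()
```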

   b. Homogeneity of variance: the assumption that the variance of one variable is stable (relatively similar) at all levels of another variable.
      i. This assumption means that as you go through levels of one variable, the variance of the other should not change.
      ii. If the spread of scores differs across levels, the variances are heterogeneous.
      iii. Although the means may increase across levels, the spread of scores is the same at each level; this is what we mean by homogeneity of variance.
      iv. Test for it with Levene's test (a minimal sketch follows).
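A minimal sketch of the Levene's test mentioned in item iv, using scipy; the three groups' scores are generated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Generated scores for three groups with similar spreads.
g1 = rng.normal(10, 2, 30)
g2 = rng.normal(12, 2, 30)
g3 = rng.normal(14, 2, 30)

stat, p = stats.levene(g1, g2, g3)
print(f"Levene's test: W = {stat:.2f}, p = {p:.3f}")
# p > .05 here would mean homogeneity of variance can be assumed.
```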

5. Interpret the K-S test, Levene's test, and the Shapiro-Wilk test
   a. Kolmogorov-Smirnov test: a test of whether a distribution of scores is significantly different from a normal distribution. A significant value indicates a deviation from normality, but this test is notoriously affected by large samples.
      i. Inspect a Q-Q plot alongside the test.
      ii. Compare the significance value against .05.
      iii. The K-S test can be used to see whether a distribution of scores significantly differs from a normal distribution (see the sketch below).
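A minimal sketch of running a K-S test with scipy (the sample is generated for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(loc=50, scale=10, size=80)  # generated sample

# scipy's kstest compares against a standard normal, so standardize first.
# (Estimating the mean and SD from the same data makes the test
# conservative; the Lilliefors correction addresses this.)
D, p = stats.kstest(stats.zscore(x), "norm")
print(f"K-S: D = {D:.3f}, p = {p:.3f}")
# p < .05 would indicate a significant deviation from normality.
```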

   b. Levene's test: tests the hypothesis that the variances in different groups are equal (that the difference between the variances is zero). It basically does a one-way ANOVA on the deviation scores (the absolute differences between each score and the mean of its group). A significant result indicates that the variances are significantly different; therefore, the assumption of homogeneity of variance has been violated.
      i. If the significance value in the SPSS table is less than .05, then the variances are significantly different in different groups; otherwise, homogeneity of variance can be assumed.
   c. Shapiro-Wilk test: does the same thing as the K-S test, but it has more power to detect differences from normality (so you might find this test is significant when the K-S test is not); a comparison sketch follows the vocabulary list.

6. SPSS pictures

7. Vocabulary
   a. Null hypothesis
   b. Falsification
   c. Frequency distribution
   d. Alternative hypothesis
   e. Z-scores
   f. α (alpha) level
   g. β (beta) level
   h. Central limit theorem
   i. Confidence interval
   j. Meta-analysis
   k. Type I error
   l. Type II error
   m. Regression line
   n. Parametric test
   o. Transformation
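The comparison sketch promised in item 5c: running both normality tests on the same generated, mildly skewed sample. With modest deviations from normality, Shapiro-Wilk tends to reach significance sooner than the K-S test, though the exact p-values depend on the sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.chisquare(df=20, size=40)  # generated, mildly skewed sample

ks = stats.kstest(stats.zscore(x), "norm")
sw = stats.shapiro(x)
print(f"K-S:          D = {ks.statistic:.3f}, p = {ks.pvalue:.3f}")
print(f"Shapiro-Wilk: W = {sw.statistic:.3f}, p = {sw.pvalue:.3f}")
```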
