10:42 AM
Correlation (r) = measure of the strength and direction of the linear relationship between two variables
Values of r range from -1 to 1
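A minimal sketch of how r could be computed by hand, to show the -1 to 1 range. The data and the function name `correlation` are made up for illustration; they are not from the lecture.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
r_up = correlation(x, [2, 4, 6, 8, 10])    # points on a rising line: r near 1
r_down = correlation(x, [10, 8, 6, 4, 2])  # points on a falling line: r near -1
r_weak = correlation(x, [3, 1, 4, 1, 5])   # scattered points: r between -1 and 1
```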
Regression line
Explanatory variable (x) - horizontal axis
Response variable (y) - vertical axis
Homework
Regression line - explains how values of response variable change in relation to the value of the explanatory variable
Use the line to predict the value of the response variable for a given value of the explanatory variable
Need mathematical formula
Algebra y=mx+b
m=slope=rise/run=change in y / change in x
b=y-intercept=point where line crosses the y axis
Statistics
y^=b0+b1X - put intercept first, then slope
Y hat - predicted values
Values on the regression line
Summarizes the relationship between x and y
Observed values are called y
Points in the scatterplot
Residual values are called e
Residual values are the difference between predicted vs observed values
Error - amount of variation in y model cannot account for
e=y-y^
Most common way to place the regression line is least squares
Puts the line where the sum of the squared errors is as small as possible
y^=b0+b1X
b1=r(sy/sx)
b0=y_-b1x_
Regression line always goes through the point (x_, y_)
r is connected to the value of the slope: b1 has the same sign as r
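The slope and intercept formulas can be sketched in Python as below. This is an illustration under assumptions: the data and the function name `least_squares_line` are made up, and sx, sy are taken to be sample standard deviations.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (b0, b1) using b1 = r(sy/sx) and b0 = y_ - b1*x_."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / ((len(xs) - 1) * s_x * s_y)  # correlation
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(x, y)

# Plugging x_ into the line gives back y_, so the line passes through (x_, y_)
y_hat_at_mean = b0 + b1 * mean(x)
```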
Predicted value
- predicted gas consumption when degree days is 43
y^ =1.089+0.189(43)=9.216
Predict gas consumption when degree days is 24 - put in 24 instead of 43
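The two predictions can be checked with a short sketch. The intercept 1.089 and slope 0.189 come from the notes; the function name `predict_gas` is made up for illustration.

```python
B0 = 1.089  # intercept from the notes
B1 = 0.189  # slope from the notes

def predict_gas(degree_days):
    """Predicted gas consumption y^ = b0 + b1*x."""
    return B0 + B1 * degree_days

y_hat_43 = predict_gas(43)  # about 9.216, matching the notes
y_hat_24 = predict_gas(24)  # about 5.625
```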
Observed value is just the letter y (no hat)
e=y-y^
How far the observed value is from the predicted value (observed - predicted)
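A small sketch of the residual calculation for the gas example. The predicted value uses the line from the notes; the observed value of 10.0 is hypothetical, since the notes do not give the observed data.

```python
def residual(observed, predicted):
    """e = y - y^ (observed minus predicted)."""
    return observed - predicted

y_hat = 1.089 + 0.189 * 43       # predicted value from the notes, about 9.216
y_observed = 10.0                # hypothetical observed value, not from the notes
e = residual(y_observed, y_hat)  # about 0.784

# Positive e: the observed point is above the line
# Negative e: the observed point is below the line
point_is_above_line = e > 0
```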
JMP
Printout
- Observed y
Mean - y_
Standard deviation - sy
- Predicted y^
Mean - y_
Standard deviation - sy^
- Residuals e
Mean - 0
Standard deviation - se
Square the standard deviation to get the variance
(sy)^2=(sy^)^2+(se)^2
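The variance split can be checked numerically. A minimal sketch with made-up data; the slope is computed with the standard sum-of-products form, which is equivalent to r(sy/sx).

```python
from statistics import mean, variance

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Least squares slope and intercept
x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]         # predicted values
e = [yi - yh for yi, yh in zip(y, y_hat)]  # residuals

var_y = variance(y)          # variance of observed values
var_y_hat = variance(y_hat)  # variance of predicted values
var_e = variance(e)          # variance of residuals
# var_y should equal var_y_hat + var_e (up to rounding)
```

With this data the residuals average to 0 and the predicted values average to y_, as the printout summary says.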
Variation in observed values, (sy)^2, can be separated into
- Part explained by the least squares regression model, (sy^)^2
- Part the model cannot explain (residuals), (se)^2
R^2 = (sy^)^2/(sy)^2
Ratio written as a percentage
0 to 100%
Points far from the line - closer to 0%, less of the variability is explained
Closer to line - closer to 100%
R2 can also be a proportion from 0-1
R^2 is the percentage of variation in the observed values of the response variable that can be explained with the linear regression model with x
If you have r, you can square it to get R^2
(r)^2 = R^2
r = +/- sqrt(R^2)
Have to look at the graph to know if it is + or -
JMP gives you R^2
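A sketch of going from R^2 back to r. The sign is taken from the slope, which points the same way as the graph because b1 = r(sy/sx) and standard deviations are positive. The R^2 value of 0.64 and the function name `r_from_r_squared` are made up for illustration.

```python
from math import sqrt

def r_from_r_squared(r_squared, slope):
    """Recover r from R^2 (as a proportion), taking the sign from the slope."""
    magnitude = sqrt(r_squared)
    return magnitude if slope >= 0 else -magnitude

# Hypothetical R^2 of 0.64 (64%)
r_rising = r_from_r_squared(0.64, slope=0.189)  # line goes up: r is positive
r_falling = r_from_r_squared(0.64, slope=-2.5)  # line goes down: r is negative
```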
Outliers in x-direction
Influential point
- Observations that affect the placement of the regression line
Non-influential
Outliers in the x-direction can greatly affect the result
What to do with outliers
- Make sure data points are recorded correctly
- Collect more data
- Conduct analysis with and without the outlier
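Running the analysis with and without an outlier could look like the sketch below. The data are made up, with the last point placed far out in the x-direction; `fit_line` is a hypothetical helper, not JMP output.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least squares intercept b0 and slope b1."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Made-up data; the last point (x = 20) is an outlier in the x-direction
x = [1, 2, 3, 4, 5, 20]
y = [2, 4, 5, 4, 5, 2]

b0_with, b1_with = fit_line(x, y)                  # all points
b0_without, b1_without = fit_line(x[:-1], y[:-1])  # outlier dropped

# A large change in slope suggests the point is influential
slope_change = b1_without - b1_with
```

In this made-up example the slope is positive without the outlier and negative with it, so the single point changes the direction of the line.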
Stat 101 Page 2