Chapter Thirteen
Linear Regression and Correlation
GOALS
When you have completed this chapter, you will be able to:
ONE
Draw a scatter diagram.
TWO
Understand and interpret the terms dependent variable and
independent variable.
THREE
Calculate and interpret the coefficient of correlation, the coefficient
of determination, and the standard error of estimate.
FOUR
Conduct a test of hypothesis to determine if the population
coefficient of correlation is different from zero.
FIVE
Calculate the least squares regression line and interpret the slope
and intercept values.
SIX
Construct and interpret a confidence interval and prediction
interval for the dependent variable.
SEVEN
Set up and interpret an ANOVA table.
Correlation Analysis
[Figure: a scatter diagram, with Sales ($thousands) on the vertical axis.]
The Dependent Variable is the variable being predicted or estimated.
The Independent Variable provides the basis for estimation. It is the predictor variable.
[Figures: four scatter diagrams of Y against X, each with both axes running from 0 to 10. The only surviving panel title is "Zero Correlation".]
Formula for r
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(n - 1)\, s_x s_y} $$
Coefficient of Determination
Example 1
Dan Ireland, the student body president at Toledo State University, is concerned about the cost of textbooks to students. He believes there is a relationship between the number of pages in a text and the selling price of the book. To provide insight into the problem, he selects a sample of eight textbooks currently on sale in the bookstore. Draw a scatter diagram. Compute the correlation coefficient.
Book                          Pages   Price ($)
Introduction to History        500      84
Basic Algebra                  700      75
Introduction to Psychology     800      99
Introduction to Sociology      600      72
Business Management            400      69
Introduction to Biology        600      81
Fundamentals of Jazz           600      63
Principles of Nursing          800      93
[Figure: scatter diagram for Example 1, with Price ($) on the vertical axis (60 to 100) and Pages on the horizontal axis (400 to 800).]
Computing Σ(X - X̄)(Y - Ȳ)
Pages   Price   Pages - Mean(Pages)   Price - Mean(Price)   (c) x (d)
 (a)     (b)           (c)                   (d)
 500     84           -125                   4.5              -562.5
 700     75             75                  -4.5              -337.5
 800     99            175                  19.5             3,412.5
 600     72            -25                  -7.5               187.5
 400     69           -225                 -10.5             2,362.5
 600     81            -25                   1.5               -37.5
 600     63            -25                 -16.5               412.5
 800     93            175                  13.5             2,362.5
Total                                                        7,800.0
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(n - 1)\, s_x s_y} = \frac{7800}{7(138.87)(12.21)} = 0.657 $$
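The arithmetic for r can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable and function names are illustrative and do not come from the slides.

```python
import math

# Pages (X) and prices (Y) for the eight textbooks in Example 1.
pages = [500, 700, 800, 600, 400, 600, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]

def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

def correlation(x, y):
    """Coefficient of correlation: sum of cross products / ((n - 1) * sx * sy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross / ((n - 1) * sample_std(x) * sample_std(y))

r = correlation(pages, prices)
print(round(r, 3))  # 0.657
```

The script follows the formula for r term by term, so the intermediate values (the sum of cross products, sx, and sy) can be printed and compared with the table if needed.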
Step 1
H0: The correlation in the population is zero.
H1: The correlation in the population is not zero.
Step 2
The significance level is .02.
Step 3
The statistic to use follows the t distribution.
Step 4
H0 is rejected if t > 3.143 or if t < -3.143, or if p < .02. There are 6 degrees of freedom, found by n - 2 = 8 - 2 = 6.
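Step 5
Find the value of the test statistic.
$$ t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{.657\sqrt{8 - 2}}{\sqrt{1 - .657^2}} = 2.135 $$
P(|t| > 2.135) = .077
Because 2.135 is less than 3.143 (and .077 is greater than .02), H0 is not rejected.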
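The test statistic in Step 5 can be reproduced as follows. This is a sketch only: the critical value 3.143 is copied from Step 4 rather than computed, and the variable names are illustrative.

```python
import math

r = 0.657            # sample coefficient of correlation from Example 1
n = 8                # number of textbooks in the sample
critical_t = 3.143   # two-tailed critical value at the .02 level, 6 df (from Step 4)

# t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Decision rule from Step 4: reject H0 if t > 3.143 or t < -3.143.
reject_h0 = abs(t_stat) > critical_t

print(round(t_stat, 3), reject_h0)  # roughly 2.135, False
```

Since the computed t falls between -3.143 and 3.143, the decision rule does not reject H0 at the .02 level.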
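Regression Analysis
The regression equation is Y' = a + bX, where
Y' is the average predicted value of Y for any X.
a is the Y-intercept. It is the estimated Y value when X = 0.
b is the slope of the line, the average change in Y' for each one-unit change in X.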
$$ b = r\,\frac{s_y}{s_x} \qquad a = \bar{Y} - b\bar{X} $$
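Example 1 revisited
Develop a regression equation for the information given in Example 1 that can be used to estimate the selling price based on the number of pages.
$$ b = r\,\frac{s_y}{s_x} = (.657)\frac{12.21}{138.87} = .0578 $$
Using the slope carried to one more digit (.05778), the intercept is
$$ a = \bar{Y} - b\bar{X} = 79.5 - .05778(625) = 43.39 $$
The regression equation is Y' = 43.39 + .0578X.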
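The slope and intercept can be computed directly from the data. The sketch below uses the slide formulas b = r(sy/sx) and a = Ȳ - bX̄ with the standard library `statistics` module; `predict_price` is a hypothetical helper name introduced only for illustration.

```python
import statistics

# Pages (X) and prices (Y) for the eight textbooks in Example 1.
pages = [500, 700, 800, 600, 400, 600, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]

n = len(pages)
mean_x = statistics.mean(pages)    # 625
mean_y = statistics.mean(prices)   # 79.5
s_x = statistics.stdev(pages)      # sample standard deviation of X
s_y = statistics.stdev(prices)     # sample standard deviation of Y

# Coefficient of correlation, using the formula for r.
cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(pages, prices))
r = cross / ((n - 1) * s_x * s_y)

b = r * s_y / s_x          # slope
a = mean_y - b * mean_x    # Y-intercept

def predict_price(num_pages):
    """Estimated selling price Y' = a + b * X (hypothetical helper)."""
    return a + b * num_pages

print(round(b, 4), round(a, 2), round(predict_price(800), 2))
# roughly 0.0578 43.39 89.61
```

For an 800-page book the estimated price is about $89.61, which is the Y' value used in the interval calculations.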
The formula that is used to compute the standard error:
$$ s_{y \cdot x} = \sqrt{\frac{\sum (Y - Y')^2}{n - 2}} $$
Actual price   Estimated price   Deviation   Deviation squared
    (Y)             (Y')          (Y - Y')       (Y - Y')^2
    84             72.28           11.72          137.41
    75             83.83           -8.83           78.03
    99             89.61            9.39           88.15
    72             78.06           -6.06           36.67
    69             66.50            2.50            6.25
    81             78.06            2.94            8.67
    63             78.06          -15.06          226.67
    93             89.61            3.39           11.48
Total                               0.00          593.33
$$ s_{y \cdot x} = \sqrt{\frac{593.33}{8 - 2}} = 9.944 $$
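The standard error of estimate can be recomputed from the residuals. This sketch assumes the least squares line is fitted with the slope written as the sum of cross products divided by the sum of squared X deviations, which is algebraically the same as b = r(sy/sx); the variable names are illustrative.

```python
import math

# Pages (X) and prices (Y) for the eight textbooks in Example 1.
pages = [500, 700, 800, 600, 400, 600, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]

n = len(pages)
mean_x = sum(pages) / n
mean_y = sum(prices) / n

# Least squares slope and intercept (equivalent to b = r * sy / sx).
sxx = sum((x - mean_x) ** 2 for x in pages)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pages, prices))
b = sxy / sxx
a = mean_y - b * mean_x

# Deviations Y - Y' and their squares, as in the table.
residuals = [y - (a + b * x) for x, y in zip(pages, prices)]
sse = sum(e ** 2 for e in residuals)

# Standard error of estimate: sqrt(sum of squared deviations / (n - 2)).
std_error = math.sqrt(sse / (n - 2))

print(round(sse, 2), round(std_error, 3))  # roughly 593.33 and 9.944
```

Individual squared deviations may differ from the table in the second decimal place because the table rounds the estimated prices before squaring.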
Confidence Interval
$$ Y' \pm t\,(s_{y \cdot x}) \sqrt{\frac{1}{n} + \frac{(X - \bar{X})^2}{\sum (X - \bar{X})^2}} $$
Y' is the predicted value for any selected X value
X is any selected value of X
X̄ is the mean of the Xs
n is the number of observations
sy.x is the standard error of the estimate
t is the value of t at n - 2 degrees of freedom
Example 1 revisited
Y', the predicted value, is $89.61
X is 800 pages
X̄ is 625, the mean of the pages
n is 8, the number of observations
sy.x is 9.944, the standard error of the estimate
t is 2.447 at 8 - 2 = 6 degrees of freedom and 95% confidence
$$ Y' \pm t\,(s_{y \cdot x}) \sqrt{\frac{1}{n} + \frac{(X - \bar{X})^2}{\sum (X - \bar{X})^2}} = 89.61 \pm 9.944(2.447)\sqrt{\frac{1}{8} + \frac{(800 - 625)^2}{135{,}000}} = 89.61 \pm 14.433 $$
Prediction Interval
$$ Y' \pm t\,(s_{y \cdot x}) \sqrt{1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum (X - \bar{X})^2}} $$
$$ 89.61 \pm 9.944(2.447)\sqrt{1 + \frac{1}{8} + \frac{(800 - 625)^2}{135{,}000}} = 89.61 \pm 28.292 $$
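Both intervals for an 800-page book can be checked with the sketch below. All inputs (89.61, 9.944, 2.447, 625, and 135,000) are copied from the slides rather than recomputed, and the variable names are illustrative.

```python
import math

y_hat = 89.61        # predicted price Y' for an 800-page book
std_error = 9.944    # standard error of estimate
t_value = 2.447      # t at 6 degrees of freedom, 95% confidence (from the text)
n = 8                # number of observations
x = 800              # selected value of X
mean_x = 625         # mean number of pages
sxx = 135_000        # sum of squared deviations of X from its mean

# Term shared by both intervals: 1/n + (X - mean)^2 / sum of squared deviations.
shared = 1 / n + (x - mean_x) ** 2 / sxx

# Confidence interval for the mean price of all 800-page books.
ci_half = t_value * std_error * math.sqrt(shared)
confidence_interval = (y_hat - ci_half, y_hat + ci_half)

# Prediction interval for the price of one particular 800-page book.
pi_half = t_value * std_error * math.sqrt(1 + shared)
prediction_interval = (y_hat - pi_half, y_hat + pi_half)

print(round(ci_half, 2), round(pi_half, 2))  # roughly 14.43 and 28.29
```

The only difference between the two formulas is the extra 1 under the square root, which is why the prediction interval is about twice as wide here.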
MINITAB output

Regression Analysis
The regression equation is
Price = 43.4 + 0.0578 No of Pages

Predictor      Coef       StDev      T       P
Constant       43.39      17.28      2.51    0.046
No of Pages    0.05778    0.02706    2.13    0.077

S = 9.944   R-Sq = 43.2%   R-Sq(adj) = 33.7%

Analysis of Variance
Source        DF    SS         MS        F       P
Regression     1    450.67     450.67    4.56    0.077
Error          6    593.33     98.89
Total          7    1044.00
EXCEL output

Regression Statistics
Multiple R           0.657
R Square             0.432
Adjusted R Square    0.337
Standard Error       9.944
Observations         8

ANOVA
              df    SS        MS        F            Significance F
Regression     1    450.67    450.67    4.5573034    0.0767
Residual       6    593.33    98.89
Total          7    1044

             Coefficients    Standard Error    t Stat    P-value
Intercept    43.3889         17.277            2.511     0.0458193
Page         0.0578          0.027             2.135     0.0767009
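The sums of squares, F statistic, and R-squared in the ANOVA tables can be rebuilt from the raw data. This is a minimal sketch with illustrative variable names. It does not compute the p-value, which would need the F distribution.

```python
# Pages (X) and prices (Y) for the eight textbooks in Example 1.
pages = [500, 700, 800, 600, 400, 600, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]

n = len(pages)
mean_x = sum(pages) / n
mean_y = sum(prices) / n

# Least squares line Y' = a + bX.
sxx = sum((x - mean_x) ** 2 for x in pages)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pages, prices))
b = sxy / sxx
a = mean_y - b * mean_x

# Sums of squares for the ANOVA table.
ss_total = sum((y - mean_y) ** 2 for y in prices)
ss_error = sum((y - (a + b * x)) ** 2 for x, y in zip(pages, prices))
ss_regression = ss_total - ss_error

# Degrees of freedom: 1 for regression, n - 2 for error.
df_regression = 1
df_error = n - 2

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_stat = ms_regression / ms_error

# Coefficient of determination: share of total variation explained.
r_squared = ss_regression / ss_total

print(round(ss_regression, 2), round(ss_error, 2), round(ss_total, 2))
print(round(f_stat, 2), round(r_squared, 3))
```

With one independent variable, F is the square of the t statistic for the slope (2.135 squared is about 4.56), which is why both tests report the same p-value of .077.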