Forecasting

Homework Assignment 4
MEMO

1. Draw a graph to represent the relationship. Ensure that bookings represent the
dependent variable (Y-axis) and Income the independent variable (X-axis).

[Figure: Relationship between Airline Bookings and Income. Scatter plot of Bookings (Y-axis, 800 to 1500) against Income (X-axis, 20000 to 50000), with fitted trend line y = 0.0194x + 371.68.]

2. Develop a regression model.


Regression Statistics
Multiple R           0.879189194
R Square             0.772973638
Adjusted R Square    0.750271002
Standard Error       78.16734951
Observations         12

ANOVA
             df   SS            MS            F            Significance F
Regression   1    208036.3214   208036.3214   34.0477481   0.000164917
Residual     10   61101.3453    6110.13453
Total        11   269137.6667

               Coefficients   Standard Error   t Stat        P-value       Lower 95%
Intercept      371.6758379    128.5571439      2.891133286   0.01607593    85.23267088
X Variable 1   0.019381393    0.00332155       5.835044825   0.000164917   0.011980518

3. Write down the equation of the regression model.

Bookings= 371.68 + 0.01938(Income)

4. Describe the relationship between bookings and income in words, also, comment
on the significance of the relationship.

For every $1 increase in Income per town, Airline Bookings increase by 0.0194 on average.
Additionally, because the P-value of the Income coefficient is 0.000165 (well below 0.05),
the relationship is statistically significant.

5. Make a point and approximate 95% confidence interval estimate for another city
with given income of $39020.

Y = 371.68 + 0.01938(39020)
= 1127.937789

Approximate 95% confidence interval for the predicted value:
1127.94 +/- 2*(Standard error of estimate)
= 1127.94 +/- 2*(78.16734951)
= 1127.94 +/- 156.33

Therefore, the estimate lies between 971.60 and 1284.27 when income is $39,020.
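
The point estimate and the approximate interval can be reproduced with a short script. This is a minimal sketch that hard-codes the coefficients and standard error from the regression output; the function names are illustrative, and the multiplier of 2 is the approximation used in this memo rather than an exact t critical value.

```python
# Coefficients and standard error of the estimate, copied from the regression output.
INTERCEPT = 371.6758379
SLOPE = 0.019381393
STD_ERROR_OF_ESTIMATE = 78.16734951


def predict_bookings(income):
    """Point estimate of bookings for a given income (hypothetical helper)."""
    return INTERCEPT + SLOPE * income


def approx_95_interval(income, multiplier=2.0):
    """Approximate 95% interval: point estimate +/- 2 standard errors of the estimate."""
    point = predict_bookings(income)
    margin = multiplier * STD_ERROR_OF_ESTIMATE
    return point - margin, point + margin


point = predict_bookings(39020)
lower, upper = approx_95_interval(39020)
print(f"point = {point:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```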

6. Use the ANOVA part of the regression results to calculate the coefficient of
determination.

R2 = Regression SS / Total SS = 208036.3214 / 269137.6667 = 0.772973638. This means 77.297% of the
variation in the amount of bookings (y) is explained by the variation in Income (x). The remaining
22.7% is unexplained, i.e. due to error.
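
As a cross-check, the ANOVA sums of squares can be turned into the coefficient of determination directly. The sketch below also derives the adjusted R Square and the F statistic from the same table; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
SS_REGRESSION = 208036.3214
SS_RESIDUAL = 61101.3453
SS_TOTAL = 269137.6667
DF_REGRESSION = 1
DF_RESIDUAL = 10
DF_TOTAL = 11

# Share of the total variation explained by the regression.
r_squared = SS_REGRESSION / SS_TOTAL

# Share left unexplained (error).
unexplained = SS_RESIDUAL / SS_TOTAL

# Adjusted R Square compares mean squares rather than raw sums of squares.
adjusted_r_squared = 1 - (SS_RESIDUAL / DF_RESIDUAL) / (SS_TOTAL / DF_TOTAL)

# F statistic: mean square regression over mean square residual.
f_statistic = (SS_REGRESSION / DF_REGRESSION) / (SS_RESIDUAL / DF_RESIDUAL)

print(round(r_squared, 6), round(unexplained, 6))
print(round(adjusted_r_squared, 6), round(f_statistic, 4))
```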

7. Use the ANOVA part to calculate the standard error of the estimate.

ANOVA
             df   SS
Regression   1    208036.3214
Residual     10   61101.3453
Total        11   269137.6667

Standard Error of Estimate = SQRT(Residual Sum of Squares / Degrees of freedom)
= SQRT(61101.34/10)
= 78.167

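
The same calculation as a small script, assuming the residual degrees of freedom are n minus the two estimated parameters (intercept and slope); the variable names are illustrative.

```python
import math

SS_RESIDUAL = 61101.3453   # residual sum of squares from the ANOVA table
N_OBSERVATIONS = 12
N_PARAMETERS = 2           # intercept and slope

df_residual = N_OBSERVATIONS - N_PARAMETERS       # 12 - 2 = 10
mean_square_error = SS_RESIDUAL / df_residual     # the MS value in the Residual row
standard_error_of_estimate = math.sqrt(mean_square_error)

print(round(standard_error_of_estimate, 3))
```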
8. Show how the t-value for bookings was calculated, and show the formulas in Excel
to calculate the p-value for bookings. …. Continue

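t Stat = Coefficient / Standard Error

               Coefficients   Standard Error   t Stat
Intercept      371.6758379  / 128.5571439    = 2.891133286
X Variable 1   0.019381393  / 0.00332155     = 5.835044825

P-value = TDIST(t Stat, n-2, 2), with n-2 = 12-2 = 10:
=TDIST(2.891133286, 10, 2) = 0.016 (Intercept)
=TDIST(5.835044825, 10, 2) = 0.000165 (X Variable 1, Income)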
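
Outside Excel, the t statistics and two-tailed p-values can be approximated with the standard library alone. The sketch below is not Excel's TDIST implementation: it integrates the Student's t density numerically with Simpson's rule, and the helper names `t_pdf` and `two_tailed_p` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3      # P(0 < T < |t|)
    return 1 - 2 * central_area       # area in both tails


N = 12
DF = N - 2  # degrees of freedom for a simple regression

# (coefficient, standard error) pairs from the regression output.
rows = {
    "Intercept": (371.6758379, 128.5571439),
    "X Variable 1": (0.019381393, 0.00332155),
}

results = {}
for name, (coef, se) in rows.items():
    t_stat = coef / se
    results[name] = (t_stat, two_tailed_p(t_stat, DF))
    print(name, round(t_stat, 4), round(results[name][1], 6))
```

The printed values should agree with the t Stat and P-value columns of the regression output to several decimal places.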

9. What is heteroscedasticity? Draw an appropriate plot and comment on the
presence (or not) of heteroscedasticity.

Heteroscedasticity is when the variance of the error variable is not constant.

[Figure: Residuals Plot. Residuals (Y-axis, -150 to 150) plotted against Income (X-axis, 0 to 60,000).]

The spread of the plotted residuals appears to widen as income increases, so the variance of
the error variable is not constant; therefore heteroscedasticity is present.

10. What is autocorrelation? Draw an appropriate plot and comment on the presence
(or not) of autocorrelation.

Autocorrelation is present when consecutive observations (error terms) are dependent on one another.

[Figure: Residuals Plot. Residuals (Y-axis, -150 to 150) plotted in observation order (X-axis, 1 to 12).]

Because there is no clear zig-zag pattern, it is difficult to assess from the plot whether
autocorrelation is present or not.
However, the graph shows an increase over time, which suggests that an external variable may be
affecting the dataset, as the residuals should be stationary.

11. Find the Durbin-Watson value and comment on the presence or absence of
autocorrelation.

Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .879 (a)   .773       .750                78.16735                     1.119
a. Predictors: (Constant), VAR00002
b. Dependent Variable: VAR00001

The Durbin-Watson value is 1.119, which is less than 1.5; therefore positive autocorrelation is present.
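
The Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. Because that ratio does not change when all residuals are rescaled, it can be sketched from the standard residuals listed under question 12, assuming they are given in observation order. The function name is illustrative.

```python
# Standard residuals from the outlier table (question 12),
# assumed here to be listed in observation order.
STANDARD_RESIDUALS = [
    -1.514441286, -1.519469601, -0.436767926, -0.558442637,
    0.07870575, 1.328092448, -0.66168675, 0.566937659,
    0.497007431, 0.719091741, -0.108353902, 1.609327072,
]


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


dw = durbin_watson(STANDARD_RESIDUALS)
print(round(dw, 3))
```

A value near 2 would indicate no first-order autocorrelation; values well below 2 point toward positive autocorrelation.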


12. Are there any outliers? Give reasons for your answer.

There are no outliers, as none of the Standard Residuals falls outside the range
of -2 to 2.

Standard Residuals   Outlier
-1.514441286         no
-1.519469601         no
-0.436767926         no
-0.558442637         no
0.07870575           no
1.328092448          no
-0.66168675          no
0.566937659          no
0.497007431          no
0.719091741          no
-0.108353902         no
1.609327072          no
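
The outlier rule used above can be sketched as a simple filter. The cutoff of 2 is the one applied in this memo; the variable names are illustrative.

```python
# Standard residuals from the table above.
STANDARD_RESIDUALS = [
    -1.514441286, -1.519469601, -0.436767926, -0.558442637,
    0.07870575, 1.328092448, -0.66168675, 0.566937659,
    0.497007431, 0.719091741, -0.108353902, 1.609327072,
]
CUTOFF = 2.0  # flag any residual beyond +/- 2

# Keep the 1-based observation number of every residual outside the range.
outliers = [
    (index, value)
    for index, value in enumerate(STANDARD_RESIDUALS, start=1)
    if abs(value) > CUTOFF
]
largest = max(STANDARD_RESIDUALS, key=abs)

print("outliers:", outliers)
print("largest absolute residual:", largest)
```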

Tests of Normality
                               Kolmogorov-Smirnov (a)      Shapiro-Wilk
                               Statistic   df   Sig.       Statistic   df   Sig.
Standardized Predicted Value   .253        12   .033       .898        12   .151
a. Lilliefors Significance Correction

Kolmogorov-Smirnov test (Sig. = .033, below .05): residuals are not normal.
Shapiro-Wilk test (Sig. = .151, above .05): residuals are normal.
The two tests disagree, so the result is inconclusive.

15. Comment on whether your regression model is acceptable (conditions listed
on slide 22)

Heteroscedasticity is present, autocorrelation is present, and the residuals might not be normal.

Although R2 is high and the F-test is significant, this model is not acceptable.
