
Business Econometrics using SAS Tools (BEST)

Class XIII and XIV Regression and Review

Home Price and Size Data


Size in thousands of square feet (000 SF); Price in thousands of US dollars (000 USD)
Cross-section data for 59 cities across the US
Question: does price depend on size? If yes, can we predict how?

Import the Data


PROC IMPORT DATAFILE = 'c:\sasdata\sizeprice.xls'
     DBMS = EXCEL
     OUT = homeprice REPLACE;
RUN;

PROC PRINT DATA = homeprice;
     TITLE 'Home Prices';
RUN;

Plot
Graph the variables first.
1st step: is there a linear relationship? If not, OLS regression is not useful.
Remember, that is why we print the data if the data set is small, or summarize it if it is large.
Get a feel for the data set before crunching the numbers.
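As a minimal sketch of the "summarize if large" step, assuming the data set is named homeprice and holds numeric variables Size and Price (the names used in the import step), PROC MEANS and PROC CORR give a quick numeric feel for the data before plotting. The choice of statistics is illustrative, not something the slides prescribe.

```sas
/* Summary statistics for both variables */
PROC MEANS DATA=homeprice N MEAN STD MIN MAX;
     VAR Size Price;
RUN;

/* Pearson correlation: a first numeric check of linear association */
PROC CORR DATA=homeprice;
     VAR Price Size;
RUN;
```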

SAS Scatter Plot


PROC GPLOT DATA=homeprice;
     PLOT Price*Size;
RUN;

Relation
A clear linear relation is observed.
Ocular estimation gives the go-ahead.
We know we can run a regression.
Use PROC REG.

Regression and Plot


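PROC REG DATA=HomePrice;
     /* clb: confidence limits for the parameter estimates */
     MODEL Price=Size / clb;
     /* observed values (+) and predicted values (*) on one plot */
     PLOT Price*Size='+' p.*Size='*' / overlay;
     /* save the predicted values as PRED in data set NEW */
     OUTPUT OUT=NEW P=PRED;
RUN;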

Plot with Regression Line
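The original figure is not reproduced here. As one possible way to draw a comparable plot, and assuming a SAS release that includes PROC SGPLOT (this is not the procedure used in these slides), the REG statement draws the scatter with the fitted least-squares line, and its CLM and CLI options add limits for the mean and for individual values:

```sas
PROC SGPLOT DATA=homeprice;
     /* scatter of Price against Size with the fitted line and both sets of limits */
     REG X=Size Y=Price / CLM CLI;
RUN;
```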

ANOVA
Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1             71534          71534     184.62    <.0001
Error              56             21698      387.46904
Corrected Total    57             93232

Root MSE           19.68423    R-Square    0.7673
Dependent Mean    111.03445    Adj R-Sq    0.7631
Coeff Var          17.72804
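The summary statistics can be checked by hand from the ANOVA sums of squares. The worked arithmetic below uses only the figures reported in the output (with n = 58, since the corrected total has 57 degrees of freedom); small differences in the last digits come from the rounded sums of squares.

```latex
\begin{align*}
R^2 &= \frac{SS_{\text{Model}}}{SS_{\text{Total}}} = \frac{71534}{93232} \approx 0.7673 \\
MS_{\text{Error}} &= \frac{SS_{\text{Error}}}{DF_{\text{Error}}} = \frac{21698}{56} \approx 387.46 \\
F &= \frac{MS_{\text{Model}}}{MS_{\text{Error}}} = \frac{71534}{387.46904} \approx 184.62 \\
\text{Root MSE} &= \sqrt{387.46904} \approx 19.684 \\
\text{Coeff Var} &= 100 \times \frac{19.68423}{111.03445} \approx 17.73 \\
\text{Adj } R^2 &= 1 - (1 - R^2)\,\frac{n-1}{n-2} = 1 - 0.2327 \times \frac{57}{56} \approx 0.7631
\end{align*}
```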

Regression Estimates
Parameter Estimates

Variable     Label        DF    Parameter Estimate    Standard Error    t Value    Pr > |t|    95% Confidence Limits
Intercept    Intercept     1               5.43157           8.19061       0.66      0.5100    -10.97613    21.83939
Size         Size          1              56.08328           4.12758      13.59      <.0001     47.81473    64.35183
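Read as an equation, the estimates give the fitted line below. The t value and confidence limits for Size can be reproduced from the estimate and its standard error; the critical value t(0.975, 56) of about 2.003 is a table value assumed here, not something printed in the output.

```latex
\begin{align*}
\widehat{\text{Price}} &= 5.43157 + 56.08328 \times \text{Size} \\
t_{\text{Size}} &= \frac{56.08328}{4.12758} \approx 13.59, \qquad t_{\text{Size}}^2 \approx 184.62 = F \\
\text{95\% limits for the slope} &= 56.08328 \pm 2.003 \times 4.12758 \approx 56.08 \pm 8.27 = (47.81,\ 64.35)
\end{align*}
```

Since Size is in thousands of square feet and Price in thousands of dollars, the slope says that an extra 1,000 square feet is associated with a price about 56 thousand dollars higher. The intercept is not significantly different from zero (p = 0.51).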

Inferences and Prediction


We want to:
estimate the mean selling price for houses of size 1750 sq. feet
predict the selling price of a new house of size 1750 sq. feet

Prediction
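/* New observation: Size = 1.750 (i.e. 1750 sq. feet), Price missing (.) */
DATA Xvalue;
     INPUT size price;
     CARDS;
1.750 .
;
RUN;

/* Append the new row; it is not used in the fit but still gets a predicted value */
DATA homeprice;
     SET homeprice Xvalue;
RUN;

PROC REG DATA=homeprice;
     MODEL Price=Size / clm cli;
RUN;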
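Both tasks share the same point estimate, which can be worked out by hand from the parameter estimates reported earlier. The interval formulas that follow are the standard simple-regression forms, with s the Root MSE, n = 58, and x_0 = 1.75; the sample mean of Size and its sum of squared deviations are not shown in the slides, so the intervals are left symbolic.

```latex
\begin{align*}
\hat{y}_0 &= 5.43157 + 56.08328 \times 1.75 = 5.43157 + 98.14574 = 103.57731 \\
\text{mean of } Y \text{ at } x_0 &: \quad \hat{y}_0 \pm t_{0.975,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}} \\
\text{individual } Y \text{ at } x_0 &: \quad \hat{y}_0 \pm t_{0.975,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}
\end{align*}
```

The point estimate is about 103.58 thousand dollars in both cases. The interval for an individual house is always wider than the interval for the mean because of the extra 1 under the square root.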

New Statements
The option clm gives confidence limits for the mean of Y, and cli gives prediction limits for an individual Y, at each value of X (the sizes) in the data set.
The CARDS statement lets you quickly enter additional lines of data inside your program; use it when you are adding a small amount of data.
If we needed the same predictions for a large data set, we would instead read the new values from a file, using INFILE with an INPUT statement or PROC IMPORT.
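As a sketch of how to pull out only the row of interest, assuming the observation with Size = 1.750 and missing Price has already been appended to homeprice as in the Prediction step, the OUTPUT statement can save the predicted value and both sets of limits to a data set. The data set name pred_out and the variable names pred, mean_lo, mean_hi, indiv_lo and indiv_hi are illustrative, not from the slides.

```sas
PROC REG DATA=homeprice;
     MODEL Price=Size;
     /* P = predicted value; LCLM/UCLM = limits for the mean;
        LCL/UCL = limits for an individual prediction */
     OUTPUT OUT=pred_out P=pred
            LCLM=mean_lo UCLM=mean_hi
            LCL=indiv_lo UCL=indiv_hi;
RUN;
QUIT;

/* Show only the appended row, identified by its missing Price */
PROC PRINT DATA=pred_out;
     WHERE Price = .;
     VAR Size pred mean_lo mean_hi indiv_lo indiv_hi;
RUN;
```

This avoids scanning the full clm/cli listing for the one new observation.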

Review
In the 2nd half of the class we will review the entire semester. If there are any questions, please let me know. If not, I will walk you through the salient points of the course so that you get no surprises on this material in the future.
