Regression Analysis

By:
Vaibhav Sahu

Regression

What
Why
How
Example Using R

What

Y = f(X)
Y is dependent on X
Y: Dependent Variable, Response Variable, Effect
X: Independent or Predictor Variable, Cause
Regression is about establishing a valid relationship (a mathematical equation or model) between dependent and independent variables
Example: the relationship between the demand for a product and its price (price elasticity)

Why
Prediction, forecasting
Understand the population response based on a sample
Time series analysis (trend analysis) to forecast future values
Cause-effect relation (though a regression relationship does not necessarily imply one)
Applied in various domains such as economics, management, social sciences, etc.

Classification
Linear Regression
Y = 2+3*x
Nonlinear Regression
Y = 2+3*x+x^2 (nonlinear in the variable x)

Linear Regression
Classification
Univariate: One dependent variable (Y)
Y = a+bX

Multivariate: Two or more dependent variables

Simple: Only one independent variable (X)
Y = a+bX

Multiple: Two or more independent variables (X1, X2, X3, ...)
Y = a+b1X1+b2X2+b3X3

Logistic Regression: When the dependent variable is qualitative (Yes/No, M/F, etc.)

Linearity
Linearity in Variables
Y = a+bX
Y = a + bX + cX^2 + d*e^X
(Nonlinear in variables but linear in parameters)

Linearity in Parameters
Y = a+bX
Y = a + bX + cX^2 + d*e^X
Y = a + b^2*X
(Nonlinear in parameters but linear in variables)
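
The distinction between the two kinds of linearity can be checked numerically. The sketch below is only an illustration: the function names and the parameter values are made up for this example, and the additivity test it uses is a simple necessary condition for linearity in the parameters, not a full proof.

```python
import math

def model_a(params, x):
    """Y = a + b*X + c*X^2 + d*e^X (nonlinear in X, linear in a, b, c, d)."""
    a, b, c, d = params
    return a + b * x + c * x**2 + d * math.exp(x)

def model_b(params, x):
    """Y = a + b^2*X (linear in X, nonlinear in the parameter b)."""
    a, b = params
    return a + b**2 * x

def is_additive_in_params(model, p, q, x, tol=1e-9):
    """Check model(p + q, x) == model(p, x) + model(q, x) at a fixed x."""
    summed = [pi + qi for pi, qi in zip(p, q)]
    return abs(model(summed, x) - (model(p, x) + model(q, x))) < tol

# Arbitrary illustrative parameter vectors, evaluated at one fixed X.
linear_in_params_a = is_additive_in_params(
    model_a, (1.0, 2.0, 0.5, 0.1), (0.3, -1.0, 2.0, 0.7), x=1.5
)
linear_in_params_b = is_additive_in_params(
    model_b, (1.0, 2.0), (0.3, -1.0), x=1.5
)
print(linear_in_params_a, linear_in_params_b)
```

The first model passes the check even though it contains X^2 and e^X, while the second fails because b enters as b^2.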

Population Regression Function
Passes through the conditional means E(Y|Xi)
E(Y|Xi) = f(Xi)
Considering linearity:
$E(Y \mid X_i) = \beta_1 + \beta_2 X_i$
$Y_i = \beta_1 + \beta_2 X_i + u_i$
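Population Regression Function (continued)
[Figure: the line Y = f(X) plotted against X, with labels for the intercept, the slope, the predicted value of Y for Xi, the random error ui for that x value, and an individual person's marks as an observed point]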
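
To make the idea of a line passing through the conditional means concrete, here is a small simulation sketch. The population values (intercept 20, slope 5, error spread 4) and the helper name `draw_y` are assumptions chosen purely for illustration; they do not come from the slides.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

BETA1, BETA2 = 20.0, 5.0   # hypothetical population intercept and slope
SIGMA = 4.0                # hypothetical spread of the error term u_i

def draw_y(x):
    """One observation from the PRF: Y_i = beta1 + beta2 * X_i + u_i."""
    u = random.gauss(0.0, SIGMA)
    return BETA1 + BETA2 * x + u

# For each fixed X, average many draws of Y to approximate E(Y | X).
conditional_means = {}
for x in (2, 4, 6, 8):
    ys = [draw_y(x) for _ in range(20000)]
    conditional_means[x] = mean(ys)

# Print X, the simulated conditional mean, and the value on the PRF line.
for x, m in conditional_means.items():
    print(x, round(m, 2), BETA1 + BETA2 * x)
```

Individual draws scatter around the line because of u_i, but the average of Y at each X lands close to beta1 + beta2 * X.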

Sample Regression Function


We do not have the whole population; instead we have random samples
For a sample from the population, an SRF is created
$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$
For many such samples, many SRFs can be created
$\hat{u}_i = Y_i - \hat{Y}_i$
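Sample Regression Function (continued)
$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$
[Figure: the fitted line plotted against X, with labels for the observed value of Y for Xi, the random error for that x value, the predicted value of Y for Xi, the intercept, and the slope]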
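
The point that each sample yields its own SRF can be sketched with a simulation. Everything here is hypothetical: the population values, the sample size of 30, and the helper names `draw_sample` and `fit_line`. The fit uses the standard least-squares formulas, which the deck covers under Ordinary Least Squares.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

BETA1, BETA2, SIGMA = 20.0, 5.0, 4.0  # hypothetical population values

def draw_sample(n):
    """Draw n (X, Y) pairs from the assumed population."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [BETA1 + BETA2 * x + random.gauss(0, SIGMA) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Least-squares intercept and slope for one sample."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Five independent samples give five different SRFs.
srfs = [fit_line(*draw_sample(30)) for _ in range(5)]
for b1_hat, b2_hat in srfs:
    print(f"Y_hat = {b1_hat:.2f} + {b2_hat:.2f} * X")
```

The estimated intercepts and slopes differ from sample to sample while staying in the neighbourhood of the population values.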

Our Objective
So our objective is to estimate the PRF
$Y_i = \beta_1 + \beta_2 X_i + u_i$
on the basis of the SRF
$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$
We have to construct an SRF that mirrors the PRF as faithfully as possible, i.e. the best approximation

Ordinary Least Squares

From our SRF equation we have:
$\hat{u}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$
$\hat{u}_i$ is the residual, the sample estimate of the stochastic error term $u_i$
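We construct the SRF so that it is as close as possible to the observed Y, which means
Min $\sum \hat{u}_i = \sum (Y_i - \hat{Y}_i)$
But positive and negative residuals cancel in this sum, so instead we do
Min $\sum \hat{u}_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$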
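
As a sketch of the intermediate step, the minimum is found by setting the partial derivatives of the sum of squared residuals to zero, which gives two equations usually called the normal equations. The symbol S and the sample size n are labels introduced here for convenience.

```latex
\begin{align}
S(\hat{\beta}_1, \hat{\beta}_2) &= \sum_{i=1}^{n} \left( Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i \right)^2 \\
\frac{\partial S}{\partial \hat{\beta}_1} &= -2 \sum_{i=1}^{n} \left( Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i \right) = 0 \\
\frac{\partial S}{\partial \hat{\beta}_2} &= -2 \sum_{i=1}^{n} X_i \left( Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i \right) = 0 \\
\sum_{i=1}^{n} Y_i &= n \hat{\beta}_1 + \hat{\beta}_2 \sum_{i=1}^{n} X_i \\
\sum_{i=1}^{n} X_i Y_i &= \hat{\beta}_1 \sum_{i=1}^{n} X_i + \hat{\beta}_2 \sum_{i=1}^{n} X_i^2
\end{align}
```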

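On solving these equations for the parameters:
$\hat{\beta}_2 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$
and
$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$
$\hat{\beta}_1$ and $\hat{\beta}_2$ are called the least squares estimators of the population parameters $\beta_1$ and $\beta_2$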
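
The two estimator formulas translate directly into code. The following is a minimal sketch in Python rather than the deck's R script; the function name `ols_fit` and the five data points are invented for illustration.

```python
from statistics import mean

def ols_fit(xs, ys):
    """Return (beta1_hat, beta2_hat) from the least-squares formulas."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must take at least two distinct values")
    beta2_hat = sxy / sxx                  # slope
    beta1_hat = y_bar - beta2_hat * x_bar  # intercept
    return beta1_hat, beta2_hat

# Made-up illustrative data, not taken from the slides.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]

beta1_hat, beta2_hat = ols_fit(xs, ys)
residuals = [y - (beta1_hat + beta2_hat * x) for x, y in zip(xs, ys)]

print("intercept:", round(beta1_hat, 4), "slope:", round(beta2_hat, 4))
# At the least-squares solution both sums below are zero up to rounding.
print("sum of residuals:", round(sum(residuals), 10))
print("sum of X * residual:", round(sum(x * r for x, r in zip(xs, residuals)), 10))
```

The last two printed sums are a quick sanity check: a least-squares fit with an intercept leaves residuals that sum to zero and are uncorrelated with X.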

Example Using R
Regression.R

Residual Plots
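
The R script and its plots are not reproduced in this text. As a stand-in sketch in Python, the example below fits a straight line to points generated from the deck's earlier nonlinear example, Y = 2+3*x+x^2, and lists the residuals against the fitted values. The choice of x = 0..6 and the helper name `fit_line` are assumptions made for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept and slope."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Data from the curve Y = 2 + 3x + x^2, deliberately fitted with a straight line.
xs = list(range(7))
ys = [2 + 3 * x + x**2 for x in xs]

intercept, slope = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# A text version of a residual plot: fitted value, residual, and its sign.
for f, r in zip(fitted, residuals):
    sign = "+" if r > 0 else "-" if r < 0 else "0"
    print(f"{f:7.2f} {r:7.2f} {sign}")
```

The residuals are positive at both ends and negative in the middle. A systematic pattern like this in a residual plot suggests the straight-line form is missing curvature, whereas residuals from an adequate model should scatter around zero without structure.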
