Date: 2022-05-02 04:20

Multiple Linear Regression Models - Part 2

Residual Diagnostics, Unusual observations

STAT3022

Applied linear models

Regression Diagnostics

Background

Recall the MLR model

$$y = X\beta + \varepsilon, \quad E(y) = X\beta, \quad \mathrm{Var}(y) = \mathrm{Var}(\varepsilon) = \sigma^2 I_n.$$

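Assuming the design matrix $X$ has full rank, the OLS estimator is $\hat{\beta} = (X^\top X)^{-1} X^\top y$, the fitted values are $\hat{y} = Hy$ with hat matrix $H = X(X^\top X)^{-1} X^\top$, and the residual vector is $e = y - \hat{y} = (I_n - H)y$.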

Similar to model diagnostics for SLR, diagnostics for MLR are based on the residuals, which depend critically on the hat matrix $H$.

First, $H$ is symmetric, i.e. $H^\top = H$. As a result, the matrix $I_n - H$ is also symmetric.

Next, $HX = X$. As a result, $(I_n - H)X = X - X = 0$.

Third, $H^2 = H$, so we say $H$ is idempotent. As a result, the matrix $I_n - H$ is also idempotent.

Finally, as proved in Tutorial 4, $\mathrm{trace}(H) = \sum_{i=1}^{n} h_{ii} = p$.
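The hat-matrix properties listed above can be checked numerically. What follows is a minimal pure-Python sketch, not part of the original slides: it assumes the standard definition $H = X(X^\top X)^{-1}X^\top$, uses a hypothetical 5 x 3 design matrix, and the helper names (`transpose`, `matmul`, `inverse`, `hat_matrix`, `max_abs_diff`) are illustrative only.

```python
# Numerical check of the hat-matrix properties on a hypothetical toy design.
# All data and helper names here are assumptions made for illustration.

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [v / pv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def hat_matrix(x):
    """H = X (X^T X)^{-1} X^T for a full-rank design matrix X."""
    xt = transpose(x)
    return matmul(matmul(x, inverse(matmul(xt, x))), xt)

def max_abs_diff(a, b):
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))

# Toy design: intercept column plus two predictors (n = 5, p = 3).
X = [[1, 1, 2],
     [1, 2, 1],
     [1, 3, 5],
     [1, 4, 3],
     [1, 5, 4]]
H = hat_matrix(X)
n, p = len(X), len(X[0])

sym_err = max_abs_diff(H, transpose(H))     # H^T = H
idem_err = max_abs_diff(matmul(H, H), H)    # H^2 = H
hx_err = max_abs_diff(matmul(H, X), X)      # HX = X
trace_h = sum(H[i][i] for i in range(n))    # trace(H) = p

print("symmetry error:", sym_err)
print("idempotency error:", idem_err)
print("HX - X error:", hx_err)
print("trace(H):", trace_h, "p:", p)
```

The three error terms should be zero up to floating-point rounding, and the trace should match the number of columns of the design matrix.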

Residual vector

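• First, let’s compute the expectation of the residual vector $e = (I_n - H)y$:

$$E(e) = E\{(I_n - H)y\} = (I_n - H)E(y) = (I_n - H)X\beta = 0.$$

• Second, let’s compute the variance-covariance matrix:

$$\mathrm{Var}(e) = \mathrm{Var}\{(I_n - H)y\} = (I_n - H)\,\mathrm{Var}(y)\,(I_n - H)^\top = (I_n - H)\,\sigma^2 I_n\,(I_n - H) = \sigma^2 (I_n - H)(I_n - H) = \sigma^2 (I_n - H),$$

i.e. $\mathrm{Var}(e_i) = \sigma^2 (1 - h_{ii})$ and $\mathrm{Cov}(e_i, e_j) = -\sigma^2 h_{ij}$ for $i \neq j$.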

These computations tell us that (1) each residual $e_i$ has a smaller variance than the true error $\varepsilon_i$ whenever $h_{ii} > 0$, and (2) the residuals are correlated whenever $h_{ij} \neq 0$, even though the true errors are not.
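To make the two formulas $\mathrm{Var}(e_i) = \sigma^2(1 - h_{ii})$ and $\mathrm{Cov}(e_i, e_j) = -\sigma^2 h_{ij}$ concrete, here is a small sketch for the one-predictor special case (intercept plus a single $x$), where the hat-matrix entries have the closed form $h_{ij} = 1/n + (x_i - \bar{x})(x_j - \bar{x})/S_{xx}$. The predictor values and the value of $\sigma^2$ are hypothetical.

```python
# Residual variances and covariances in the one-predictor special case.
# The x values and sigma2 are made up for illustration.

x = [1.0, 2.0, 3.0, 4.0, 10.0]   # hypothetical predictor values; the last is far from the mean
sigma2 = 4.0                      # hypothetical error variance

n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

def h(i, j):
    """Entry (i, j) of the hat matrix for simple linear regression with intercept."""
    return 1.0 / n + (x[i] - xbar) * (x[j] - xbar) / sxx

leverages = [h(i, i) for i in range(n)]
resid_var = [sigma2 * (1.0 - hii) for hii in leverages]   # Var(e_i)
cov_first_last = -sigma2 * h(0, n - 1)                     # Cov(e_1, e_n)

for xi, hii, v in zip(x, leverages, resid_var):
    print(f"x = {xi:5.1f}   h_ii = {hii:.3f}   Var(e_i) = {v:.3f}")
print("Cov(e_1, e_n) =", round(cov_first_last, 3))
print("sum of leverages =", round(sum(leverages), 6))
```

In this toy example, the observation far from $\bar{x}$ has the largest leverage and therefore the smallest residual variance, and the leverages sum to $p = 2$.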

Residuals plots

We can use residual plots similar to those used in simple linear regression for model diagnostics. Specifically,

• To check the constant variance assumption: use the plot of the residuals $e_i$ vs. the fitted values $\hat{y}_i$, or the plots of the residuals vs. each covariate. No news is good news: a plot with no visible pattern is consistent with the assumption.

• To check the normality assumption: use a normal quantile-quantile plot or a normality test.
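As a rough illustration of the normality check, the sketch below computes the points of a normal quantile-quantile plot using only the Python standard library. The residual values are hypothetical, the plotting positions $(i - 0.5)/n$ are one common convention rather than the only one, and the function names are illustrative.

```python
from statistics import NormalDist

def normal_qq_pairs(residuals):
    """Return (theoretical quantile, ordered residual) pairs for a normal Q-Q plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n: an assumed, commonly used convention.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def correlation(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

# Hypothetical residuals from some fitted model.
residuals = [-1.2, 0.4, 0.1, -0.3, 0.9, -0.6, 0.2, 0.5]

pairs = normal_qq_pairs(residuals)
theo, obs = zip(*pairs)
qq_corr = correlation(theo, obs)

for q, r in pairs:
    print(f"theoretical = {q:6.3f}   residual = {r:5.2f}")
print("Q-Q correlation:", round(qq_corr, 3))
```

If the points fall close to a straight line (equivalently, if the Q-Q correlation is close to 1), the residuals are roughly consistent with normality. This is an informal check and not a substitute for a formal normality test.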

[Figure omitted: a residual plot with reasonably constant variance]

Multicollinearity

A distinct characteristic of MLR compared to SLR is that it has more than one predictor. As such, the intercorrelation among predictors plays an important role in the estimated coefficients as well as in inference for the MLR model.

Such intercorrelation is known as multicollinearity (multi:

many; collinear: linear dependence).

We will study three cases:

1. When all predictors are uncorrelated.

2. When some predictors are perfectly correlated.

3. When predictors are correlated, but not perfectly.

In this section, we denote by $r_{jk}$ the sample correlation between two predictors $X_j$ and $X_k$.

Uncorrelated predictors

Consider the models

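$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i \quad (1)$$

$$y_i = \beta_0 + \beta_1 x_{i1} + \varepsilon_i \quad (2)$$

$$y_i = \beta_0 + \beta_2 x_{i2} + \varepsilon_i \quad (3)$$

If $r_{12} = 0$ (i.e. $X_1$ and $X_2$ are uncorrelated), then

• The OLS estimates for $\beta_1$ in model (1) and model (2) are exactly the same.

• The OLS estimates for $\beta_2$ in model (1) and model (3) are exactly the same.

• $\mathrm{SSR}(X_1, X_2) = \mathrm{SSR}(X_1) + \mathrm{SSR}(X_2)$.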
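The three claims above can be verified on a small balanced design. The sketch below uses hypothetical data (the responses are made up and are not taken from Kutner et al.) in which $X_1$ and $X_2$ are exactly uncorrelated. It fits the slopes of model (1) by solving the 2 x 2 centered normal equations with Cramer's rule and compares them with the simple-regression slopes of models (2) and (3).

```python
# Uncorrelated predictors: MLR slopes equal SLR slopes, and SSR is additive.
# Hypothetical balanced design; the responses are made up for illustration.
x1 = [4, 4, 4, 4, 6, 6, 6, 6]
x2 = [2, 2, 3, 3, 2, 2, 3, 3]
y = [10, 12, 15, 17, 20, 21, 26, 27]

def centered(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

c1, c2, cy = centered(x1), centered(x2), centered(y)
s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
s1y, s2y = dot(c1, cy), dot(c2, cy)

r12 = s12 / (s11 * s22) ** 0.5   # sample correlation between x1 and x2

# Slopes from the simple regressions, models (2) and (3).
b1_slr = s1y / s11
b2_slr = s2y / s22

# Slopes from model (1): solve the centered normal equations by Cramer's rule.
det = s11 * s22 - s12 ** 2
b1_mlr = (s1y * s22 - s12 * s2y) / det
b2_mlr = (s11 * s2y - s12 * s1y) / det

# Regression sums of squares.
ssr_x1 = b1_slr * s1y
ssr_x2 = b2_slr * s2y
ssr_both = b1_mlr * s1y + b2_mlr * s2y

print("r12 =", r12)
print("b1: MLR", b1_mlr, "SLR", b1_slr)
print("b2: MLR", b2_mlr, "SLR", b2_slr)
print("SSR(X1, X2) =", ssr_both, " SSR(X1) + SSR(X2) =", ssr_x1 + ssr_x2)
```

Changing one value of `x2` so that the design is no longer balanced makes `r12` nonzero, and the equalities then generally fail.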

An example: Kutner et al. (Table 7.6)

Example: effect of work crew size (X1) and level of bonus pay

(X2) on crew productivity (Y). X1 and X2 are uncorrelated.


Uncorrelated predictors

In general, if all $p - 1$ predictors are mutually uncorrelated:

• The effect of one predictor on the response does not depend on whether the other predictors are in the model.

• Hence, we can get the effect of one predictor $X_j$ on the response $Y$ just by fitting the SLR of $Y$ on $X_j$.

We do not go into the math of this conclusion, but intuitively,

when all the predictors are uncorrelated, they have “separate”

effects on the response.

You will see this case again when we talk about experimental

designs.
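For readers who want a brief justification, here is a short sketch of the argument. It is not from the original slides and assumes the model contains an intercept; the notation $\tilde{X}$ (the $n \times (p-1)$ matrix of mean-centered predictors), $S_{jk}$, and $S_{jy}$ is introduced here for illustration only.

```latex
% Centered cross-products (notation assumed for this sketch):
\[
S_{jk} = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k), \qquad
S_{jy} = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(y_i - \bar{y}).
\]
% With an intercept in the model, the OLS slopes b = (b_1, ..., b_{p-1})
% solve the centered normal equations:
\[
\tilde{X}^\top \tilde{X}\, b = \tilde{X}^\top y,
\qquad (\tilde{X}^\top \tilde{X})_{jk} = S_{jk}, \quad (\tilde{X}^\top y)_j = S_{jy}.
\]
% If r_{jk} = 0 for all j \neq k, then S_{jk} = 0 off the diagonal, so
\[
\tilde{X}^\top \tilde{X} = \mathrm{diag}(S_{11}, \dots, S_{p-1,\,p-1})
\quad\Longrightarrow\quad
b_j = \frac{S_{jy}}{S_{jj}},
\]
% which is exactly the slope from the simple linear regression of Y on X_j.
```

The diagonal structure is what makes the effects "separate": each slope depends only on its own predictor and the response.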

Perfectly correlated predictors

The second (extreme) case is when one or more predictors are perfectly correlated with one another.

Essentially, this means that one predictor can be written as a linear combination of some other predictor variables. In this case, the design matrix $X$ does not have full rank, i.e. $\mathrm{rank}(X) < p$.

Recall the normal equations for OLS,

$$X^\top X b = X^\top y,$$

and that $\mathrm{rank}(X^\top X) = \mathrm{rank}(X)$. Hence, in this case, the matrix $X^\top X$ also does not have full rank, and there are infinitely many solutions for $b$.

Although there are infinitely many solutions for $b$, all solutions give the same fitted values (and residuals).

Therefore, while the individual coefficients in $b$ have no interpretation, the model can still provide a good fit for the data.
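A tiny numerical illustration of this point, under the assumption of a hypothetical data set in which $X_2 = 2X_1$ exactly: the centered cross-product matrix is singular, and coefficient vectors that differ by a multiple of $(0, 2, -1)$ produce identical fitted values. The starting coefficients below are arbitrary and chosen only for illustration.

```python
# Perfectly correlated predictors: singular cross-product matrix,
# and many coefficient vectors with identical fitted values.
# The data and the starting coefficients are hypothetical.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0 * v for v in x1]          # x2 is an exact multiple of x1

def centered(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

c1, c2 = centered(x1), centered(x2)
s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
det = s11 * s22 - s12 ** 2          # determinant of the centered cross-product matrix

def fitted(b0, b1, b2):
    return [b0 + b1 * a + b2 * c for a, c in zip(x1, x2)]

# Shift (b1, b2) along the direction (2, -1); since x2 = 2 * x1,
# the combination b1 * x1 + b2 * x2 is unchanged.
b0, b1, b2 = 1.0, 3.0, 0.5
fits = [fitted(b0, b1 + 2.0 * c, b2 - c) for c in (0.0, 1.0, -2.5)]

print("determinant:", det)
for f in fits:
    print(f)
```

So if one coefficient vector solves the normal equations, every vector in this family does too, which is why the individual coefficients cannot be interpreted.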

Highly correlated predictors

Although the two cases above are extreme, in practice it is very common to find that many predictors are highly correlated. After all, highly correlated variables are often an inherent characteristic of the population of interest.

Example: in a regression of food expenditures on income, savings, age of head of household, educational level, etc., all the predictors are correlated with one another.

Mathematically, although the design matrix $X$ and the matrix $X^\top X$ still have full rank, the inversion $V = (X^\top X)^{-1}$ becomes unstable.

Recall that $\mathrm{Var}(\hat{\beta}) = \sigma^2 V$, so multicollinearity inflates the variance of the OLS estimator.
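To see how quickly the variance grows, consider the two-predictor case with an intercept. The slope block of $V$ is the inverse of the centered cross-product matrix, and its first diagonal entry equals $1 / \{S_{11}(1 - r_{12}^2)\}$. The sketch below evaluates this for several hypothetical values of $r_{12}$, assuming both predictors are scaled so that $S_{11} = S_{22} = 1$; the function name is illustrative. The factor $1/(1 - r_{12}^2)$ is commonly called the variance inflation factor.

```python
# Variance inflation for two predictors as their correlation approaches 1.
# Assumes predictors scaled so that S11 = S22 = 1 (an assumption of this sketch).

def slope_variance_factor(r, s11=1.0, s22=1.0):
    """First diagonal entry of the inverse of [[s11, s12], [s12, s22]],
    where s12 = r * sqrt(s11 * s22). Var(b1) is sigma^2 times this value."""
    s12 = r * (s11 * s22) ** 0.5
    det = s11 * s22 - s12 ** 2
    return s22 / det

correlations = [0.0, 0.5, 0.9, 0.99, 0.999]
factors = [slope_variance_factor(r) for r in correlations]

for r, v in zip(correlations, factors):
    vif = 1.0 / (1.0 - r ** 2)   # closed-form variance inflation factor
    print(f"r12 = {r:5.3f}   Var(b1) / sigma^2 = {v:10.3f}   1 / (1 - r^2) = {vif:10.3f}")
```

With uncorrelated predictors the factor is 1, and it grows without bound as $|r_{12}|$ approaches 1, which is the instability of the inversion described above.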

