Conditions for Inference for Regression-I

Learn about residuals, linearity, and independence in inference for regression.

We'll cover the following

Residuals refresher
Linearity of relationship
Independence of residuals

Earlier, we stated that the standard-error-based method for constructing confidence intervals can only be used if the bootstrap distribution is bell-shaped. Similarly, certain conditions need to be met for the results of our hypothesis tests and confidence intervals to be valid. Without them, the assumed underlying mathematical and probability theory does not hold.

For inference for regression, there are four conditions that need to be met. The first letters of these four conditions spell LINE, which can serve as a nice reminder of what to check for whenever we perform linear regression.

Linearity of the relationship between variables
Independence of the residuals
Normality of the residuals
Equality of variance of the residuals

Conditions L, N, and E can be verified through what’s known as a residual analysis. Condition I can only be verified through an understanding of how the data was collected.

We’ll go over a refresher on residuals, verify whether each of the four LINE conditions holds true, and then discuss the implications.

Residuals refresher

Recall the definition of a residual: the observed value minus the fitted value, denoted by $y - \hat{y}$. A residual can be thought of as the error, or lack of fit, between the observed value $y$ and the fitted value $\hat{y}$ on the regression line. In the figure below, we illustrate one particular residual out of the 463 using an arrow. We’ve also illustrated its corresponding observed and fitted values using a circle and a square, respectively.
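To make the definition $y - \hat{y}$ concrete, here is a minimal sketch in plain Python that fits a least-squares line to a handful of made-up points and computes each residual. The data values and the helper name `fit_line` are assumptions for illustration only; they are not the course's dataset or code.

```python
# Made-up (x, y) pairs for illustration only; not the course's data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(xs, ys)

# Fitted value y_hat: the point on the regression line for each x.
fitted = [intercept + slope * x for x in xs]

# Residual: observed value minus fitted value, y - y_hat.
residuals = [y - y_hat for y, y_hat in zip(ys, fitted)]

for x, y, y_hat, r in zip(xs, ys, fitted, residuals):
    print(f"x={x:.1f}  observed={y:.2f}  fitted={y_hat:.2f}  residual={r:+.2f}")
```

A positive residual means the observed point lies above the regression line, and a negative one means it lies below. For a least-squares line with an intercept, the residuals sum to zero, which is a quick sanity check on the computation.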
