Two Numerical Explanatory Variables
Learn about two numerical explanatory variables in multiple regression.
Let’s now consider multiple regression models where, instead of one numerical and one categorical explanatory variable, we have two numerical explanatory variables. The dataset we’ll use is from the textbook An Introduction to Statistical Learning with Applications in R (James et al., 2017). Its accompanying ISLR R package contains the datasets to which the authors apply various machine-learning methods.
One frequently used dataset in this course is the Credit dataset, where the outcome variable of interest is the credit card debt of 400 individuals. Other variables like income, credit limit, credit rating, and age are included as well. Note that the Credit data isn’t based on real individuals’ financial information, but rather is a simulated dataset used for educational purposes.
In this lesson, we’ll fit a regression model where we have:
A numerical outcome variable y, the cardholder’s credit card debt
Two explanatory variables:
One numerical explanatory variable x1, which is the cardholder’s credit limit
Another numerical explanatory variable x2, which is the cardholder’s income (in thousands of dollars)
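As a rough sketch of where this is heading, and assuming the usual additive form of a multiple regression (the notation b_0, b_1, b_2 for the fitted intercept and slopes is introduced here for illustration and is not defined in the text above), the fitted model would look like:

```latex
% Assumed additive model: one intercept plus one slope per explanatory variable
\hat{y} = b_0 + b_1 x_1 + b_2 x_2
% \hat{y} : fitted credit card debt
% x_1     : credit limit
% x_2     : income (in thousands of dollars)
```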
Exploratory data analysis
Let’s load the Credit dataset. To keep things simple, let’s use select() to keep only the subset of variables we’ll consider in this lesson and save this data in the new data frame credit_ch6. Notice our slightly different use of the select() verb here: it can rename a variable while selecting it. For example, we’ll select the Balance variable from Credit but then save it with the new variable name debt. We do this because here the term “debt” is easier to interpret than “balance.”
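The selection step just described might look roughly like the sketch below. It assumes the ISLR and dplyr packages are installed. Only the names credit_ch6 and debt come from the text; the new names credit_limit and income are illustrative choices, and the exact set of columns kept is an assumption.

```r
library(dplyr)  # provides select(), as_tibble(), glimpse(), and the %>% pipe
library(ISLR)   # provides the simulated Credit dataset

# Keep only the variables used in this lesson, renaming each one
# with the new_name = old_name form of select().
# credit_limit and income are hypothetical names chosen for readability.
credit_ch6 <- Credit %>%
  as_tibble() %>%
  select(debt = Balance, credit_limit = Limit, income = Income)

# Quick look at the resulting columns and their types
glimpse(credit_ch6)
```

If the renaming worked, the first column shown by glimpse() is named debt rather than Balance.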