Introduction and Needed Packages
Learn about some basic regression techniques and how to get started with regression in R.
Now that we’re equipped with data visualization and data wrangling skills, and understand how to import data and what a tidy data format is, let’s proceed with data modeling. The fundamental premise of data modeling is to make explicit the relationship between:
An outcome variable y, also called a dependent variable or response variable
An explanatory/predictor variable x, also called an independent variable or covariate
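Another way to state this is using mathematical terminology. We’ll model the outcome variable y as a function of the explanatory/predictor variable x. By "function," we don't mean a function in R like the ggplot() function, but rather a mathematical function. However, why do we have two different labels, explanatory and predictor, for the variable x? Although the two terms are often used interchangeably, data modeling roughly serves one of two purposes: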
Modeling for explanation: Here we want to explicitly describe and quantify the relationship between the outcome variable y and a set of explanatory variables x. We determine the significance of any relationships, compute measures that summarize them, and possibly identify causal relationships between the variables.
Modeling for prediction: Here we want to predict an outcome variable y based on the information contained in a set of predictor variables x. Unlike modeling for explanation, we don’t care so much about understanding how all the variables relate and interact with one another. Our focus is on whether we can make good predictions about y using the information in x.
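To make the two purposes concrete, here is a minimal sketch in R. It assumes a simple linear regression fit with lm() and uses the built-in mtcars dataset, which ships with base R and is chosen purely for illustration; it is not necessarily the data or the method used later in this course. The names model, new_cars, and predicted_mpg are hypothetical.

```r
# Illustrative assumption: mtcars (built into R) stands in for real course data.
# Model the outcome y (mpg) as a function of the explanatory/predictor x (wt).
model <- lm(mpg ~ wt, data = mtcars)

# Modeling for explanation: inspect the estimated relationship itself.
coef(model)                    # intercept and slope of the fitted line
summary(model)$coefficients    # estimates, standard errors, t values, p-values

# Modeling for prediction: use the fitted model to predict y for new x values.
new_cars <- data.frame(wt = c(2.5, 3.5))   # hypothetical weights (1000 lbs)
predicted_mpg <- predict(model, newdata = new_cars)
predicted_mpg
```

The same fitted object serves both purposes. What differs is which output we look at: the coefficient table when explaining, the predicted values when predicting.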