Simulation-Based Inference for Regression

Learn how to construct a confidence interval for a regression slope using simulation-based inference.

When we interpreted the third through seventh columns of a regression table, we stated that R doesn’t do simulations to compute these values. Rather, R uses theory-based methods that involve mathematical formulas.

We’ll now use simulation-based methods to recreate the values in the regression table. In particular, we’ll use the infer package workflow to:

  • Construct a 95% confidence interval for the population slope $\beta_1$ using bootstrap resampling with replacement. We did this with the pennies data and with the mythbusters_yawn data.

  • Conduct a hypothesis test of $H_0: \beta_1 = 0$ vs. $H_A: \beta_1 \neq 0$ using a permutation test.

Confidence interval for a slope

We’ll construct a 95% confidence interval for $\beta_1$ using the infer workflow. Specifically, we’ll first construct the bootstrap distribution for the fitted slope $b_1$ using our single sample of 463 courses:

  • We specify() the variables of interest in evals_ch5 with the formula score ~ bty_avg.

  • We generate() replicates by using bootstrap resampling with replacement from the original sample of 463 courses. We generate reps = 1000 replicates using type = "bootstrap".

  • We calculate() the summary statistic of interest, which is the fitted slope $b_1$.
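To make the generate() and calculate() steps concrete, the following base R sketch shows what a single bootstrap replicate amounts to conceptually, and then repeats it 1000 times. It is an illustration of the idea rather than a description of how infer is implemented. The data frame toy is made up for this example (it is not evals_ch5), its coefficients are arbitrary, and the helper boot_slope() is a hypothetical name introduced here.

```r
# Made-up illustrative data, NOT the evals_ch5 sample of 463 courses.
# The intercept, slope, and noise level below are arbitrary choices.
set.seed(76)
n <- 50
toy <- data.frame(bty_avg = runif(n, min = 2, max = 8))
toy$score <- 3.9 + 0.07 * toy$bty_avg + rnorm(n, sd = 0.5)

# Hypothetical helper: one bootstrap replicate.
# Whole rows are drawn with replacement, so each (score, bty_avg) pair
# stays together; the regression is then refit on the resampled rows.
boot_slope <- function(data) {
  rows <- sample(nrow(data), size = nrow(data), replace = TRUE)
  fit <- lm(score ~ bty_avg, data = data[rows, , drop = FALSE])
  unname(coef(fit)["bty_avg"])
}

# Repeat 1000 times to approximate the bootstrap distribution of the slope.
boot_slopes <- replicate(1000, boot_slope(toy))
summary(boot_slopes)
```

Each call to boot_slope() returns one fitted slope, so boot_slopes plays the same role as the 1000 replicate statistics produced by the infer workflow.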

Using this bootstrap distribution, we’ll construct the 95% confidence interval using the percentile method and (if appropriate) the standard error method as well. Note that in this case the bootstrap resampling with replacement is done row by row. The original pairs of score and bty_avg values are therefore always kept together, although a given pair may be resampled multiple times. The resulting confidence interval gives a range of plausible values for the unknown population slope $\beta_1$, which quantifies the relationship between teaching and beauty scores for all professors at UT Austin. Let’s first construct the bootstrap distribution for the fitted slope $b_1$.
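With the infer package, the workflow described above might look like the sketch below. It assumes that the dplyr, infer, and moderndive packages are installed and that evals_ch5 is built by selecting a few columns from the evals data in moderndive. The object names bootstrap_distn_slope, observed_slope, percentile_ci, and se_ci are our own choices.

```r
library(dplyr)
library(infer)
library(moderndive)  # assumed source of the evals data

# Assumption: evals_ch5 is the sample of 463 courses, reduced to a few columns.
evals_ch5 <- evals %>%
  select(ID, score, bty_avg, age)

# Bootstrap distribution of the fitted slope b_1:
# specify the variables, resample rows with replacement, compute the slope.
bootstrap_distn_slope <- evals_ch5 %>%
  specify(formula = score ~ bty_avg) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "slope")

# Percentile method: middle 95% of the bootstrap slopes.
percentile_ci <- bootstrap_distn_slope %>%
  get_confidence_interval(level = 0.95, type = "percentile")

# Standard error method: needs the slope observed in the original sample.
# It is only appropriate if the bootstrap distribution looks roughly normal.
observed_slope <- evals_ch5 %>%
  specify(formula = score ~ bty_avg) %>%
  calculate(stat = "slope")

se_ci <- bootstrap_distn_slope %>%
  get_confidence_interval(level = 0.95, type = "se",
                          point_estimate = observed_slope)

percentile_ci
se_ci
```

Because the resampling is random, the interval endpoints change slightly from run to run unless a seed is set with set.seed() beforehand. Calling visualize(bootstrap_distn_slope) plots the bootstrap distribution, which helps judge whether the standard error method is appropriate.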
