Introduction
Many times in modeling we want to get the uncertainty in the model, well, bootstrapping to the rescue!
I am going to go over a very simple example on how to use purrr and modelr for this situation. We will use the mtcars dataset.
Examples
Let’s get right into it.
library(tidyverse)
library(tidymodels)
df <- mtcars
fit_boots <- df %>% 
  modelr::bootstrap(n = 200, id = 'boot_num') %>%
  group_by(boot_num) %>%
  mutate(fit = map(strap, ~lm(mpg ~ ., data = data.frame(.))))
fit_boots
# A tibble: 200 × 3
# Groups:   boot_num [200]
   strap                boot_num fit   
   <list>               <chr>    <list>
 1 <resample [32 x 11]> 001      <lm>  
 2 <resample [32 x 11]> 002      <lm>  
 3 <resample [32 x 11]> 003      <lm>  
 4 <resample [32 x 11]> 004      <lm>  
 5 <resample [32 x 11]> 005      <lm>  
 6 <resample [32 x 11]> 006      <lm>  
 7 <resample [32 x 11]> 007      <lm>  
 8 <resample [32 x 11]> 008      <lm>  
 9 <resample [32 x 11]> 009      <lm>  
10 <resample [32 x 11]> 010      <lm>  
# … with 190 more rows
 
 
Now lets get our parameter estimates.
# get parameters ####
params_boot <- fit_boots %>%
  mutate(tidy_fit = map(fit, tidy)) %>%
  unnest(cols = tidy_fit) %>%
  ungroup()
# get predictions
preds_boot <- fit_boots %>%
  mutate(augment_fit = map(fit, augment)) %>%
  unnest(cols = augment_fit) %>%
  ungroup()
 
Time to visualize.
library(patchwork)
# plot distribution of estimated parameters
p1 <- ggplot(params_boot, aes(estimate)) +
  geom_histogram(col = 'black', fill = 'white') +
  facet_wrap(~ term, scales = 'free') +
  theme_minimal()
# plot points with predictions
p2 <- ggplot() +
  geom_line(aes(mpg, .fitted, group = boot_num), preds_boot, alpha = .03) +
  geom_point(aes(mpg, .fitted), preds_boot, col = 'steelblue', alpha = 0.05) +
  theme_minimal()
  
# plot both
p1 + p2
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
 
 
Voila!