library(tidymodels)
rec_obj <- recipe(mpg ~ ., data = mtcars)
rec_obj
Introduction
In this post, we use tidymodels, a collection of R packages that provides a suite of tools for modeling and machine learning.
Let’s take a closer look at the code and at how we extract a model call from a fitted workflow object.
The first line loads the tidymodels package. Then we create a “recipe” object called rec_obj using the recipe() function. A recipe is a set of instructions for preparing data for modeling. Here the formula mpg ~ . tells the recipe to use mpg as the outcome (dependent variable) and every other column of the mtcars dataset as a predictor (independent variable).
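To check which role the recipe assigned to each column, one option is to call summary() on it. This is a minimal sketch; the name rec_roles is introduced here purely for illustration and is not part of the original post.

```r
library(tidymodels)

# Same recipe as in the post: mpg is the outcome, everything else a predictor
rec_obj <- recipe(mpg ~ ., data = mtcars)

# summary() on a recipe returns one row per variable, including its role
rec_roles <- summary(rec_obj)  # rec_roles is an illustrative name
rec_roles

# Tabulate the roles to see how many outcomes and predictors were assigned
table(rec_roles$role)
```

Since mtcars has 11 columns, the table should show one outcome and ten predictors.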
model_spec <- linear_reg(mode = "regression", engine = "lm")
model_spec
Linear Regression Model Specification (regression)
Computational engine: lm
Next, we create a “model specification” object called model_spec using the linear_reg() function. This specifies the type of model we want, here a linear regression. We also set the mode to “regression” (we are predicting a continuous outcome variable) and the engine to “lm”, so the model is fit with R’s lm() (“linear model”) function.
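As a rough illustration, the same specification can also be built step by step with set_engine() and set_mode(), and translate() can be used to preview how the specification maps onto the underlying engine. The name model_spec_alt is an assumption made for this sketch.

```r
library(tidymodels)

# Specification as written in the post
model_spec <- linear_reg(mode = "regression", engine = "lm")

# An equivalent way to declare it, using setter functions
# (model_spec_alt is an illustrative name)
model_spec_alt <- linear_reg() |>
  set_engine("lm") |>
  set_mode("regression")

# translate() prints the engine-level call template the spec resolves to
translate(model_spec)
```

Which form to use is mostly a matter of taste; the setter style can be convenient when the engine is chosen later in a script.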
wflw <- workflow() |>
  add_recipe(rec_obj) |>
  add_model(model_spec)
wflw
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()
── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps
── Model ───────────────────────────────────────────────────────────────────────
Linear Regression Model Specification (regression)
Computational engine: lm
In the next section of code, we create a “workflow” object called wflw using the workflow() function. A workflow bundles the preprocessing and modeling steps into a single object. Here we use the native pipe (|>) to build it up in sequence: we first add the recipe with add_recipe(), and then add the model specification with add_model().
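To confirm that both pieces ended up in the workflow, they can be pulled back out with the extract_*() helpers from the workflows package. This is only a sketch; wflw_recipe and wflw_spec are names made up for the example.

```r
library(tidymodels)

rec_obj <- recipe(mpg ~ ., data = mtcars)
model_spec <- linear_reg(mode = "regression", engine = "lm")

wflw <- workflow() |>
  add_recipe(rec_obj) |>
  add_model(model_spec)

# Retrieve the preprocessor (the recipe) stored in the workflow
wflw_recipe <- extract_preprocessor(wflw)

# Retrieve the parsnip model specification stored in the workflow
wflw_spec <- extract_spec_parsnip(wflw)

class(wflw_recipe)
class(wflw_spec)
```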
wflw_fit <- fit(wflw, data = mtcars)
wflw_fit
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()
── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps
── Model ───────────────────────────────────────────────────────────────────────
Call:
stats::lm(formula = ..y ~ ., data = data)
Coefficients:
(Intercept)          cyl         disp           hp         drat           wt
   12.30337     -0.11144      0.01334     -0.02148      0.78711     -3.71530
       qsec           vs           am         gear         carb
    0.82104      0.31776      2.52023      0.65541     -0.19942
Finally, we fit the workflow to the data using the fit() function, which takes the workflow object (wflw) and the data (mtcars) as input. This creates a new object called wflw_fit, which is the fitted (trained) workflow. It contains the underlying lm fit, from which we can read the model coefficients and compute summaries such as the R-squared value.
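One way to get at the coefficients and the R-squared value is to pull the underlying lm object out of the fitted workflow with extract_fit_engine() and then use the usual base R tools on it. A minimal sketch, with lm_fit as an illustrative name:

```r
library(tidymodels)

rec_obj <- recipe(mpg ~ ., data = mtcars)
model_spec <- linear_reg(mode = "regression", engine = "lm")

wflw_fit <- workflow() |>
  add_recipe(rec_obj) |>
  add_model(model_spec) |>
  fit(data = mtcars)

# Extract the engine-level fit, which for this spec is a plain lm object
lm_fit <- extract_fit_engine(wflw_fit)

# Coefficients: an intercept plus one per predictor
coef(lm_fit)

# R-squared is computed by summary() on the lm object
summary(lm_fit)$r.squared
```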
wflw_fit$fit$fit$fit$call
stats::lm(formula = ..y ~ ., data = data)
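The last line of code extracts the actual function call that was used to fit the model. The repeated $fit steps walk down the nested object: the first reaches the workflow’s fit stage, the second the parsnip model fit, and the third the underlying lm object, whose call element holds the call. This can be useful for reproducing the analysis later on.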
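The chained $fit access depends on how the workflow object is laid out internally. As a hedged alternative, the same call can be reached through extract_fit_engine(), which avoids spelling out the nesting by hand. The names call_direct and call_helper are assumptions for this sketch.

```r
library(tidymodels)

rec_obj <- recipe(mpg ~ ., data = mtcars)
model_spec <- linear_reg(mode = "regression", engine = "lm")

wflw_fit <- workflow() |>
  add_recipe(rec_obj) |>
  add_model(model_spec) |>
  fit(data = mtcars)

# Direct access, as in the post
call_direct <- wflw_fit$fit$fit$fit$call

# Access through the extractor helper
call_helper <- extract_fit_engine(wflw_fit)$call

# Check whether both routes give the same call object
identical(call_direct, call_helper)

call_helper
```

Note that the printed call refers to ..y and data, which are names used inside the fitting machinery, so it documents how lm() was invoked rather than being a line you can re-run as is.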
Overall, the code above shows how to build a simple linear regression model using the tidymodels package in R. We start by creating a recipe that specifies the outcome variable and predictor variables, then create a model specification for a linear regression model, combine these into a workflow, fit the workflow to the data, and finally extract the model call from the fitted workflow.