# Main script
# Script setup --------------------------------------
# Load box modules
box::use(. / box / global_options / global_options)
box::use(. / box / io / imports)
box::use(. / box / io / exports)
box::use(. / box / mod / mod)

# Load global options
global_options$set_global_options()

# Main script ---------------------------------------
# Load data, process it, and export results
all_data <- getOption('data_dir') |>
  # Load all data
  imports$load_all() |>
  # Modify dataset
  mod$modify_data() |>
  # Export data
  exports$export_data()

Introduction
Today I am going to make a short post on the R package {box}, which was showcased to me quite nicely by Michael Miles. His walkthrough was informative, and I was able to immediately see the usefulness of the {box} package.
So what is ‘box’? Well, here is the description straight from the package’s site:
‘box’ allows organising R code in a more modular way, via two mechanisms:
- It enables writing modular code by treating files and folders of R code as independent (potentially nested) modules, without requiring the user to wrap reusable code into packages.
- It provides a new syntax to import reusable code (both from packages and from modules) which is more powerful and less error-prone than library or require, by limiting the number of names that are made available.
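To make the second point concrete, here is a minimal sketch of the import syntax. It assumes {box} is installed and uses {tools}, which ships with R, purely as a stand-in package; the file path is made up for illustration.

```r
# Assumes the {box} package is installed; {tools} ships with base R.

# Import a whole package: its functions are reached through tools$...
box::use(tools)
stem <- tools$file_path_sans_ext("data/input/first.csv")

# Import selected names only: just file_ext() becomes available directly
box::use(tools[file_ext])
ext <- file_ext("data/input/first.csv")

print(stem)
print(ext)
```

Nothing else from {tools} is attached to the search path, which is the "limiting the number of names" part of the description.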
So let’s see how it all works.
Function
The main portion of the script is the listing at the top of this post.
So what does it do? It grabs data from a predefined location, modifies it, and then re-exports it. Now let’s look at all the code behind it that makes this possible, and you will see the power of using box.
Example
Let’s take a look at the global options settings.
# Set global options
#' @export
set_global_options <- function() {
  options(
    look_ups = 'look-ups/',
    data_dir = 'data/input/'
  )
}
OK, six lines boxed down to one call: global_options$set_global_options().
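In case the link between this function and the main script is not obvious: options() stores values for the whole R session, and getOption() reads them back, which is how the pipeline gets its starting folder. A small base R sketch of that round trip follows; the 'not_set' option and its fallback value are made up for illustration.

```r
# Same function as in the module, defined inline so the sketch runs on its own
set_global_options <- function() {
  options(
    look_ups = 'look-ups/',
    data_dir = 'data/input/'
  )
}

set_global_options()

# Read an option back, as the main script does at the start of its pipeline
data_dir <- getOption('data_dir')

# An option that was never set returns NULL unless a default is supplied
# ('not_set' is a hypothetical option name)
fallback <- getOption('not_set', default = 'fallback/')

print(data_dir)
print(fallback)
```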
Now the import function.
# Function for importing data
#' @export
load_all <- function(file_path) {
  box::use(purrr)
  box::use(vroom)

  file_path |>
    # Get all csv files from folder
    list.files(full.names = TRUE) |>
    # Set list names
    purrr$set_names(\(file) basename(file)) |>
    # Load all csvs into list
    purrr$map(\(file) vroom$vroom(file))
}
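To show the shape of what load_all() hands to the next step, here is a rough base R sketch of the same idea: list the files, name each element by its file name, and read each file into a list. It is not the module itself. setNames(), lapply() and read.csv() stand in for the purrr and vroom calls so it runs without extra packages, and the temporary folder and toy files are invented for the demo.

```r
# Create a throwaway input folder with two small CSV files (demo data only)
input_dir <- tempfile("box_demo_")
dir.create(input_dir)
write.csv(data.frame(a = 1:2), file.path(input_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(b = 3:4), file.path(input_dir, "second.csv"), row.names = FALSE)

# List the files with their full paths
files <- list.files(input_dir, full.names = TRUE)

# Name each path by its file name, then read every file into a list
df_list <- lapply(setNames(files, basename(files)), read.csv)

print(names(df_list))
```

The list names matter later on, because the export step reuses them as output file names.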
Now the modify_data function.
# Function for modifying data
#' @export
modify_data <- function(df_list) {
  box::use(dplyr)
  box::use(purrr)

  map_fun <- function(df) {
    df |>
      dplyr$select(name:mass) |>
      dplyr$mutate(lol = height * mass) |>
      dplyr$filter(lol > 1500)
  }

  # Apply mapping function to list
  purrr$map(df_list, map_fun)
}
OK, again a big saving here: instead of the above, we simply call mod$modify_data(), which makes things cleaner and also modular, in that we can go to a very specific spot in our project to fix an error or add/subtract functionality.
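If you want to see what the per-data-frame step does without installing anything, here is a base R approximation on toy data. The data frames are invented, and lapply() with plain subsetting stands in for the purrr and dplyr calls, so treat it as a sketch of the logic rather than the module itself.

```r
# Toy stand-in data; the column names mirror the ones used in map_fun
df_list <- list(
  "first.csv"  = data.frame(name = c("a", "b"), height = c(170, 20), mass = c(70, 10)),
  "second.csv" = data.frame(name = c("c", "d"), height = c(100, 30), mass = c(30, 40))
)

# Base R version of the same steps: add the product column, keep rows above 1500
map_fun <- function(df) {
  df$lol <- df$height * df$mass
  df[df$lol > 1500, , drop = FALSE]
}

# Apply the function to every data frame; list names are preserved
result <- lapply(df_list, map_fun)

print(result)
```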
Lastly, the export.
# Function for exporting data
#' @export
export_data <- function(df_list) {
  box::use(vroom)
  box::use(purrr)

  # Export data
  purrr$map2(.x = df_list,
             .y = names(df_list),
             ~vroom$vroom_write(x = .x,
                                file = paste0('data/output/', .y),
                                delim = ','))
}
Voilà! I think the power of boxing your functions is fairly apparent even to a fresh user, and the advanced user’s eyes are most likely glowing!