named_item_list(.data, .group_col)
Introduction
Often when we are working with a data set we want to break it up into groups, place those groups into a list, and work with them from there. When doing this, it is useful to have the elements of the list named by the values of the column that the data was split upon. Let’s use the iris data set as an example, where we split on Species.
There are two main functions that we will use in this scenario, namely purrr::map() and dplyr::group_split(). You could also use the split() function from base R for this.
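For comparison, here is a minimal sketch of the base R route mentioned above. It assumes nothing beyond base R and the built-in iris data; the object name iris_split is made up for illustration. Note that split() names the list elements by the factor levels on its own.

```r
# Split iris into one data.frame per Species using base R only.
# iris_split is an illustrative name, not something from the post.
iris_split <- split(iris, iris$Species)

# split() names each element after the factor level it holds.
names(iris_split)

# Each element is an ordinary data.frame with that group's rows.
nrow(iris_split[["setosa"]])
```

The result is a plain list of data.frames rather than the list of tibbles that dplyr::group_split() returns, which may or may not matter for your workflow.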
We will also go over how simple this is using the {healthyR} package. Let’s look at the function from {healthyR}.
Function
Full function call.
named_item_list(.data, .group_col)
There are only two arguments to supply.
.data - The data.frame/tibble.
.group_col - The column that contains the groupings.
That’s it.
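To make the two arguments concrete, here is a rough sketch of how a helper with the same interface could be written. This is not the {healthyR} source code; the function name my_named_item_list and its internals are assumptions made purely for illustration.

```r
library(dplyr)

# Hypothetical stand-in with the same two arguments as named_item_list().
# This is an illustrative sketch, NOT the healthyR implementation.
my_named_item_list <- function(.data, .group_col) {
  # Capture the bare column name supplied by the caller.
  group_var <- rlang::enquo(.group_col)

  # Group once so the keys and the pieces share the same order.
  grouped <- group_by(.data, !!group_var)
  keys    <- group_keys(grouped)[[1]]
  pieces  <- group_split(grouped)

  # Name each piece after the group value it contains.
  names(pieces) <- as.character(keys)
  pieces
}

res <- my_named_item_list(iris, Species)
names(res)
```

Taking the names from group_keys() avoids having to look inside each piece to find out which group it holds.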
Examples
Let’s jump into it.
library(purrr)
library(dplyr)
data_tbl <- iris

data_tbl_list <- data_tbl %>%
  group_split(Species)

data_tbl_list
<list_of<
tbl_df<
Sepal.Length: double
Sepal.Width : double
Petal.Length: double
Petal.Width : double
Species : factor<fb977>
>
>[3]>
[[1]]
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
# … with 40 more rows
[[2]]
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 7 3.2 4.7 1.4 versicolor
2 6.4 3.2 4.5 1.5 versicolor
3 6.9 3.1 4.9 1.5 versicolor
4 5.5 2.3 4 1.3 versicolor
5 6.5 2.8 4.6 1.5 versicolor
6 5.7 2.8 4.5 1.3 versicolor
7 6.3 3.3 4.7 1.6 versicolor
8 4.9 2.4 3.3 1 versicolor
9 6.6 2.9 4.6 1.3 versicolor
10 5.2 2.7 3.9 1.4 versicolor
# … with 40 more rows
[[3]]
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 6.3 3.3 6 2.5 virginica
2 5.8 2.7 5.1 1.9 virginica
3 7.1 3 5.9 2.1 virginica
4 6.3 2.9 5.6 1.8 virginica
5 6.5 3 5.8 2.2 virginica
6 7.6 3 6.6 2.1 virginica
7 4.9 2.5 4.5 1.7 virginica
8 7.3 2.9 6.3 1.8 virginica
9 6.7 2.5 5.8 1.8 virginica
10 7.2 3.6 6.1 2.5 virginica
# … with 40 more rows
data_tbl_list %>%
  map( ~ pull(., Species)) %>%
  map( ~ as.character(.)) %>%
  map( ~ unique(.))
[[1]]
[1] "setosa"
[[2]]
[1] "versicolor"
[[3]]
[1] "virginica"
Now let’s go ahead and apply the names.
names(data_tbl_list) <- data_tbl_list %>%
map( ~ pull(., Species)) %>%
map( ~ as.character(.)) %>%
map( ~ unique(.))
data_tbl_list
<list_of<
tbl_df<
Sepal.Length: double
Sepal.Width : double
Petal.Length: double
Petal.Width : double
Species : factor<fb977>
>
>[3]>
$setosa
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
# … with 40 more rows
$versicolor
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 7 3.2 4.7 1.4 versicolor
2 6.4 3.2 4.5 1.5 versicolor
3 6.9 3.1 4.9 1.5 versicolor
4 5.5 2.3 4 1.3 versicolor
5 6.5 2.8 4.6 1.5 versicolor
6 5.7 2.8 4.5 1.3 versicolor
7 6.3 3.3 4.7 1.6 versicolor
8 4.9 2.4 3.3 1 versicolor
9 6.6 2.9 4.6 1.3 versicolor
10 5.2 2.7 3.9 1.4 versicolor
# … with 40 more rows
$virginica
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 6.3 3.3 6 2.5 virginica
2 5.8 2.7 5.1 1.9 virginica
3 7.1 3 5.9 2.1 virginica
4 6.3 2.9 5.6 1.8 virginica
5 6.5 3 5.8 2.2 virginica
6 7.6 3 6.6 2.1 virginica
7 4.9 2.5 4.5 1.7 virginica
8 7.3 2.9 6.3 1.8 virginica
9 6.7 2.5 5.8 1.8 virginica
10 7.2 3.6 6.1 2.5 virginica
# … with 40 more rows
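The naming step can also be written more compactly. The sketch below is one possible variant, not the approach used in the post: it assumes purrr::map_chr() is acceptable in place of the three chained map() calls, and it rebuilds the list so that it runs on its own.

```r
library(purrr)
library(dplyr)

# Rebuild the split list so this snippet is self-contained.
data_tbl_list <- iris %>%
  group_split(Species)

# map_chr() returns a character vector directly, one name per group.
# Each piece holds a single Species value, so unique() yields length one.
names(data_tbl_list) <- map_chr(
  data_tbl_list,
  ~ as.character(unique(.x$Species))
)

names(data_tbl_list)

# With names in place, a group can be pulled out by name rather than position.
setosa_tbl <- data_tbl_list[["setosa"]]
nrow(setosa_tbl)
```

Looking a group up by name keeps downstream code readable and does not depend on the order in which the groups were split.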
Let’s now see how we do this in {healthyR}.
library(healthyR)
named_item_list(iris, Species)
<list_of<
tbl_df<
Sepal.Length: double
Sepal.Width : double
Petal.Length: double
Petal.Width : double
Species : factor<fb977>
>
>[3]>
$setosa
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
# … with 40 more rows
$versicolor
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 7 3.2 4.7 1.4 versicolor
2 6.4 3.2 4.5 1.5 versicolor
3 6.9 3.1 4.9 1.5 versicolor
4 5.5 2.3 4 1.3 versicolor
5 6.5 2.8 4.6 1.5 versicolor
6 5.7 2.8 4.5 1.3 versicolor
7 6.3 3.3 4.7 1.6 versicolor
8 4.9 2.4 3.3 1 versicolor
9 6.6 2.9 4.6 1.3 versicolor
10 5.2 2.7 3.9 1.4 versicolor
# … with 40 more rows
$virginica
# A tibble: 50 × 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 6.3 3.3 6 2.5 virginica
2 5.8 2.7 5.1 1.9 virginica
3 7.1 3 5.9 2.1 virginica
4 6.3 2.9 5.6 1.8 virginica
5 6.5 3 5.8 2.2 virginica
6 7.6 3 6.6 2.1 virginica
7 4.9 2.5 4.5 1.7 virginica
8 7.3 2.9 6.3 1.8 virginica
9 6.7 2.5 5.8 1.8 virginica
10 7.2 3.6 6.1 2.5 virginica
# … with 40 more rows
If you use this in conjunction with the healthyR function save_to_excel(), it will write an Excel file with a tab for each named item in the list.
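To illustrate the idea of one tab per named item without relying on the details of save_to_excel(), here is a small sketch that swaps in the {writexl} package instead. It assumes {writexl} is installed; the temporary file path and object names are made up for illustration.

```r
library(writexl)

# A named list of data.frames, built here with base R split().
iris_list <- split(iris, iris$Species)

# Write to a temporary file; each named element becomes its own sheet.
out_file <- tempfile(fileext = ".xlsx")
write_xlsx(iris_list, path = out_file)

file.exists(out_file)
```

This is the reason the names matter: the sheet names come from the list names, so an unnamed list would not produce sheets labelled by group.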
Voila!