How to Split a Number into Digits in R Using gsub() and strsplit()

code
rtip
operations
Author

Steven P. Sanderson II, MPH

Published

May 22, 2024

Introduction

Splitting numbers into individual digits can be a handy trick in data analysis and manipulation. Today, we’ll explore how to achieve this using base R functions, specifically gsub() and strsplit(). Let’s walk through the process step by step, explain the syntax of each function, and provide some examples for clarity.

Syntax

Understanding gsub() and strsplit()

First, let’s get familiar with the two main functions we’ll be using:

  1. gsub(pattern, replacement, x):
    • pattern: A regular expression describing the pattern to be matched.
    • replacement: The string to replace the matched pattern.
    • x: The input vector, which is usually a character string.

The gsub() function replaces all occurrences of the pattern in x with the replacement.

  1. strsplit(x, split):
    • x: The input vector, which is usually a character string.
    • split: The delimiter on which to split the input string.

The strsplit() function splits the elements of a character vector x into substrings based on the delimiter specified in split.

Examples

Splitting a Number into Digits

Let’s go through a few examples to see how we can split numbers into digits using these functions.

Example 1: Basic Splitting of a Single Number

# Step 1: Convert the number to a character string
number <- 12345
number_str <- as.character(number)
number_str
[1] "12345"
# Step 2: Use gsub() to insert a delimiter (space) between each digit
number_with_spaces <- gsub("(.)", "\\1 ", number_str)
number_with_spaces
[1] "1 2 3 4 5 "
# Step 3: Use strsplit() to split the string on the delimiter
digits <- strsplit(number_with_spaces, " ")[[1]]

# Step 4: Convert the result back to numeric
digits_numeric <- as.numeric(digits)

# Print the result
print(digits_numeric)
[1] 1 2 3 4 5

Explanation:

  1. We convert the number to a character string using as.character().
  2. We use gsub("(.)", "\\1 ", number_str) to insert a space between each digit. The pattern (.) matches any character, and \\1 refers to the matched character followed by a space.
  3. We split the string on spaces using strsplit(number_with_spaces, " ").
  4. Finally, we convert the resulting character vector back to numeric using as.numeric().

Example 2: Splitting Multiple Numbers in a Vector

# Vector of numbers
numbers <- c(6789, 5432)

# Function to split a single number into digits
split_number <- function(number) {
  number_str <- as.character(number)
  number_with_spaces <- gsub("(.)", "\\1 ", number_str)
  digits <- strsplit(number_with_spaces, " ")[[1]]
  as.numeric(digits)
}

# Apply the function to each number in the vector
split_digits <- lapply(numbers, split_number)

# Print the result
print(split_digits)
[[1]]
[1] 6 7 8 9

[[2]]
[1] 5 4 3 2

Explanation:

  1. We define a vector of numbers.
  2. We create a function split_number that takes a number and splits it into digits using the same steps as in Example 1.
  3. We apply this function to each number in the vector using lapply().
  4. The result is a list where each element is a vector of digits for each number in the original vector.

Try It Yourself!

Now that we’ve gone through the examples, it’s your turn to give it a try! Experiment with different numbers, vectors, and even customize the splitting function to handle special cases or additional formatting. The more you practice, the more comfortable you’ll become with these handy base R functions.

Happy Coding!