Mastering String Concatenation of Vectors in R: Base R, stringr, stringi, and glue

code
rtip
strings
stringr
stringi
glue
Author

Steven P. Sanderson II, MPH

Published

August 13, 2024

Introduction

Welcome to another exciting R programming tutorial! Today, we will explore how to concatenate vectors of strings using different methods in R: base R, stringr, stringi, and glue. We’ll use a practical example involving a data frame with names, job titles, and salaries. By the end of this post, you’ll feel confident using these tools to manipulate and combine strings in your own projects. Let’s get started!

Our Example Data Frame

We’ll start with a simple data frame containing employee names, their job titles, and their salaries.

# Creating the data frame
employees <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  JobTitle = c("Data Scientist", "Software Engineer", "Product Manager"),
  Salary = c(120000, 110000, 105000)
)

print(employees)
     Name          JobTitle Salary
1   Alice    Data Scientist 120000
2     Bob Software Engineer 110000
3 Charlie   Product Manager 105000

Concatenation Using Base R

In base R, we can concatenate strings using the paste() and paste0() functions. The paste() function combines strings with a specified separator, while paste0() does the same without any separator.

To create a single string for each employee that combines their name, job title, and salary, we can use paste():

# Concatenating using paste()
employees$Summary <- paste(
  employees$Name, 
  "is a", employees$JobTitle, 
  "earning $", employees$Salary
  )

print(employees$Summary)
[1] "Alice is a Data Scientist earning $ 120000"   
[2] "Bob is a Software Engineer earning $ 110000"  
[3] "Charlie is a Product Manager earning $ 105000"

The paste() function automatically adds a space between the elements. If you want to control the separator, you can use the sep parameter. For instance:

# Concatenating with a custom separator
employees$Summary <- paste(employees$Name, employees$JobTitle, employees$Salary, sep = " | ")

print(employees$Summary)
[1] "Alice | Data Scientist | 120000"    "Bob | Software Engineer | 110000"  
[3] "Charlie | Product Manager | 105000"

Concatenation Using stringr

The stringr package provides a more consistent and user-friendly approach to string manipulation. The str_c() function is used for concatenation.

First, install and load the stringr package:

# Install if you do not have it
# install.packages("stringr")
library(stringr)

Now, let’s concatenate the strings using str_c():

# Concatenating using str_c()
employees$Summary <- str_c(
  employees$Name, 
  "is a", employees$JobTitle, "earning $", 
  employees$Salary, 
  sep = " "
  )

print(employees$Summary)
[1] "Alice is a Data Scientist earning $ 120000"   
[2] "Bob is a Software Engineer earning $ 110000"  
[3] "Charlie is a Product Manager earning $ 105000"

The str_c() function works similarly to paste(), but with a consistent syntax and more intuitive parameter names.

Concatenation Using stringi

The stringi package is another powerful tool for string manipulation. It offers a wide range of functions, including stri_c() for concatenation.

First, install and load the stringi package:

# Install if you do not have it
# install.packages("stringi")
library(stringi)

Now, let’s concatenate the strings using stri_c():

# Concatenating using stri_c()
employees$Summary <- stri_c(
  employees$Name, 
  "is a", employees$JobTitle, 
  "earning $", employees$Salary, 
  sep = " "
  )

print(employees$Summary)
[1] "Alice is a Data Scientist earning $ 120000"   
[2] "Bob is a Software Engineer earning $ 110000"  
[3] "Charlie is a Product Manager earning $ 105000"

The stri_c() function is similar to str_c() from the stringr package, but it provides additional features for advanced string manipulation.

Concatenation Using glue

The glue package offers a unique approach to string concatenation by allowing you to embed R expressions directly within strings.

First, install and load the glue package:

# Install if you do not have it
# install.packages("glue")
library(glue)

Now, let’s use glue() to create the summary strings:

# Concatenating using glue()
employees$Summary <- glue(
  "{employees$Name} is a {employees$JobTitle} earning ${employees$Salary}"
  )

print(employees$Summary)
Alice is a Data Scientist earning $120000
Bob is a Software Engineer earning $110000
Charlie is a Product Manager earning $105000

The glue() function makes it easy to embed variable values within strings, providing a clear and readable syntax. It also has in my opinion the nicest output as you will notice there is no space between the salary and the dollar sign.

Conclusion

We’ve covered several methods for concatenating strings in R, including base R functions, the stringr package, the stringi package, and the glue package. Each method has its own strengths and can be useful depending on your specific needs.

I encourage you to try these techniques in your own projects. Experimenting with different methods will help you understand which one works best for your use cases.


Happy coding!