Introduction
Filtering data frames in R is a common task in data analysis. Often we want to subset a data frame to only keep rows that meet certain criteria. A useful filtering technique is keeping rows where a column value falls between two specified values.
In this post, we’ll walk through how to filter rows in R where a column value is between two values using base R syntax.
Filtering with bracket notation
One way to filter rows is by using bracket notation [] and specifying a logical vector.
Let’s create a sample data frame:
df <- data.frame(
  id = 1:10,
  value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
)
We can filter df to only keep rows where value is between 5 and 8 with:
df[df$value >= 5 & df$value <= 8, ]
id value
1 1 5
3 3 6
7 7 7
9 9 8
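This keeps rows where value is greater than or equal to 5 (df$value >= 5) AND less than or equal to 8 (df$value <= 8), so both endpoints are included. The comma after the logical vector separates the row index from the column index; leaving the column index empty returns all columns for the selected rows.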
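To see what the bracket expression is doing, it can help to build the logical vector on its own first. The following is a minimal sketch using the sample data frame from this post; the names keep and result are introduced here for illustration only.

```r
# Sample data frame from this post
df <- data.frame(
  id = 1:10,
  value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
)

# Build the logical vector separately: one TRUE/FALSE per row
# (`keep` is an illustrative name, not part of the original post)
keep <- df$value >= 5 & df$value <= 8
keep

# Rows go before the comma, columns after it.
# An empty column index keeps every column.
result <- df[keep, ]
result
```

Splitting the condition out like this makes it easier to inspect which rows will be kept before the data frame is actually subset.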
Filtering with subset()
Another option is using the subset() function:
subset(df, value >= 5 & value <= 8)
id value
1 1 5
3 3 6
7 7 7
9 9 8
subset() takes a data frame as the first argument, then a logical expression similar to the one used with bracket notation. Inside subset(), column names can be written directly, without the df$ prefix.
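As a quick check that the two approaches agree on this data, the sketch below runs the same range filter both ways and compares the results. The names by_bracket and by_subset are illustrative and do not appear in the original post.

```r
# Sample data frame from this post
df <- data.frame(
  id = 1:10,
  value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
)

# Same condition, written with bracket notation...
by_bracket <- df[df$value >= 5 & df$value <= 8, ]

# ...and with subset(), where columns are referenced by bare name
by_subset <- subset(df, value >= 5 & value <= 8)

# Compare the two results
all.equal(by_bracket, by_subset)
```

Which form to use is largely a matter of taste here: subset() is shorter to type, while bracket notation makes the source of each column explicit.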
Additional examples
We can filter on different columns and value ranges:
# id between 3 and 7
df[df$id >= 3 & df$id <= 7, ]
id value
3 3 6
4 4 9
5 5 2
6 6 4
7 7 7
# value less than 5
subset(df, value < 5)
id value
2 2 3
5 5 2
6 6 4
8 8 1
It’s also possible to filter rows outside a range by negating the condition with !, or to filter on one side of a value with a single comparison:
# id NOT between 3 and 7
df[!(df$id >= 3 & df$id <= 7), ]
id value
1 1 5
2 2 3
8 8 1
9 9 8
10 10 10
# value greater than 5
subset(df, value > 5)
id value
3 3 6
4 4 9
7 7 7
9 9 8
10 10 10
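A range negated with ! can also be written as two comparisons joined with | (OR): a value is outside the range when it is below the lower bound or above the upper bound. The sketch below shows both forms on the sample data; outside_not and outside_or are illustrative names.

```r
# Sample data frame from this post
df <- data.frame(
  id = 1:10,
  value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
)

# Negate the "between" condition
outside_not <- df[!(df$id >= 3 & df$id <= 7), ]

# Equivalent form: below the lower bound OR above the upper bound
outside_or <- df[df$id < 3 | df$id > 7, ]

# Both select the same rows
identical(outside_not, outside_or)
```

The negated form reads closer to "not between", while the | form spells out the two sides explicitly.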
Summary
Filtering data frames where a column is between two values is straightforward in R. The key steps are:
- Use bracket notation df[logical, ] or subset(df, logical)
- Create a logical expression with & and the >=, <= operators
- Specify the column name and range of values to filter between
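The steps above can be wrapped in a small helper. This is only a sketch: filter_between() is a hypothetical function written for this example, not part of base R, and it assumes an inclusive range on a numeric column.

```r
# Hypothetical helper (not a base R function): keep rows where
# `column` lies between `lower` and `upper`, endpoints included
filter_between <- function(data, column, lower, upper) {
  x <- data[[column]]               # pull the column out by name
  keep <- x >= lower & x <= upper   # logical vector, one entry per row
  data[keep, , drop = FALSE]        # keep matching rows and all columns
}

# Sample data frame from this post
df <- data.frame(
  id = 1:10,
  value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
)

filter_between(df, "value", 5, 8)
filter_between(df, "id", 3, 7)
```

Passing the column name as a string keeps the helper in plain bracket notation, which avoids the non-standard evaluation that subset() relies on.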
I encourage you to try filtering data frames on your own! Subsetting by logical expressions is an important skill for efficient R programming.