Introduction
Let’s jump into data manipulation with R! Selecting specific rows from our datasets is an important skill. Today, we’ll focus on subsetting rows by index, using the trusty square brackets ([]).
First, we’ll load a dataset containing car characteristics:
mtcars.data <- mtcars
This code copies R’s built-in mtcars dataset (containing car data) into a new variable, mtcars.data. Now, we’ll explore how to target specific rows.
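Before indexing, it can help to check how many rows there are to choose from. Here is a minimal sketch using base R's nrow(), ncol(), and head(); the names n.rows and n.cols are only illustrative and not part of the original walkthrough.

```r
# Copy the built-in dataset, as in the text above
mtcars.data <- mtcars

# How many rows and columns can we index into?
n.rows <- nrow(mtcars.data)   # 32 for the built-in mtcars
n.cols <- ncol(mtcars.data)   # 11

# Peek at the first few rows and their row names (the car models)
head(mtcars.data, 3)
```

Any row index between 1 and n.rows refers to an existing car.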
Examples
Example 1: Selecting a Single Row by Index
Imagine you want to analyze the fuel efficiency (miles per gallon) of a particular car. Here’s how to grab a single row by its index (row number):
# Select the 5th row (remember indexing starts from 1!)
specific.car <- mtcars.data[5,]
specific.car
                   mpg cyl disp  hp drat   wt  qsec vs am gear carb
Hornet Sportabout 18.7   8  360 175 3.15 3.44 17.02  0  0    3    2
Explanation:
- mtcars.data: This is our data frame, containing all the car information.
- []: These are the square brackets, used for subsetting.
- 5: This is the index of the row we want. Since indexing starts from 1, the 5th row will be selected.
- ,: The comma separates the row index from the column index. Leaving the position after it empty tells R to select all columns (everything) from that row.
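If you only care about the fuel efficiency figure, one option is to put a column name after the comma instead of leaving it empty. The sketch below assumes the same mtcars.data copy as above; specific.mpg is just an illustrative name.

```r
mtcars.data <- mtcars

# Whole 5th row: the result is still a data frame (1 row, 11 columns)
specific.car <- mtcars.data[5, ]
is.data.frame(specific.car)   # TRUE
dim(specific.car)             # 1 11

# Only the mpg value of the 5th row: name the column after the comma
specific.mpg <- mtcars.data[5, "mpg"]
specific.mpg                  # 18.7
```

Note that the second form returns a plain number rather than a one-row data frame.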
Try it yourself! Select the 10th row and see what car it represents.
Example 2: Selecting Multiple Rows by Index
Let’s say you’re interested in comparing fuel efficiency (miles per gallon) of a few specific cars. We can use a vector of indices to grab multiple rows at once:
# Select the 3rd, 7th, and 12th rows
few.cars <- mtcars.data[c(3, 7, 12),]
few.cars
            mpg cyl  disp  hp drat   wt  qsec vs am gear carb
Datsun 710 22.8   4 108.0  93 3.85 2.32 18.61  1  1    4    1
Duster 360 14.3   8 360.0 245 3.21 3.57 15.84  0  0    3    4
Merc 450SE 16.4   8 275.8 180 3.07 4.07 17.40  0  0    3    3
Explanation:
- We use c() to create a vector containing the desired row indices: 3, 7, and 12.
- Everything else remains the same as the previous example.
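To actually compare fuel efficiency across the selected cars, one possible follow-up is to pull out the mpg column and label each value with the car's name. This is only a sketch; row.ids and few.mpg are illustrative names, and setNames() and which.max() are standard base R functions.

```r
mtcars.data <- mtcars

# Keep the indices in a variable so they are easy to change later
row.ids <- c(3, 7, 12)
few.cars <- mtcars.data[row.ids, ]

# Pull out mpg and label each value with the car's name
few.mpg <- setNames(few.cars$mpg, rownames(few.cars))
few.mpg

# Which of the selected cars is the most fuel-efficient?
names(few.mpg)[which.max(few.mpg)]   # "Datsun 710"
```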
Challenge yourself! Create a vector to select the last 5 rows and analyze their horsepower.
Example 3: Selecting Rows Using a Range of Indices
Sometimes, you want to analyze a group of consecutive cars. Here’s how to select a range using the colon (:) operator:
# Select rows from 8 to 15 (inclusive)
car.slice <- mtcars.data[8:15,]
car.slice
                    mpg cyl  disp  hp drat   wt  qsec vs am gear carb
Merc 240D          24.4   4 146.7  62 3.69 3.19 20.00  1  0    4    2
Merc 230           22.8   4 140.8  95 3.92 3.15 22.90  1  0    4    2
Merc 280           19.2   6 167.6 123 3.92 3.44 18.30  1  0    4    4
Merc 280C          17.8   6 167.6 123 3.92 3.44 18.90  1  0    4    4
Merc 450SE         16.4   8 275.8 180 3.07 4.07 17.40  0  0    3    3
Merc 450SL         17.3   8 275.8 180 3.07 3.73 17.60  0  0    3    3
Merc 450SLC        15.2   8 275.8 180 3.07 3.78 18.00  0  0    3    3
Cadillac Fleetwood 10.4   8 472.0 205 2.93 5.25 17.98  0  0    3    4
Explanation:
- 8:15: This specifies the range of rows we want. Here, we select from row 8 (inclusive) to row 15 (inclusive).
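It may help to see that the colon operator simply builds a vector of consecutive integers, so a range is shorthand for the c() approach from Example 2. A small sketch, where same.rows is an illustrative name:

```r
mtcars.data <- mtcars

# 8:15 expands to the integers 8, 9, ..., 15
8:15
length(8:15)   # 8

car.slice <- mtcars.data[8:15, ]
nrow(car.slice)   # 8 rows, one per index in the range

# Spelling the indices out with c() selects the same rows
same.rows <- mtcars.data[c(8, 9, 10, 11, 12, 13, 14, 15), ]
all.equal(car.slice, same.rows)   # TRUE
```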
Now it’s your turn! Select rows 1 to 10 and explore the distribution of the number of cylinders.
Remember, practice is key! Experiment with different indices and ranges to become comfortable with subsetting rows in R. As you work with more datasets, you’ll master these techniques and become a data wrangling pro.
Happy coding!