The Power of dplyr in R

The Power of dplyr in R - part 2

Let's continue our adventure with the dplyr package. In the previous article I introduced the select() function, which selects a subset of columns. Today we will focus on how to pick observations (rows) and how to add a new column. We will keep using the mtcars dataset, which ships with base R.

I should also say that all the posts I publish here require basic knowledge of R and RStudio. If you are completely new to R and don't have it installed on your computer, I strongly recommend finding some online tutorials and starting with the R fundamentals.

library(dplyr)
data("mtcars")
head(mtcars)

Now, let's introduce the functions:

filter() keeps a subset of rows (picks the observations) that satisfy a condition

head(filter(.data = mtcars, mpg > 20 & vs == 0)) # keep rows where mpg > 20 and vs == 0

              mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Porsche 914-2  26   4 120.3  91 4.43 2.140 16.70  0  1    5    2
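As a small sketch of how conditions can be combined: as far as I know, filter() treats conditions separated by commas the same way as conditions joined with &, and | gives you "or". The object names below (both_comma, both_amp, either) and the thresholds in the second example are only illustrative.

```r
library(dplyr)

# Comma-separated conditions are combined with AND,
# so these two calls should return the same rows
both_comma <- filter(mtcars, mpg > 20, vs == 0)
both_amp   <- filter(mtcars, mpg > 20 & vs == 0)
identical(both_comma, both_amp)

# Use | when a row only has to satisfy one of the conditions
either <- filter(mtcars, mpg > 30 | hp > 250)
nrow(either)
```

The comma form is often easier to read once you have more than two conditions.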

filter() with the between() function:

filter(.data = mtcars, between(mpg, 18, 20)) # keep rows where mpg is between 18 and 20

                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Pontiac Firebird  19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Ferrari Dino      19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
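One detail worth checking for yourself: between() includes both boundaries, so it is a shortcut for a >= and <= comparison. A minimal sketch (the object names with_between and with_compare are mine, not part of dplyr):

```r
library(dplyr)

# between(x, left, right) is shorthand for x >= left & x <= right
with_between <- filter(mtcars, between(mpg, 18, 20))
with_compare <- filter(mtcars, mpg >= 18 & mpg <= 20)
identical(with_between, with_compare)

# Both end points are included
between(c(17.9, 18, 20, 20.1), 18, 20)
# FALSE TRUE TRUE FALSE
```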

slice() selects rows by their position (integer row index)

slice(.data = mtcars, 1:3) # choose the first three rows of the mtcars dataset

               mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
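slice() is not limited to a simple range. Here is a short sketch of a few other ways to pick rows by position; the object names are only for illustration, and slice_head() assumes a reasonably recent dplyr (I believe version 1.0.0 or later).

```r
library(dplyr)

# Pick arbitrary positions
picked <- slice(mtcars, c(1, 5, 10))

# Negative positions drop rows: mtcars has 32 rows,
# so dropping the first 29 leaves the last three
last_three <- slice(mtcars, -(1:29))

# n() is the number of rows, so this returns the last row
last_row <- slice(mtcars, n())

# slice_head() is a helper for "the first n rows" (assumes dplyr >= 1.0.0)
first_three <- slice_head(mtcars, n = 3)
```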

mutate() computes a new column (adds a new variable while keeping the existing ones)

head(mutate(.data = mtcars, new = mpg / cyl)) # add a new variable while keeping the existing ones

   mpg cyl disp  hp drat    wt  qsec vs am gear carb      new
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 3.500000
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 3.500000
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 5.700000
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 3.566667
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 2.337500
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 3.016667
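Notice that the output above shows row numbers instead of car names. If the names matter to you, one option is to copy them into a regular column first. The sketch below also shows that a single mutate() call can create several columns, and that a later column can use one created earlier in the same call. The column names model, mpg_per_cyl and efficient, and the threshold of 4, are my own choices for illustration.

```r
library(dplyr)

# Keep the car names as an ordinary column so they survive later steps
cars <- mutate(mtcars, model = rownames(mtcars))

# Several new columns in one call; `efficient` uses the
# `mpg_per_cyl` column defined just before it
cars <- mutate(cars,
               mpg_per_cyl = mpg / cyl,
               efficient   = mpg_per_cyl > 4)

head(select(cars, model, mpg, cyl, mpg_per_cyl, efficient))
```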

transmute() computes new variables while dropping the existing ones.

head(transmute(.data = mtcars, new = mpg / cyl)) # compute new variables while dropping the existing ones

new
1 3.500000
2 3.500000
3 5.700000
4 3.566667
5 2.337500
6 3.016667
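Two things I find handy here, shown as a rough sketch rather than the one right way to do it. First, transmute() keeps any existing column that you name explicitly. Second, the verbs from this post can be chained with the %>% pipe that dplyr makes available. The object names ratio_with_context and result are only illustrative.

```r
library(dplyr)

# Name the existing columns you want to keep next to the new one
ratio_with_context <- transmute(mtcars, mpg, cyl, new = mpg / cyl)
head(ratio_with_context)

# Chain the verbs: first filter the rows, then compute the new column
result <- mtcars %>%
  filter(mpg > 20, vs == 0) %>%
  transmute(mpg, cyl, new = mpg / cyl)
result
```

Reading the pipe from top to bottom follows the order in which the steps are applied, which is usually easier than reading nested calls from the inside out.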

My journey through the data science - by Karolina M'Goma
