stackinator

Reputation: 5819

R Selecting Vector Elements Syntax Utilizing dplyr

The following code computes the row-wise standard deviation of the four numeric columns of the iris dataset.

library(dplyr)
iris %>% mutate(stDev = apply(.[(1:4)], 1, sd))

I cannot wrap my head around the column selection syntax in the code above. While learning R, I thought column selection worked like this:

library(dplyr)
iris[, 1:4]

What's with the '.' in the first code block? Are there any other basic examples of column selection using this type of syntax? I understand that the %>% pipe assumes the iris data set for all commands after the pipe. But why does the syntax change?

The column selection [(1:4)] in the first code block is missing the ',' and wraps 1:4 in parentheses () inside the square brackets [].

Upvotes: 2

Views: 427

Answers (1)

acylam

Reputation: 18661

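Both iris[1:4] and iris[,1:4] are valid ways to subset the columns of a data frame. The first treats the data frame as a list of columns; the second uses matrix-style [rows, columns] indexing with the row index left empty. For example: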

> iris[1:4] %>% head()
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1          5.1         3.5          1.4         0.2
2          4.9         3.0          1.4         0.2
3          4.7         3.2          1.3         0.2
4          4.6         3.1          1.5         0.2
5          5.0         3.6          1.4         0.2
6          5.4         3.9          1.7         0.4

> iris[,1:4] %>% head()
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1          5.1         3.5          1.4         0.2
2          4.9         3.0          1.4         0.2
3          4.7         3.2          1.3         0.2
4          4.6         3.1          1.5         0.2
5          5.0         3.6          1.4         0.2
6          5.4         3.9          1.7         0.4
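
As a small sketch of how the two forms relate (base R only; the variable names are made up for illustration), the check below compares them directly and shows the one case where they behave differently, which is selecting a single column:

```r
# Base R only; iris ships with R's built-in datasets package.
cols_list_style   <- iris[1:4]     # list-style: the data frame is a list of columns
cols_matrix_style <- iris[, 1:4]   # matrix-style: all rows, columns 1 to 4
cols_parens       <- iris[(1:4)]   # extra parentheses around 1:4 change nothing

# Both comparisons should report that the results are equal
all.equal(cols_list_style, cols_matrix_style)
all.equal(cols_list_style, cols_parens)

# The two forms differ once only one column is selected
class(iris[1])                   # "data.frame"
class(iris[, 1])                 # "numeric", because drop = TRUE is the default
class(iris[, 1, drop = FALSE])   # "data.frame"
```

So for several columns the comma makes no practical difference, but with a single column the matrix-style form drops down to a plain vector unless drop = FALSE is given.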

The parentheses around 1:4 don't do anything in this case. The . in apply(.[(1:4)], 1, sd) is the pipe's placeholder: it says "put whatever comes out of the left-hand side of the pipe in this location". So in this case, iris is passed both as the first argument of mutate (the default) and to .[(1:4)], which evaluates to iris[(1:4)]. The following two calls give the same output:

> iris %>% mutate(stDev = apply(.[1:4], 1, sd)) %>% head
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species    stDev
1          5.1         3.5          1.4         0.2  setosa 2.179449
2          4.9         3.0          1.4         0.2  setosa 2.036950
3          4.7         3.2          1.3         0.2  setosa 1.997498
4          4.6         3.1          1.5         0.2  setosa 1.912241
5          5.0         3.6          1.4         0.2  setosa 2.156386
6          5.4         3.9          1.7         0.4  setosa 2.230844

> iris %>% mutate(stDev = apply(iris[1:4], 1, sd)) %>% head
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species    stDev
1          5.1         3.5          1.4         0.2  setosa 2.179449
2          4.9         3.0          1.4         0.2  setosa 2.036950
3          4.7         3.2          1.3         0.2  setosa 1.997498
4          4.6         3.1          1.5         0.2  setosa 1.912241
5          5.0         3.6          1.4         0.2  setosa 2.156386
6          5.4         3.9          1.7         0.4  setosa 2.230844
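
To answer the request for other basic examples, here is a short sketch of how the placeholder behaves, assuming the magrittr pipe that dplyr re-exports. The variable names are only illustrative, and select() is dplyr's own way of picking columns rather than anything the original code uses:

```r
library(dplyr)  # re-exports the magrittr pipe %>%

# Nested dot: iris is still inserted as the first argument of mutate(),
# and the . inside apply() also refers to iris.
via_dot <- iris %>% mutate(stDev = apply(.[1:4], 1, sd))

# The same call written without the pipe.
no_pipe <- mutate(iris, stDev = apply(iris[1:4], 1, sd))
all.equal(via_dot, no_pipe)

# Top-level dot: when . is itself a direct argument, the left-hand side
# is placed only there and is not also inserted as the first argument.
one_to_four <- 4 %>% seq(1, .)   # same as seq(1, 4)

# The dot can also be subset directly on the right-hand side.
first_four <- iris %>% .[1:4]

# dplyr-native column selection, by position or by name range.
by_position <- iris %>% select(1:4)
by_name     <- iris %>% select(Sepal.Length:Petal.Width)
```

All three of first_four, by_position and by_name hold the same four numeric columns; select() is usually the more readable choice inside a dplyr pipeline.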

Upvotes: 2
