Bix

Reputation: 335

Why does sapply() return a list?

I came across a strange behaviour in R with the sapply() function. This function is supposed to return a vector, but in the special case where you give it an empty vector, it returns a list.

Correct behaviour with a vector:

a = c("A", "B", "C")
a[a == "B"]  # Returns "B"
a[sapply(a, function(x) {x == "B"})] # Returns "B"

Correct behaviour with a NULL value:

a = NULL
a[a == "B"]  # Returns NULL
a[sapply(a, function(x) {x == "B"})] # Returns NULL

Strange behaviour with an empty vector:

a = vector()
a[a == "B"]  # Returns logical(0)
a[sapply(a, function(x) {x == "B"})] # Error: invalid subscript type 'list'

Same error message as with this statement:

a[list()] # Error in a[list()] : invalid subscript type 'list'

Why? Is it a bug?

Due to this strange behaviour, I use unlist(lapply()).
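A minimal sketch of that workaround, assuming the same empty-vector input as above (the names idx and res are only illustrative): unlist() collapses the empty list from lapply() to NULL, and `[` accepts NULL as a subscript.

```r
a <- vector()                                   # an empty logical vector
idx <- unlist(lapply(a, function(x) x == "B"))  # empty list collapses to NULL
res <- a[idx]                                   # zero-length result, no error
```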

Upvotes: 20

Views: 18657

Answers (3)

Tommy

Reputation: 40813

The real reason for this is that sapply doesn't know what your function will return without calling it. In your case the function returns a logical, but since sapply is given an empty vector, the function is never called. It therefore has to come up with a return type on its own, and it defaults to list.

For this very reason (and for performance), vapply was introduced! It requires you to specify the type and length of the value your function returns, which lets it do the right thing even when the input is empty. As a bonus, it is often faster!

sapply(LETTERS[1:3], function(x) {x == "B"}) # F, T, F
sapply(LETTERS[0], function(x) {x == "B"})   # list()

vapply(LETTERS[1:3], function(x) {x == "B"}, logical(1)) # F, T, F
vapply(LETTERS[0], function(x) {x == "B"}, logical(1))   # logical()

See ?vapply for more info.
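Applied to the indexing from the question, this might look like the sketch below. The helper name is_b is made up for illustration; the point is only that vapply() hands `[` a logical vector in both the empty and the non-empty case.

```r
is_b <- function(x) x == "B"   # hypothetical helper, not from the question

a <- vector()                           # empty input
idx <- vapply(a, is_b, logical(1))      # zero-length logical, not a list
res <- a[idx]                           # indexing works, result is empty

b <- c("A", "B", "C")                   # non-empty input behaves as before
res_b <- b[vapply(b, is_b, logical(1))]
```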

Upvotes: 22

Gavin Simpson

Reputation: 174778

The help page ?sapply has this in its Value section:

For ‘sapply(simplify = TRUE)’ and ‘replicate(simplify = TRUE)’: if
‘X’ has length zero or ‘n = 0’, an empty list.

In both your cases:

> length(NULL)
[1] 0
> length(vector())
[1] 0

Hence sapply() returns:

> sapply(vector(), function(x) {x == "B"})
list()
> sapply(NULL, function(x) {x == "B"})
list()

Your error is not from sapply() but from [ as this shows:

> a[list()]
Error in a[list()] : invalid subscript type 'list'

So the issue is related to how subsetting of NULL and of an empty vector (vector()) is performed. It has nothing to do with sapply() itself: in both cases it returns consistent output, an empty list.
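One way to check this yourself is sketched below. The variable names are illustrative, and the error text is captured rather than hard-coded because it depends on the session's locale.

```r
f <- function(x) x == "B"

idx_vec  <- sapply(vector(), f)   # empty list
idx_null <- sapply(NULL, f)       # also an empty list

a <- vector()
# Subsetting an empty vector with a list fails; keep the message instead of stopping.
msg <- tryCatch(a[idx_vec], error = function(e) conditionMessage(e))

# Subsetting NULL with the same kind of index quietly gives NULL.
res_null <- NULL[idx_null]
```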

Upvotes: 7

nograpes

Reputation: 18323

Actually, they both return a list. The only difference between the two is that when you try to index NULL it always returns NULL (even if your index is a list), but when you try to index an empty vector, R checks the index and rejects it because it is a list.

a = NULL
res = sapply(a, function(x) x == "B") # Res is an empty list
a[res] # returns NULL, because any index of NULL is NULL.


a = vector()
res = sapply(a, function(x) x == "B") # Still an empty list.
a[res] # but you can't index a vector with a list!

Upvotes: 2
