Reputation: 335
I ran into some strange behaviour in R with the sapply() function. This function is supposed to return a vector, but in the special case where you give it an empty vector, it returns a list.
Correct behaviour with a vector:
a = c("A", "B", "C")
a[a == "B"] # Returns "B"
a[sapply(a, function(x) {x == "B"})] # Returns "B"
Correct behaviour with a NULL value:
a = NULL
a[a == "B"] # Returns NULL
a[sapply(a, function(x) {x == "B"})] # Returns NULL
Strange behaviour with an empty vector:
a = vector()
a[a == "B"] # Returns logical(0)
a[sapply(a, function(x) {x == "B"})] # Error: invalid subscript type 'list'
Same error message as with this statement:
a[list()] # Error in a[list()] : invalid subscript type 'list'
Why? Is it a bug?
Because of this behaviour, I now use unlist(lapply()) instead.
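A minimal sketch of that workaround, assuming the same predicate as in the examples above. The helper name `pick_b` is hypothetical and used only for illustration. `unlist()` on an empty list gives NULL, which `[` accepts as an empty index, so zero-length input no longer raises an error:

```r
# Hypothetical helper: keep the elements of `a` equal to "B",
# using unlist(lapply()) instead of sapply().
pick_b <- function(a) {
  idx <- unlist(lapply(a, function(x) x == "B"))  # NULL when `a` is empty
  a[idx]
}

pick_b(c("A", "B", "C"))  # "B"
pick_b(NULL)              # NULL
pick_b(vector())          # logical(0), no error
```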
Upvotes: 20
Views: 18657
Reputation: 40813
The real reason for this is that sapply doesn't know what your function will return without calling it. In your case the function returns a logical, but since sapply is given an empty vector, the function is never called. Therefore, it has to come up with a type and it defaults to list.
For this very reason (and for performance), vapply was introduced! It requires you to specify the return value type (and length). This allows it to do the right thing. As a bonus, it is also faster!
sapply(LETTERS[1:3], function(x) {x == "B"}) # F, T, F
sapply(LETTERS[0], function(x) {x == "B"}) # list()
vapply(LETTERS[1:3], function(x) {x == "B"}, logical(1)) # F, T, F
vapply(LETTERS[0], function(x) {x == "B"}, logical(1)) # logical()
See ?vapply for more info.
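Applied to the subsetting from the question, this might look like the sketch below. The helper name `keep_b` is hypothetical. Because `logical(1)` fixes the return type, the index is a logical vector even when the function is never called:

```r
# Hypothetical helper: subset `a` with a vapply()-built logical index.
keep_b <- function(a) {
  idx <- vapply(a, function(x) x == "B", logical(1))  # logical, even for empty `a`
  a[idx]
}

keep_b(c("A", "B", "C"))  # "B"
keep_b(NULL)              # NULL
keep_b(vector())          # logical(0), no error
```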
Upvotes: 22
Reputation: 174778
The help for the function (?sapply) has this in the Value section:
For ‘sapply(simplify = TRUE)’ and ‘replicate(simplify = TRUE)’: if
‘X’ has length zero or ‘n = 0’, an empty list.
In both your cases:
> length(NULL)
[1] 0
> length(vector())
[1] 0
Hence sapply() returns:
> sapply(vector(), function(x) {x == "B"})
list()
> sapply(NULL, function(x) {x == "B"})
list()
Your error is not from sapply() but from [, as this shows:
> a[list()]
Error in a[list()] : invalid subscript type 'list'
So the issue is related to how subsetting of NULL and of an empty vector (vector()) is performed. It has nothing to do with sapply() at all: in both cases it returns consistent output, an empty list.
Upvotes: 7
Reputation: 18323
Actually, they both return a list. The only difference between the two is that when you try to index NULL, it always returns NULL (even if your index is a list), but when you try to index an empty vector, R checks the index and finds that it is a list, which is not a valid subscript type.
a = NULL
res = sapply(a, function(x) x == "B") # Res is an empty list
a[res] # returns NULL, because any index of NULL is NULL.
a = vector()
res = sapply(a, function(x) x == "B") # Still an empty list.
a[res] # but you can't index a vector with a list!
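The same comparison can be written as checks rather than comments. This is only a sketch: the names `f`, `res_null`, `res_empty` and `err` are made up for illustration, and `tryCatch()` is used so the error is captured instead of stopping the script:

```r
f <- function(x) x == "B"

# sapply() gives an empty list for both zero-length inputs.
res_null  <- sapply(NULL, f)
res_empty <- sapply(vector(), f)
is.list(res_null) && is.list(res_empty)  # TRUE

# Indexing NULL returns NULL regardless of the index type.
is.null(NULL[res_null])  # TRUE

# Indexing an empty vector validates the index and signals an error.
err <- tryCatch(vector()[res_empty], error = function(e) e)
inherits(err, "error")  # TRUE
```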
Upvotes: 2