Reputation: 29968
I noticed that I sometimes get errors in my R scripts when I forget to check whether the dataframe I'm working on is actually empty (has zero rows).
For example, when I used apply like this
apply(X=DF,MARGIN=1,FUN=function(row) !any(vec[ row[["start"]]:row[["end"]] ]))
and DF happened to be empty, I got an error about the subscripts.
Why is that? Aren't empty dataframes valid? Why does apply() with MARGIN=1 even try to do anything when there are no rows in the dataframe? Do I really need to add a condition before each such apply to make sure the dataframe isn't empty?
Thank you!
Upvotes: 5
Views: 12703
Reputation: 108523
On a side note: apply always calls the function you pass at least once, even when there are no rows. If the input is a dataframe without any rows but with defined variables, it passes FALSE as the argument to the function. If the dataframe is completely empty (no rows and no columns), it passes logical(0) to the function.
> x <- data.frame(a=numeric(0))
> str(x)
'data.frame': 0 obs. of 1 variable:
$ a: num
> y <- apply(x,MARGIN=1,FUN=function(x){print(x)})
[1] FALSE
> x <- data.frame()
> str(x)
'data.frame': 0 obs. of 0 variables
> y <- apply(x,MARGIN=1,FUN=function(x){print(x)})
logical(0)
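To connect this to the subscript error from the question, here is a minimal sketch that records what apply hands to FUN for a zero-row dataframe with two columns. The names `DF`, `seen` and `res` are made up for illustration, and the behaviour is what current R versions appear to do rather than a documented guarantee.

```r
# Hypothetical zero-row dataframe with the column names from the question
DF <- data.frame(start = integer(0), end = integer(0))

# Record the argument that apply passes to FUN
seen <- NULL
res <- apply(DF, MARGIN = 1, FUN = function(row) {
  seen <<- row   # keep a copy for inspection
  TRUE           # dummy return value
})

str(seen)     # the placeholder vector FUN received
names(seen)   # check whether the column names were carried over
length(res)   # length of the result for the empty input
```

If the placeholder carries no names, a lookup such as row[["start"]] has nothing to match, which would explain an error about subscripts.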
So, as Joshua already told you, either check before the apply call whether the dataframe has rows, or add a condition inside the function you pass to apply.
EDIT: This means that length(x)==0 on its own is not a very good check. If both possibilities could arise, you need to check whether either length(x)==0 or all(!x) is TRUE (code adapted from Joshua):
apply(X=data.frame(),MARGIN=1, # empty data.frame
FUN=function(row) {
# all() keeps the condition length-one when there are several columns
if(length(row)==0 || all(!row)) {return()}
!any(vec[ row[["start"]]:row[["end"]] ])
})
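The other option mentioned above, checking before the apply call, could look roughly like this. It is only a sketch: the helper name `rows_all_false` and the example data are hypothetical, and returning logical(0) for an empty input is a design choice, not something the original code prescribes.

```r
# Hypothetical helper: returns logical(0) for an empty dataframe,
# otherwise runs the question's check row by row.
rows_all_false <- function(DF, vec) {
  if (nrow(DF) == 0) {
    return(logical(0))   # nothing to check, so skip apply entirely
  }
  apply(DF, MARGIN = 1, FUN = function(row) {
    !any(vec[row[["start"]]:row[["end"]]])
  })
}

vec   <- c(TRUE, FALSE, FALSE, TRUE, FALSE)   # made-up example data
empty <- data.frame(start = integer(0), end = integer(0))
full  <- data.frame(start = c(1, 2), end = c(2, 3))

rows_all_false(empty, vec)   # empty input: apply is never reached
rows_all_false(full, vec)    # row 1 covers a TRUE, row 2 covers only FALSE
```

Returning a zero-length logical vector instead of NULL keeps the result type the same for empty and non-empty inputs, which tends to make downstream code simpler.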
Upvotes: 3
Reputation: 176638
This has absolutely nothing to do with apply. The function you are applying does not work when the data.frame is empty.
> myFUN <- function(row) !any(vec[ row[["start"]]:row[["end"]] ])
> myFUN(DF[1,]) # non-empty data.frame
[1] FALSE
> myFUN(data.frame()[1,]) # empty data.frame
Error in row[["start"]]:row[["end"]] : argument of length 0
Add a condition to your function.
> apply(X=data.frame(),MARGIN=1, # empty data.frame
+ FUN=function(row) {
+ if(length(row)==0) return()
+ !any(vec[ row[["start"]]:row[["end"]] ])
+ })
NULL
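An alternative that avoids the extra condition, not taken from this answer, is to iterate over row indices instead of rows. The sketch below assumes the same `start` and `end` columns as the question and uses made-up data; since seq_len(0) is empty, the function is never called for a zero-row data.frame.

```r
vec <- c(TRUE, FALSE, FALSE, TRUE, FALSE)   # made-up example data
DF  <- data.frame(start = integer(0), end = integer(0))

# seq_len(nrow(DF)) has length 0 here, so the function body never runs,
# and FUN.VALUE tells vapply which type to return anyway.
res <- vapply(seq_len(nrow(DF)), function(i) {
  !any(vec[DF$start[i]:DF$end[i]])
}, FUN.VALUE = logical(1))

res
```

With vapply the result is a logical vector in both the empty and the non-empty case, so no special handling of NULL is needed afterwards.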
Upvotes: 3
Reputation: 996
I would use mapply instead:
kk <- data.frame( start = integer(0), end = integer(0) )
kkk <- data.frame( start = 1, end = 3 )
vect <- rnorm( 100 ) > 0
# index vect by the range x:y, as in the question's vec[start:end]
with(kk, mapply( function(x, y) !any( vect[x:y] ), start, end ) )
with(kkk, mapply( function(x, y) !any( vect[x:y] ), start, end ) )
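One thing to watch with this approach: for zero-length inputs mapply has nothing to simplify, so the result may not be a logical vector. A minimal sketch, with made-up data and a hypothetical `res` variable, that coerces the result so both cases look the same:

```r
vect <- c(TRUE, FALSE, FALSE, TRUE, FALSE)   # made-up example data
kk   <- data.frame(start = integer(0), end = integer(0))

res <- with(kk, mapply(function(x, y) !any(vect[x:y]), start, end))

class(res)               # inspect what came back for the empty input
res <- as.logical(res)   # coerce so later code always sees a logical vector
length(res)
```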
Upvotes: 1
Reputation: 50704
I don't think it's related to the data.frame having 0 rows:
X <- data.frame(a=numeric(0))
str(X)
# 'data.frame': 0 obs. of 1 variable:
# $ a: num
apply(X,1,sum)
# integer(0)
Try using traceback() after the error to see what exactly causes it.
Upvotes: 1