Julian
Julian

Reputation: 791

Issue with filtering data.table for dates - works with fixed date but not with variable

I want to filter a data.table for dates - given start- and end-dates, I want to have all the rows which include a given date.

Now my problem: It works when I use a fixed date, but it returns an empty table when I want to store the date in an external variable. Any hints?

dt = data.table(begin=as.Date('2014-01-01'):as.Date('2014-01-10'),
                end=as.Date('2014-01-01'):as.Date('2014-01-10')+c(1,1,1,2:8),
                x=c('A','B','C','D','E','J','J','J','J','J'))
dt[,`:=`(begin_idate=as.IDate(begin, origin='1970-1-1'),
         end_idate=as.IDate(end, origin= '1970-1-1'))]
dt[as.Date('2014-01-09')>begin_idate  & as.Date('2014-01-09')<=end_idate ] # works
x=as.Date('2014-01-09')
dt[x>begin_idate  & x<=end_idate ] #doesnt' work - empty data.table

Upvotes: 4

Views: 110

Answers (1)

Rentrop
Rentrop

Reputation: 21507

This is due to the fact, that one of the columns of your data.table is named x. As data.table evaluates its first argument (the one you are using) it looks for x as column reference first. So in fact x>begin_idate is interpreted as

dt$x > dt$begin_idate 

Which throws an error.

To fix this use a name that is not already a column name of dt. For example

xx = as.Date('2014-01-09')
dt[xx>begin_idate  & xx<=end_idate ]

Result:

> dt[xx>begin_idate  & xx<=end_idate ]
   begin   end x begin_idate  end_idate
1: 16076 16080 J  2014-01-06 2014-01-10
2: 16077 16082 J  2014-01-07 2014-01-12
3: 16078 16084 J  2014-01-08 2014-01-14

In fact the results are equal:

all.equal(dt[as.Date('2014-01-09')>begin_idate  & as.Date('2014-01-09')<=end_idate ],
          dt[xx>begin_idate  & xx<=end_idate ]) # TRUE

Upvotes: 7

Related Questions