Reputation: 10444
I have started using data.table. Indeed it is very fast and quite nice syntax. I am having trouble with dates. I like to use lubridate. In many of my data sets I have dates or dates and times and have used lubridate to manipulate them. Lubridate stores the instant as a POSIX class. I have seen answers here that create new variables for instance just to get the year eg. 2005. I do not like that. There are times that I will be analyzing by year and other times by quarter and other times by month and other times by durations. I would like to do something simple such as this
mydatatable[,length(medical.record.number),by=year(date.of.service)]
that should give me the number of patient encounters in a given year. The by function is not working.
Error in names(byval) = as.character(bysuborig) :
'names' attribute [2] must be the same length as the vector [1]
Can you please point me to vignettes where data.tables is used with dates and where manipulations and categorizations of those dates are done on the fly.
Upvotes: 2
Views: 1382
Reputation: 263362
This uses one of the examples in the help(IDateTime)
page. It shows that you canc hange to syntax for the by=
argument to a character value in the form " = " or (after @Matthew Dowle's comment below) you can try to use the functional form that you were using (although I have not been able to get it to work myself. I did get the preferred form: by=list(wday=wday(idate))
to work.) Note that the key creation assumes an IDateTime class since there is no idate
or itime
variable. Those are attributes of the class
datetime <- seq(as.POSIXct("2001-01-01"), as.POSIXct("2001-01-03"), by = "5 hour")
(af <- data.table(IDateTime(datetime), a = rep(1:2, 5), key = "a,idate,itime"))
af[, length(a), by = "wday = wday(idate)"]
wday V1
[1,] 2 4
[2,] 3 5
[3,] 4 1
Upvotes: 3