Reputation: 22293
I have several data.tables that I would like to rbindlist. The tables contain factors with (possibly unused) levels. In that case rbindlist(...) behaves differently from do.call(rbind, ...):
dt1 <- data.table(x=factor(c("a", "b"), levels=letters))
rbindlist(list(dt1, dt1))[,x]
## [1] a b a b
## Levels: a b
do.call(rbind, list(dt1, dt1))[,x]
## [1] a b a b
## Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
If I want to keep the levels, do I have to resort to rbind, or is there a data.table way?
Upvotes: 6
Reputation: 49448
Thanks for pointing out this problem. As of version 1.8.11 it has been fixed:
dt1 <- data.table(x=factor(c("a", "b"), levels=letters))
rbindlist(list(dt1, dt1))[,x]
#[1] a b a b
#Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
Upvotes: 2
Reputation: 121568
I guess rbindlist is faster because it skips the checks that do.call(rbind.data.frame, ...) performs. Why not set the levels after binding?
Dt <- rbindlist(list(dt1, dt1))
setattr(Dt$x,"levels",letters) ## set attribute without a copy
From ?setattr:
setattr() is useful in many situations to set attributes by reference and can be used on any object or part of an object, not just data.tables.
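One caveat worth noting: setattr() replaces the level labels without touching the underlying integer codes, so it is only safe when the existing levels are a prefix of the new ones (as here, where a and b are the first two letters). A minimal sketch of a more general alternative, assuming the full level set is known up front (all_levels is a hypothetical name, not something from the question), is to rebuild the factor by label instead:

```r
library(data.table)

# Hypothetical full level set; here it is the same one used to build dt1.
all_levels <- letters

dt1 <- data.table(x = factor(c("a", "b"), levels = all_levels))
Dt  <- rbindlist(list(dt1, dt1))

# Re-level by label rather than by integer code. factor() matches the
# existing labels against all_levels, so the values stay correct even if
# the current levels are not a prefix of all_levels.
Dt[, x := factor(x, levels = all_levels)]

levels(Dt$x)
```

This allocates a new column, so it gives up the no-copy benefit of setattr() in exchange for not depending on the order of the existing levels.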
Upvotes: 4