Reputation: 2591
I think there is a bug in the unique() function of the data.table package (version 1.9.6).
Small example:
library(data.table)
test <- data.table(a = c("1", "1", "2", "2", "3", "4", "4", "4"),
                   b = letters[1:8],
                   d = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE))
a b d
1: 1 a TRUE
2: 1 b TRUE
3: 2 c FALSE
4: 2 d FALSE
5: 3 e TRUE
6: 4 f FALSE
7: 4 g FALSE
8: 4 h FALSE
test[d == TRUE, `:=` (b = "M")]
test <- unique(test, by = c("a", "b"))
a b d
1: 1 M TRUE
2: 2 c FALSE
3: 2 d FALSE
4: 3 M TRUE
5: 4 f FALSE
6: 4 g FALSE
7: 4 h FALSE
Up to this point everything is as expected. Now I want to select only the rows where column d is TRUE:
test[d == TRUE]
a b d
1: 1 M TRUE
But the result is wrong: the row with a = 3 and b = M also has d equal to TRUE, yet it is missing.
Upvotes: 4
Views: 103
Reputation: 16727
That bug has just been fixed in the development version of data.table. With it, the same example returns both rows:
library(data.table)
test <- data.table(a = c("1", "1", "2", "2", "3", "4", "4", "4"),
b = letters[1:8],
d = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE))
test[d == TRUE, `:=` (b = "M")]
test <- unique(test, by = c("a", "b"))
test[d == TRUE]
# a b d
#1: 1 M TRUE
#2: 3 M TRUE
The development version of data.table has already been published to the drat repository and can be installed with:
install.packages("data.table", repos="https://Rdatatable.github.io/data.table", type="source")
Thanks for reporting!
Upvotes: 5
Reputation: 680
This does not fix the bug, but as a workaround the subset works with plain data.frame syntax:
test[test$d, ]
or
test[test$d == TRUE, ]
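For completeness, here is a rough sketch of that workaround applied to the example from the question, with a small check that both rows come back. The name res and the stopifnot() check are only illustrative additions, and I have not re-run this against 1.9.6 itself, so treat it as a sketch rather than a verified fix.

```r
library(data.table)

# Rebuild the example from the question
test <- data.table(a = c("1", "1", "2", "2", "3", "4", "4", "4"),
                   b = letters[1:8],
                   d = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE))
test[d == TRUE, b := "M"]
test <- unique(test, by = c("a", "b"))

# Subset with a plain logical vector instead of the expression d == TRUE
# (res is a hypothetical name used only for this illustration)
res <- test[test$d, ]

# Both rows with d equal to TRUE (a = "1" and a = "3") should be present
stopifnot(nrow(res) == 2L, identical(res$a, c("1", "3")))
```

If the stopifnot() call passes silently, the subset contains both expected rows.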
Upvotes: 0