Reputation: 3081
I was wondering whether someone could help me with the following. I am unsure how to do this in data.table, but I believe that is possible.
Suppose we have the following data.table:
test <- data.table(ID = c(1,1,1,2,2,2,2,2,3,3,3), A = rep(1,11), B = c(1,2,3,1,1,1,1,1,1,1,1), C = c(1,2,1,2,2,3,3,1,1,1,1))
We can clearly see that given the ID = 1, the column B and C changes, given ID = 2, the column C only changes and given ID = 3, none of the column changes. I am unsure how to do this without writing a for loop, which is not feasible given the data size. The output should be:
ID | ChangedColumn
1 | B
1 | C
2 | C
Upvotes: 0
Views: 46
Reputation: 70286
You can convert to long format and then compute the changes:
melt(test, id.vars = "ID")[, uniqueN(value) > 1, by = .(ID, variable)][(V1), !"V1"]
ID variable
1: 1 B
2: 1 C
3: 2 C
If you're still using an older data.table version that doesn't support column negation using !
, you can use this instead:
melt(test, id.vars = "ID")[, uniqueN(value) > 1, by = .(ID, variable)][(V1),
.(ID, variable)]
Upvotes: 3
Reputation: 3650
or you can get the results like matrix with:
test[, as.list(lapply(.SD, uniqueN) != 1L), by = ID]
# ID A B C
# 1: 1 FALSE TRUE TRUE
# 2: 2 FALSE FALSE TRUE
# 3: 3 FALSE FALSE FALSE
Upvotes: 0