Snowflake
Snowflake

Reputation: 3081

How to transform a data.table into a list of columns that change per ID

I was wondering whether someone could help me with the following. I am unsure how to do this in data.table, but I believe that is possible.

Suppose we have the following data.table:

test <- data.table(ID = c(1,1,1,2,2,2,2,2,3,3,3), A = rep(1,11), B = c(1,2,3,1,1,1,1,1,1,1,1), C = c(1,2,1,2,2,3,3,1,1,1,1))

We can clearly see that given the ID = 1, the column B and C changes, given ID = 2, the column C only changes and given ID = 3, none of the column changes. I am unsure how to do this without writing a for loop, which is not feasible given the data size. The output should be:

ID | ChangedColumn
1  | B
1  | C
2  | C

Upvotes: 0

Views: 46

Answers (2)

talat
talat

Reputation: 70286

You can convert to long format and then compute the changes:

melt(test, id.vars = "ID")[, uniqueN(value) > 1, by = .(ID, variable)][(V1), !"V1"]

   ID variable
1:  1        B
2:  1        C
3:  2        C

If you're still using an older data.table version that doesn't support column negation using !, you can use this instead:

melt(test, id.vars = "ID")[, uniqueN(value) > 1, by = .(ID, variable)][(V1), 
                            .(ID, variable)]

Upvotes: 3

minem
minem

Reputation: 3650

or you can get the results like matrix with:

test[, as.list(lapply(.SD, uniqueN) != 1L), by = ID]
#    ID     A     B     C
# 1:  1 FALSE  TRUE  TRUE
# 2:  2 FALSE FALSE  TRUE
# 3:  3 FALSE FALSE FALSE

Upvotes: 0

Related Questions