Sinnombre
Sinnombre

Reputation: 394

How to [join on] [where] in r data.tables

Is there a way to combine the join on and subset functionality in data.table? Say I have the following table:

dt = data.table(itemID = c(1,1,2,2),bucketID = c(1,2,2,3),value = 1:4)

I want to set the value to zero for the lowest bucket of each item. My thought was to run:

ends = dt[,.(min = min(bucketID)),itemID]
dt[ends,on="itemID",bucketID==min,value:=0]

i.e. join the tables, find where the two rows are identical, then update the value column. But this doesn't work. I can get the right results with:

ends = dt[,.(min = min(bucketID)),itemID]
dt = dt[ends,on="itemID"][bucketID==min,value:=0][,c(-4)]

However, this seems a bit circuitous. Is there a better way to combine join and where?

Upvotes: 4

Views: 63

Answers (1)

SymbolixAU
SymbolixAU

Reputation: 26248

Expanding on your join approach, you can join by both the itemID and the min values

dt[
    ends
    , on = c("itemID", bucketID = "min")
    , value := 0
]

dt
#    itemID bucketID value
# 1:      1        1     0
# 2:      1        2     2
# 3:      2        2     0
# 4:      2        3     4

Upvotes: 5

Related Questions