Update values in data.table with values from another data.table

Question

I have a dataset with around 25 million rows. I am taking a subset of these rows and performing a function which works fine. However, what I then need to do is update the values in original dataset with new values while retaining the rest. I am sure this is straightforward but I just can't get my head around it.

This is a simplified version of what I am dealing with:

require("data.table")

df <-data.frame(AREA_CD = c(sample(1:25000000, 25000000, replace=FALSE)), ALLOCATED = 0, ASSIGNED = "A", ID_CD = c(1:25000000))
df$ID_CD <- interaction( "ID", df$ID_CD, sep = "")
dt <- as.data.table(df)

sub_dt <- dt[5:2004,]
sub_dt[,ALLOCATED:=ALLOCATED+1]
sub_dt[,ASSIGNED:="B"]

What I am after is the values in 'ALLOCATED' and 'ASSIGNED' from sub_dt to replace the 'ALLOCATED' and 'ASSIGNED' values in dt based on the 'ID_CD' column. The output I would be after, based on my example, would still have 25 million rows but have 2,000 updated rows. Any help would be much appreciated. Thanks.

Update values in data.table with values from another data.table

Answers (1)

Benchmark

Related Questions