Reputation: 415
I want to update the values of a table based on the value found in a specific column of another table. i.e.
tb1<-data.table(w=c('xray','yankee','zulu','alpha','bravo','charlie',
'xray','yankee','zulu','alpha','bravo','charlie'),
x=c('alpha','bravo','charlie','xray','yankee','zulu',
'xray','yankee','zulu','alpha','bravo','charlie'),
y=c('zulu','yankee','xray','charlie','bravo','alpha'))
tb2<-data.table(z= c('alpha','bravo','charlie'),
z2=c('zalpha','zbravo','zcharlie'))
In column "w"of tb1, values that do not appear in col "z" of tb2 should be replaced by "NONE". This code actually does the work for this short table.
tb1[,"w":=lapply(tb1$w,function(u){ifelse((u %in% tb2$z),u,"NONE")})]
However, the actual table has more than two million records and the above instruction is very slow (>20 mins before stopping it). Eventually, I need to substitute all values in columns "w',"x" and "y" of tb1 that are not in tb2$z. What would be the correct "data.table" approach to accomplish this? I have tried subsetting it assuming that it would be more efficient to group the values before doing a replacement. None of the combinations I have used with .SD give me the correct answer. Thanks for any help.
Upvotes: 1
Views: 1152
Reputation: 887153
We can use %chin%
to create the logical index of elements in 'w' that are not found in 'z', then assign (:=
) the corresponding elements in 'w' to "NONE".
tb1[!w %chin% tb2$z, w:= "NONE"]
tb1
# w x y
# 1: NONE alpha zulu
# 2: NONE bravo yankee
# 3: NONE charlie xray
# 4: alpha xray charlie
# 5: bravo yankee bravo
# 6: charlie zulu alpha
# 7: NONE xray zulu
# 8: NONE yankee yankee
# 9: NONE zulu xray
#10: alpha alpha charlie
#11: bravo bravo bravo
#12: charlie charlie alpha
If we need to change all the columns in 'tb1' based on the 'z' column of 'tb2', we can loop through the columns of 'tb1' and assign as showed earlier.
nm1 <- names(tb1)
for(j in seq_along(tb1)){
tb1[!tb1[[j]] %chin% tb2$z, nm1[j] := "NONE"]
}
tb1
# w x y
# 1: NONE alpha NONE
# 2: NONE bravo NONE
# 3: NONE charlie NONE
# 4: alpha NONE charlie
# 5: bravo NONE bravo
# 6: charlie NONE alpha
# 7: NONE NONE NONE
# 8: NONE NONE NONE
# 9: NONE NONE NONE
#10: alpha alpha charlie
#11: bravo bravo bravo
#12: charlie charlie alpha
Upvotes: 1