Reputation: 73
I am new to the data.table package and would like to replicate with data.table what I normally do with a data.frame, shown below.
Dta <- data.frame(Customer = c("Javier","Oscar","Ivan","Peter"),Type_of_Customer=LETTERS[c(1,1:3)])
Dtb <- data.frame(Customer = c("Javier","Oscar","Ivan","Jack"),Zone=5:8,District=100:103)
# Look up each Dta customer in Dtb (rows with no match come back as NA)
Result <- cbind(Dtb[match(Dta[,"Customer"],Dtb[,"Customer"]),c("Zone","District")],Dta)
ww <- which(is.na(Result[,"Zone"]))
if(length(ww) > 0){
  Result[ww,"Zone"] <- "Not in Dtb"
}
ww <- which(is.na(Result[,"District"]))
if(length(ww) > 0){
  Result[ww,"District"] <- "Not in Dtb"
}
So if I had Dta and Dtb as data.table structures, what would be the way to go?
(Note: the real data has around 10 million rows, so I need the most time-efficient solution.)
library(data.table)
Dta <- data.table(Custumer = c("Javier","Oscar","Ivan","Peter"),Type_of_Customer=LETTERS[c(1,1:3)])
Dtb <- data.table(Custumer = c("Javier","Oscar","Ivan","Jack"),Zone=5:8,District=100:103)
Thanks.
Upvotes: 3
Views: 172
Reputation: 887168
We can use a join on 'Custumer' and replace the NA elements with the 'Not in Dtb' string. The 'Zone' and 'District' columns are integer, so they are converted to character first.
Dtb[Dta, on = .(Custumer)][
  , c("Zone", "District") := .(as.character(Zone), as.character(District))
][is.na(Zone), c("Zone", "District") := "Not in Dtb"][]
#   Custumer       Zone   District Type_of_Customer
#1:   Javier          5        100                A
#2:    Oscar          6        101                A
#3:     Ivan          7        102                B
#4:    Peter Not in Dtb Not in Dtb                C
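Since the real data has around 10 million rows, an update join may also be worth trying: it adds the columns to the left table by reference instead of building a new joined table. This is only a sketch of that alternative, not a benchmarked recommendation. The name `Result` and the use of `copy()` are my own choices for illustration; drop `copy()` if modifying `Dta` in place is acceptable.

```r
library(data.table)

Dta <- data.table(Custumer = c("Javier", "Oscar", "Ivan", "Peter"),
                  Type_of_Customer = LETTERS[c(1, 1:3)])
Dtb <- data.table(Custumer = c("Javier", "Oscar", "Ivan", "Jack"),
                  Zone = 5:8, District = 100:103)

# Work on a copy so the original Dta stays unchanged (assumption: you want that)
Result <- copy(Dta)

# Update join: for every Result row with a match in Dtb, pull in Zone and
# District (as character). Rows without a match are left as NA.
Result[Dtb, on = .(Custumer),
       c("Zone", "District") := .(as.character(i.Zone), as.character(i.District))]

# Fill the rows that had no match in Dtb
Result[is.na(Zone), c("Zone", "District") := "Not in Dtb"]

print(Result)
```

Here the new columns are appended after `Type_of_Customer`, so the column order differs from the join above; `setcolorder()` can rearrange them if that matters.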
Upvotes: 2