How to merge data.tables keeping a column pair mapping fixed?

Question

I have two data.table

>a <- data.table(code=c('FI', 'NO', 'SW'), name=c('Finland', 'Norway', 'Sweden'), category=c('A', 'B', 'C'), val_1=c(1,2,3))

>a
   code    name category val_1
1:   FI Finland        A     1
2:   NO  Norway        B     2
3:   SW  Sweden        C     3

> b <- data.table(code=c('FI', 'NO', 'FI'), category=c('A', 'B', 'C'), val_2=c(4,5,6))
> b
   code category val_2
1:   FI        A     4
2:   NO        B     5
3:   FI        C     6

If I merge these data.tables I get the expected output

> merge(a, b, all=T, by=c('code', 'category'))
   code category    name val_1 val_2
1:   FI        A Finland     1     4
2:   FI        C      NA    NA     6
3:   NO        B  Norway     2     5
4:   SW        C  Sweden     3    NA

However, the output I am looking for is

   code category    name val_1 val_2
1:   FI        A Finland     1     4
2:   FI        C Finland    NA     6
3:   NO        B  Norway     2     5
4:   SW        C  Sweden     3    NA

where the country name is taken from a. How can I do this?

Frank · Accepted Answer

I'd just spin off the code-to-name mapping and add it back in when needed.

codemap <- a[,name,keyby=code]
a[,name:=NULL]

m <- merge(a,b,all=TRUE,by=c('code','category'))
#    code category val_1 val_2
# 1:   FI        A     1     4
# 2:   FI        C    NA     6
# 3:   NO        B     2     5
# 4:   SW        C     3    NA

setkey(m,NULL)
codemap[m]
#    code    name category val_1 val_2
# 1:   FI Finland        A     1     4
# 2:   FI Finland        C    NA     6
# 3:   NO  Norway        B     2     5
# 4:   SW  Sweden        C     3    NA

How to merge data.tables keeping a column pair mapping fixed?

Answers (2)

Related Questions