Fabio Correa
Fabio Correa

Reputation: 1363

How to merge two data.tables with complementary column data in one go?

I have two data.tables, columns v2 of each one are complementary:

set.seed(1234)
v1 <- sample(1:20, 5)
v2a <- c(1:2,NA,NA,NA)
v2b <- c(NA,NA,3:5)
id <- c(letters[1:5])

library(data.table)
dt1 <- data.table(id = id, v1=v1,v2=v2a)
dt2 <- data.table(id = id, v2=v2b)

dt1
   id v1 v2
1:  a 16  1
2:  b  5  2
3:  c 12 NA
4:  d 15 NA
5:  e  9 NA

dt2
   id v2
1:  a NA
2:  b NA
3:  c  3
4:  d  4
5:  e  5

The goal is to merge the two data.tables and have column v2 with the proper values without NA. I got it correctly done either by:

dt <- rbindlist(list(dt1,dt2), use.names = T, fill = T)
dt <- dt[,v2:= sum(v2, na.rm = T), by = id]
dt <- dt[!is.na(v1)]

or:

dt <- merge(dt1, dt2, by = "id", all = T)
dt[, v2:=sum(v2.x, v2.y, na.rm = T), by = id][, v2.x := NULL][,v2.y := NULL]

both giving the correct desired result:

dt
   id v1 v2
1:  a 16  1
2:  b  5  2
3:  c 12  3
4:  d 15  4
5:  e  9  5

Is there an easier/one go way to do it?

Upvotes: 1

Views: 153

Answers (2)

Uwe
Uwe

Reputation: 42544

There is another, less convoluted approach which uses the fcoalesce() function which was introduced with data.table v1.12.4 (on CRAN 03 Oct 2019):

dt1[dt2, on = .(id), v2 := fcoalesce(x.v2, i.v2)][]
   id v1 v2
1:  a 16  1
2:  b  5  2
3:  c 12  3
4:  d 15  4
5:  e  9  5
dt1[dt2, on = .(id), v2 := fcoalesce(v2, i.v2)][]

works as well because

dt1[dt2, on = .(id)]

returns

   id v1 v2 i.v2
1:  a 16  1   NA
2:  b  5  2   NA
3:  c 12 NA    3
4:  d 15 NA    4
5:  e  9 NA    5

Upvotes: 2

Wimpel
Wimpel

Reputation: 27732

The code below updates the values of dt1$v2 where is.na(dt1$v2) == TRUE with the values of dt$v2, based on id.

dt1[is.na(v2), v2 := dt2[ dt1[is.na(v2),], v2, on = .(id)] ][]

   id v1 v2
1:  a 16  1
2:  b  5  2
3:  c 12  3
4:  d 15  4
5:  e  9  5

Upvotes: 2

Related Questions