replace row values based on another row value in a data.table

Question

I have a trivial question, though I am struggling to find a simple answer. I have a data table that looks something like this:

dt <- data.table(id= c(A,A,A,A,B,B,B,C,C,C), time=c(1,2,3,1,2,3,1,2,3), score = c(10,15,13,25,NA,NA,18,29,19))
dt
#     id  time  score
# 1:   A    1    NA
# 2:   A    2    10
# 3:   A    3    15
# 4:   A    4    13
# 5:   B    1    NA
# 6:   B    2    25
# 7:   B    3    NA
# 8:   B    4    NA
# 9:   C    1    18
# 10:  C    2    29
# 11:  C    3    NA
# 12:  C    4    19

I would like to replace the missing values of my group "B" with the values of "A".

The final dataset should look something like this

final
#     id  time  score
# 1:   A    1    NA
# 2:   A    2    10
# 3:   A    3    15
# 4:   A    4    13
# 5:   B    1    NA
# 6:   B    2    25
# 7:   B    3    15
# 8:   B    4    13
# 9:   C    1    18
# 10:  C    2    29
# 11:  C    3    NA
# 12:  C    4    19

In other words, conditional on the fact that B is NA, I would like to replace the score of "A". Do note that "C" remains NA. I am struggling to find a clean way to do this using data.table. However, if it is simpler with other methods it would still be ok.

Thanks a lot for your help

akrun · Accepted Answer

Here is one option where we get the index of the rows which are NA for 'score' and the 'id' is "B", use that to replace the NA with the corresponding 'score' value from 'A'

library(data.table)
i1 <- setDT(dt)[id == 'B', which(is.na(score))]
dt[, score:= replace(score, id == 'B' & is.na(score), score[which(id == 'A')[i1]])]

Or a similar option in dplyr

library(dplyr)
dt %>%
   mutate(score = replace(score, id == "B" & is.na(score), 
         score[which(id == "A")[i1]))

replace row values based on another row value in a data.table

Answers (1)

Related Questions