Alex
Alex

Reputation: 1294

replace row values based on another row value in a data.table

I have a trivial question, though I am struggling to find a simple answer. I have a data table that looks something like this:

dt <- data.table(id= c(A,A,A,A,B,B,B,C,C,C), time=c(1,2,3,1,2,3,1,2,3), score = c(10,15,13,25,NA,NA,18,29,19))
dt
#     id  time  score
# 1:   A    1    NA
# 2:   A    2    10
# 3:   A    3    15
# 4:   A    4    13
# 5:   B    1    NA
# 6:   B    2    25
# 7:   B    3    NA
# 8:   B    4    NA
# 9:   C    1    18
# 10:  C    2    29
# 11:  C    3    NA
# 12:  C    4    19

I would like to replace the missing values of my group "B" with the values of "A".

The final dataset should look something like this

final
#     id  time  score
# 1:   A    1    NA
# 2:   A    2    10
# 3:   A    3    15
# 4:   A    4    13
# 5:   B    1    NA
# 6:   B    2    25
# 7:   B    3    15
# 8:   B    4    13
# 9:   C    1    18
# 10:  C    2    29
# 11:  C    3    NA
# 12:  C    4    19

In other words, conditional on the fact that B is NA, I would like to replace the score of "A". Do note that "C" remains NA. I am struggling to find a clean way to do this using data.table. However, if it is simpler with other methods it would still be ok.

Thanks a lot for your help

Upvotes: 1

Views: 918

Answers (1)

akrun
akrun

Reputation: 886938

Here is one option where we get the index of the rows which are NA for 'score' and the 'id' is "B", use that to replace the NA with the corresponding 'score' value from 'A'

library(data.table)
i1 <- setDT(dt)[id == 'B', which(is.na(score))]
dt[, score:= replace(score, id == 'B' & is.na(score), score[which(id == 'A')[i1]])]

Or a similar option in dplyr

library(dplyr)
dt %>%
   mutate(score = replace(score, id == "B" & is.na(score), 
         score[which(id == "A")[i1]))

Upvotes: 1

Related Questions