Is there any way to replace a missing value based on another columns' value to match the column name

Question

I have a dataset:

    a day day.1.time day.2.time day.3.time day.4.time day.5.time
1  NA   2          4          5          7         10          4
2  NA   5          4          1          1          6         NA
3  NA   3          7          9          6          7          4
4  NA   3          6          8          8          4          5
5  NA   3          5          2          4          5          6
6  NA   3         87          3          2          1         78
7  NA   1         NA          7          5          9         54
8  NA   5          6          6          3          2          3
9  NA   2          5         10          9          8          3
10 NA   3          9          4         10          3          3

I am trying to use the day column value to match with the day.x.time column to replace the missing value in column a. For instance, in the first row, the first value in the day column is 2, then we should use day.2.time value 5 to replace the first value in column a.

If the day.x.time value is missing, we should use -1 day or +1 day to replace the missing in column a. For instance, in the second row, the day column shows 5, so we should use the value in day.5.time column, but it's also a missing value. In this case, we should use the value in day.4.time column to replace the missing value in column a.

You can use dat = data.frame(a = rep(NA,10), day = c(2,5,3,3,3,3,1,5,2,3), day.1.time = c(4,4,7,6,5,87,NA,6,5,9), day.2.time = sample(10), day.3.time = sample(10), day.4.time = sample(10), day.5.time = c(4,NA,4,5,6,78,54,3,3,3)) to generate the sample data.

I have tried grep(paste0("^day."dat$day,".time$", names(dat)) to match with the column but my code isn't matching in every row, so any help would be appreciated!

Ronak Shah · Accepted Answer

Here is one way to do this.

The first part is easy to match day column with the corresponding day.x.time column. We can do this using matrix subsetting.

cols <- grep('day\.\d+\.time', names(dat))
dat$a <- dat[cols][cbind(1:nrow(dat), dat$day)]
dat
#    a day day.1.time day.2.time day.3.time day.4.time day.5.time
#1   3   2          4          3          3          3          4
#2  NA   5          4          4         10          2         NA
#3   1   3          7          8          1          8          4
#4   4   3          6          6          4          5          5
#5   6   3          5         10          6          7          6
#6   8   3         87          5          8          9         78
#7  NA   1         NA          1          7         10         54
#8   3   5          6          7          9          1          3
#9   2   2          5          2          5          6          3
#10  2   3          9          9          2          4          3

To fill values where day.x.time column is NA we can select the closest non-NA value in that row.

inds <- which(is.na(dat$a))
dat$a[inds] <- mapply(function(x, y) 
         na.omit(unlist(dat[x, cols[order(abs(y- seq_along(cols)))]])[1:4])[1],
              inds, dat$day[inds])

dat
#   a day day.1.time day.2.time day.3.time day.4.time day.5.time
#1  3   2          4          3          3          3          4
#2  2   5          4          4         10          2         NA
#3  1   3          7          8          1          8          4
#4  4   3          6          6          4          5          5
#5  6   3          5         10          6          7          6
#6  8   3         87          5          8          9         78
#7  1   1         NA          1          7         10         54
#8  3   5          6          7          9          1          3
#9  2   2          5          2          5          6          3
#10 2   3          9          9          2          4          3

Is there any way to replace a missing value based on another columns' value to match the column name

Answers (2)

Edit

Related Questions

Is there any way to replace a missing value based on another columns&#39; value to match the column name

Answers (2)

Edit

Related Questions

Is there any way to replace a missing value based on another columns' value to match the column name