Reputation: 781
The last time I asked the question about fill in the same value for each row by group in R, now I deal with exact the same problem but there are some missing value NA. Here is data,blank"" means that person doesn't exposed in that window, NA treats as missing, 1st means the person exposed in the first window..
ID <- c(1,1,2,2,2,3,3,4,4,4)
x <- c("1st","","1st","1st","","",NA,"1st",NA,"1st")
y <- c("2nd","2nd","","","","2nd","2nd","","",NA)
z <- c("","","3rd","3rd","",NA,"3rd","",NA,"")
m <- c(10:19)
n <- c(20:29)
df <- data.frame(ID,x,y,z,m,n)
library(data.table)
setDT(df)[, c("x1", "y1", "z1") := lapply(.SD, function(x) x[which.max(x != "")]), by = ID]
I got the output, it's pretty much the one I want except NA
ID x y z m n x1 y1 z1
1: 1 1st 2nd 10 20 1st 2nd
2: 1 2nd 11 21 1st 2nd
3: 2 1st 3rd 12 22 1st 3rd
4: 2 1st 3rd 13 23 1st 3rd
5: 2 14 24 1st 3rd
6: 3 2nd NA 15 25 2nd 3rd
7: 3 NA 2nd 3rd 16 26 2nd 3rd
8: 4 1st 17 27 1st
9: 4 NA NA 18 28 1st
10: 4 1st NA 19 29 1st
You can see row 6 and 7, ID is 3, it's supposed to fill x1 = NA, row 8,9,10, ID is 4, y1 and z1 will be NA, here is output I want
ID x y z m n x1 y1 z1
1: 1 1st 2nd 10 20 1st 2nd
2: 1 2nd 11 21 1st 2nd
3: 2 1st 3rd 12 22 1st 3rd
4: 2 1st 3rd 13 23 1st 3rd
5: 2 14 24 1st 3rd
6: 3 2nd NA 15 25 NA 2nd 3rd
7: 3 NA 2nd 3rd 16 26 NA 2nd 3rd
8: 4 1st 17 27 1st NA NA
9: 4 NA NA 18 28 1st NA NA
10: 4 1st NA 19 29 1st NA NA
Upvotes: 1
Views: 1397
Reputation: 214957
How about recoding condition for NA to be 0.5 which will prioritize NA over empty string but less than other strings:
df[, c("x1", "y1", "z1") := lapply(.SD, function(x) x[which.max(ifelse(is.na(x), 0.5, x != ""))]), by = ID]
df
# ID x y z m n x1 y1 z1
# 1: 1 1st 2nd 10 20 1st 2nd
# 2: 1 2nd 11 21 1st 2nd
# 3: 2 1st 3rd 12 22 1st 3rd
# 4: 2 1st 3rd 13 23 1st 3rd
# 5: 2 14 24 1st 3rd
# 6: 3 2nd NA 15 25 NA 2nd 3rd
# 7: 3 NA 2nd 3rd 16 26 NA 2nd 3rd
# 8: 4 1st 17 27 1st NA NA
# 9: 4 NA NA 18 28 1st NA NA
#10: 4 1st NA 19 29 1st NA NA
Upvotes: 1