Reputation: 111
I would like to fill the <NA> values with the correct factor value based on the ID variable.
Here are the variables:
ID <- c(1,1,1,2,2,2,3,3,3)
Gender_NA <- c("m",NA,"m",NA,"f",NA,"m","m",NA)
Gender <- c("m","m","m","f","f","f","m","m","m")
Here are the data I have:
Data_have <- data.frame (ID,Gender_NA)
ID Gender_NA
1 m
1 <NA>
1 m
2 <NA>
2 f
2 <NA>
3 m
3 m
3 <NA>
Here are the data I want to have:
Data_whant <- data.frame (ID,Gender)
ID Gender
1 m
1 m
1 m
2 f
2 f
2 f
3 m
3 m
3 m
I have tried to find the solution on this forum but I can't get it to work.
Help would be much appreciated.
Upvotes: 2
Views: 868
Reputation: 887911
The na.locf function from library(zoo) can be used to replace NA elements with the nearest previous non-NA element. Using data.table, we convert the 'data.frame' to a 'data.table' and, grouped by 'ID', replace each NA with the previous non-NA value. If the first element of a group is NA, it has no previous value and is not replaced, so we wrap the result in a second na.locf with the option fromLast=TRUE to fill the remaining NAs from the succeeding non-NA elements.
library(zoo)
library(data.table)
setDT(Data_have)[, Gender := na.locf(na.locf(Gender_NA, na.rm = FALSE), fromLast = TRUE),
                 by = ID][, Gender_NA := NULL]
Data_have
# ID Gender
#1: 1 m
#2: 1 m
#3: 1 m
#4: 2 f
#5: 2 f
#6: 2 f
#7: 3 m
#8: 3 m
#9: 3 m
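To see why two passes are needed, here is a minimal sketch on a single group that starts with an NA (like ID 2 in the question). The vector x and the names forward and filled are made up for illustration; only na.locf and its na.rm and fromLast arguments come from zoo.

```r
library(zoo)

# One group's values, with a leading and a trailing NA (hypothetical example)
x <- c(NA, "f", NA)

# Pass 1: carry the last observation forward; na.rm = FALSE keeps the
# leading NA in place instead of dropping it, so the length is unchanged
forward <- na.locf(x, na.rm = FALSE)

# Pass 2: carry the next observation backward to fill the leading NA
filled <- na.locf(forward, fromLast = TRUE)

print(forward)
print(filled)
```

After the first pass only the trailing NA is filled; the second pass fills the leading one.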
Or, starting again from the original Data_have and grouping by ID, we can omit all NAs using na.omit() and pick the first element as follows:
setDT(Data_have)[, Gender := na.omit(Gender_NA)[1L], by = ID][, Gender_NA := NULL]
Or using the same method with dplyr:
library(dplyr)
Data_have %>%
group_by(ID) %>%
transmute(Gender= first(na.omit(Gender_NA)))
# ID Gender
# (dbl) (fctr)
#1 1 m
#2 1 m
#3 1 m
#4 2 f
#5 2 f
#6 2 f
#7 3 m
#8 3 m
#9 3 m
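For comparison, the same "first non-NA value per group" idea can be sketched in base R with ave(), without any extra packages. This is only an illustrative alternative, not part of the approaches above; the name Data_filled is made up, and the vectors are kept as character rather than factor.

```r
ID <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
Gender_NA <- c("m", NA, "m", NA, "f", NA, "m", "m", NA)

# For each ID group, take the first non-missing value; ave() writes that
# single value back to every position of the group
Gender <- ave(Gender_NA, ID, FUN = function(x) na.omit(x)[1])

Data_filled <- data.frame(ID, Gender)
print(Data_filled)
```

Like the na.omit() versions, this assumes every ID has at least one non-missing value and that the values within an ID agree; a group with only NAs would stay NA.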
Upvotes: 2
Reputation: 118889
Here's how I'd do it using data.table:
require(data.table) # v1.9.6+
dt = data.table(ID, Gender_NA)
# Gender_NA is of character type
And here's the answer:
dt[is.na(Gender_NA), Gender_NA := na.omit(dt)[.SD, Gender_NA, mult="first", on="ID"]]
Upvotes: 1