Reputation: 65
df<-data.frame(ID = c(1,1,1,2,3,3,3),
test = c(NA, 5.5, 6.4, NA, 7.3, NA, 10.9))
I want to create a variable called "value", which is the first non-NA value for the test for each individual ID. For individual ID 2 who only has the NA, the value is NA.
The expected output is:
df<-data.frame(ID = c(1,1,1,2,3,3,3),
test = c(NA, 5.5, 6.4, NA, 7.3, NA, 10.9),
value = c(5.5, 5.5, 5.5, NA, 7.3, 7.3, 7.3))
Upvotes: 2
Views: 1160
Reputation: 101044
Here is a data.table
option using first
+ na.omit
> setDT(df)[, value := first(na.omit(test)), ID][]
ID test value
1: 1 NA 5.5
2: 1 5.5 5.5
3: 1 6.4 5.5
4: 2 NA NA
5: 3 7.3 7.3
6: 3 NA 7.3
7: 3 10.9 7.3
Upvotes: 1
Reputation: 39647
You can use ave
to group by ID and which.max
to select with [
the first non NA
value.
df$value <- ave(df$test, df$ID, FUN=function(x) x[which.max(!is.na(x))])
df
# ID test value
#1 1 NA 5.5
#2 1 5.5 5.5
#3 1 6.4 5.5
#4 2 NA NA
#5 3 7.3 7.3
#6 3 NA 7.3
#7 3 10.9 7.3
Upvotes: 3
Reputation: 886938
We can use first
on the non-NA elements after grouping
library(dplyr)
df <- df %>%
group_by(ID) %>%
mutate(value = first(test[complete.cases(test)]))
Upvotes: 2