Reputation: 146
I have the following data frame:
library(dplyr)
# tibble() replaces the deprecated data_frame()
dat <- tibble(id = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
                     3L, 5L, 5L, 7L, 7L, 7L, 8L, 8L, 8L, 10L),
              wish1 = c(4L, NA, NA, 1L, NA, 1L, NA, NA, NA,
                        NA, -1L, 8L, NA, 1L, -1L, NA, 4L,
                        NA, NA, -1L),
              wish2 = c(1L, NA, NA, 1L, NA, 1L, NA, NA, NA,
                        NA, -1L, 1L, NA, 2L, -1L, NA, 2L, NA, NA, 1L),
              participate = c(NA, 1L, NA, NA, 1L, NA, NA, 1L, NA, NA, NA,
                              NA, 1L, NA, 4L, NA, NA, NA, 1L, NA))
Within each group, I want to replace the NAs of the variable participate with the value that is available within the same group. If a group has no values at all, the NA can stay.
I need something like:
df <- dat %>% group_by(id) %>%
  mutate(participate = (participate, na.rm = TRUE))
Unfortunately, this doesn't work without a function such as sum around the arguments.
Upvotes: 1
Views: 589
Reputation: 39174
There are probably more concise or elegant ways, but I would like to share some thoughts.
library(tidyr)
# arrange() sorts NA last within each id, so the available value comes first;
# fill() then carries that value down over the NAs that follow it
dat2 <- dat %>%
  arrange(id, participate) %>%
  group_by(id) %>%
  fill(participate)
# dat_temp is a summary data frame with one fill value per id
dat_temp <- dat %>%
  arrange(id, participate) %>%
  group_by(id) %>%
  slice(1) %>%
  select(id, participate)
# Join dat_temp to dat and keep only the filled column
# (this overwrites the dat2 created above)
dat2 <- dat %>%
  left_join(dat_temp, by = "id") %>%
  select(-participate.x) %>%
  rename(participate = participate.y)
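As a possible shortcut for the fill approach: assuming tidyr 1.0.0 or later, fill() accepts .direction = "downup", which fills down and then up within each group, so the sort and the join are not needed. The sketch below uses a small hypothetical data frame named toy (not the asker's dat), in which id 3 has no value at all. Note that if a group held several different non-missing values, each NA would take its nearest neighbour rather than one common value.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data for illustration; id 3 has no value to fill from
toy <- tibble(id = c(1L, 1L, 1L, 2L, 2L, 3L),
              participate = c(NA, 1L, NA, 4L, NA, NA))

# Fill down first, then up, within each id; row order is left untouched
toy_filled <- toy %>%
  group_by(id) %>%
  fill(participate, .direction = "downup") %>%
  ungroup()

print(toy_filled)
```

Because nothing is sorted, the rows stay in their original order, which may matter if other columns such as wish1 depend on it.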
The following solution is based on a comment from alistaire:
dat2 <- dat %>%
  arrange(id, participate) %>%
  group_by(id) %>%
  mutate(participate = first(participate))
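The arrange() step above reorders the rows, and first() then picks the smallest non-missing value in each group. If the original row order should be kept, one possible variant is to take the first non-missing value by plain subsetting. This is only a sketch on a hypothetical data frame named toy, not part of the original answer:

```r
library(dplyr)

# Hypothetical toy data for illustration; id 3 has no value to fill from
toy <- tibble(id = c(1L, 1L, 1L, 2L, 2L, 3L),
              participate = c(NA, 1L, NA, 4L, NA, NA))

toy_first <- toy %>%
  group_by(id) %>%
  # Keep the non-missing values and take the first one; indexing an empty
  # vector with [1] yields NA, so groups without any value stay NA
  mutate(participate = participate[!is.na(participate)][1]) %>%
  ungroup()

print(toy_first)
```

Unlike the sorted version, this takes the first non-missing value in row order rather than the smallest one, which only differs when a group contains more than one distinct value.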
Upvotes: 2