For a given ID, I would like to convert all values to "yes" if a "yes" is present in any year and convert values to "no" if only "no" is present in all years. Here is an example:
data <- data.frame(
  id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  year = rep(c(2010, 2011), 5),
  employ = c("yes", "yes", "no", "yes", "yes", "no", NA, "yes", "no", NA))
> data
   id year employ
1   1 2010    yes
2   1 2011    yes
3   2 2010     no
4   2 2011    yes
5   3 2010    yes
6   3 2011     no
7   4 2010   <NA>
8   4 2011    yes
9   5 2010     no
10  5 2011   <NA>
Desired output:
data2 <- data.frame(
  id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  year = c(2010, 2011, 2010, 2011, 2010, 2011, 2010, 2011, 2010, 2011),
  employ = c("yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "no", "no"))
> data2
   id year employ
1   1 2010    yes
2   1 2011    yes
3   2 2010    yes
4   2 2011    yes
5   3 2010    yes
6   3 2011    yes
7   4 2010    yes
8   4 2011    yes
9   5 2010     no
10  5 2011     no
You can group by id and use any(), after converting NA to "no".
library(dplyr)
library(tidyr)  # for replace_na()

data %>%
  group_by(id) %>%
  mutate(
    employ = replace_na(employ, "no"),
    employ = case_when(any(employ == "yes") ~ "yes",
                       TRUE ~ "no")
  ) %>%
  ungroup()
# id year employ
# 1 1 2010 yes
# 2 1 2011 yes
# 3 2 2010 yes
# 4 2 2011 yes
# 5 3 2010 yes
# 6 3 2011 yes
# 7 4 2010 yes
# 8 4 2011 yes
# 9 5 2010 no
# 10 5 2011 no
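One option is to convert employ to a factor with the levels specified, drop the unused levels, and select the first remaining level. Because 'yes' is listed first, it is selected whenever it occurs in the group.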
library(dplyr)
data %>%
  group_by(id) %>%
  mutate(employ = levels(droplevels(factor(employ,
                                           levels = c('yes', 'no'))))[1]) %>%
  ungroup()
Output:
# A tibble: 10 x 3
# id year employ
# <dbl> <dbl> <chr>
# 1 1 2010 yes
# 2 1 2011 yes
# 3 2 2010 yes
# 4 2 2011 yes
# 5 3 2010 yes
# 6 3 2011 yes
# 7 4 2010 yes
# 8 4 2011 yes
# 9 5 2010 no
#10 5 2011 no
If all values are NA for a particular 'id', it returns NA.
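To make the all-NA case concrete, here is a minimal base R sketch of what the factor expression does for a single group. The vector employ_all_na is a hypothetical group introduced only for illustration; it is not part of the question's data.

```r
# Hypothetical group in which every employ value is missing
employ_all_na <- c(NA_character_, NA_character_)

# Both declared levels are unused, so droplevels() removes them
f <- droplevels(factor(employ_all_na, levels = c("yes", "no")))

# levels(f) is an empty character vector; indexing it with [1] yields NA
factor_result <- levels(f)[1]
print(factor_result)
```

Indexing past the end of an empty character vector is what produces the NA here, so no special handling is needed for such groups.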
Alternatively, use an if/else condition:
data %>%
  group_by(id) %>%
  mutate(employ = if ('yes' %in% employ) 'yes' else 'no') %>%
  ungroup()
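For readers without dplyr, the same per-group condition could probably be written in base R with ave(). This is a sketch of an alternative, not what the answer above uses; the helper name any_yes is made up for illustration, and stringsAsFactors = FALSE is set so employ stays a character vector on older R versions.

```r
# Example data from the question
data <- data.frame(
  id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  year = rep(c(2010, 2011), 5),
  employ = c("yes", "yes", "no", "yes", "yes", "no", NA, "yes", "no", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "yes" if the group contains any "yes", otherwise "no"
any_yes <- function(x) if ("yes" %in% x) "yes" else "no"

# ave() applies the helper within each id and recycles the single result
# across all rows of that group
data$employ <- ave(data$employ, data$id, FUN = any_yes)
print(data)
```

Since %in% never returns NA, a group containing only NA ends up as "no" with this condition, unlike the factor approach.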