Reputation: 471
I have a huge dataset in which I would like to update the values in some rows (i.e., those with scenario == "D") with the value of the same column (i.e., outcome) taken from another row (i.e., the one with scenario == "C"), conditional on both rows having the same values in other columns (i.e., year and country).
df <- data.frame(year     = c("2000", "2000", "2001", "2001"),
                 country  = c("A", "A", "B", "B"),
                 scenario = c("C", "D", "C", "D"),
                 outcome  = c("1", "2", "3", "4"))
I would like to generate this:
df2 <- data.frame(year     = c("2000", "2000", "2001", "2001"),
                  country  = c("A", "A", "B", "B"),
                  scenario = c("C", "D", "C", "D"),
                  outcome  = c("1", "1", "3", "3"))
I would appreciate any help.
Upvotes: 0
Views: 27
Reputation: 388982
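You can use replace() to conditionally replace the values within each group of year and country. Within a group, match('C', scenario) gives the position of the 'C' row, so outcome[match('C', scenario)] is the value that gets copied into the 'D' rows.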
library(dplyr)

df %>%
  group_by(year, country) %>%
  mutate(outcome = replace(outcome, scenario == 'D',
                           outcome[match('C', scenario)])) %>%
  ungroup()

#  year  country scenario outcome
#  <chr> <chr>   <chr>    <chr>
#1 2000  A       C        1
#2 2000  A       D        1
#3 2001  B       C        3
#4 2001  B       D        3
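To see what the expression does inside a single group, it can be run on plain vectors in base R. This is only an illustrative sketch: the vectors below are hand-made copies of the year 2000 / country A group, and the names c_pos and updated are made up for the example.

```r
# One group (year 2000, country A) pulled out as plain vectors
scenario <- c("C", "D")
outcome  <- c("1", "2")

# match() returns the position of the first "C" in the group
c_pos <- match("C", scenario)

# replace() overwrites the elements where scenario == "D"
# with the outcome found at the "C" position
updated <- replace(outcome, scenario == "D", outcome[c_pos])
updated
```

One thing to keep in mind: if a group has no 'C' row, match() returns NA, so the 'D' rows in that group would end up as NA rather than keeping their old value.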
Upvotes: 2