Hossein

Reputation: 471

Updating values in some rows using values in other rows, based on some conditions on other columns

I have a huge dataset in which I would like to update the values in some rows (those with scenario == "D") using the value from another row in the same column, outcome (the row with scenario == "C"), provided both rows have the same values in the other columns (year and country).

df <- data.frame(year=c("2000", "2000", "2001", "2001"),
                 country=c("A", "A", "B", "B"),
                 scenario=c("C", "D", "C", "D"),
                 outcome=c("1", "2", "3", "4"))

I would like to generate this:

df2 <- data.frame(year=c("2000", "2000", "2001", "2001"),
                 country=c("A", "A", "B", "B"),
                 scenario=c("C", "D", "C", "D"),
                 outcome=c("1", "1", "3", "3"))

I would appreciate any help.

Upvotes: 0

Views: 27

Answers (1)

Ronak Shah

Reputation: 388982

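You can use replace() to conditionally replace the values within each group of year and country. Here match('C', scenario) gives the position of the first "C" row in the group, so its outcome is copied into the "D" rows.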

library(dplyr)

df %>%
  group_by(year, country) %>%
  # within each group, overwrite the "D" rows with the outcome of the first "C" row
  mutate(outcome = replace(outcome, scenario == 'D',
                           outcome[match('C', scenario)])) %>%
  ungroup()

#   year  country scenario outcome
#  <chr> <chr>   <chr>    <chr>  
#1 2000  A       C        1      
#2 2000  A       D        1      
#3 2001  B       C        3      
#4 2001  B       D        3       
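
If you would rather avoid a dplyr dependency, the same idea can be sketched in base R. This is only an illustrative alternative, not part of the answer above: the names key, is_c, is_d, src, has_src and result are made up for the example, and it assumes the "|" separator never occurs in year or country and that each year/country pair has at most one "C" row.

```r
df <- data.frame(year = c("2000", "2000", "2001", "2001"),
                 country = c("A", "A", "B", "B"),
                 scenario = c("C", "D", "C", "D"),
                 outcome = c("1", "2", "3", "4"))

# One key per (year, country) pair; assumes "|" does not appear in the data.
key <- paste(df$year, df$country, sep = "|")

is_c <- df$scenario == "C"
is_d <- df$scenario == "D"

# For each "D" row, position (among the "C" rows) of the row with the same key.
src <- match(key[is_d], key[is_c])

# "D" rows whose group has no "C" row get NA in src; they are left unchanged.
has_src <- !is.na(src)

result <- df
result$outcome[which(is_d)[has_src]] <- df$outcome[is_c][src[has_src]]
```

One difference to be aware of: with the dplyr version, a "D" row in a group that has no "C" row ends up as NA, whereas this sketch leaves it as it was. Which behaviour you want depends on your data.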

Upvotes: 2
