Reputation: 351
I am trying to create a new variable, Col3, that conditionally labels all entries with Col2 >= 2016 in the data frame below:
Data_Frame <- data.frame(Col1 = c("A1", "A1", "A1", "A1", "A1", "A1", "A1", "A2", "A2", "A2", "A3", "A3", "A3", "A3", "A3"),
Col2 = c(2010, 2016, 2019, 2019, 2020, 2020, 2020, 2017, 2018, 2021, 2002, 2009, 2020, 2020, 2021))
The expected result is shown in the table below (as a picture). Col3 is calculated within each group of Col1: every entry with Col2 >= 2016 must be labeled in Col3 starting from 1, and every entry with Col2 <= 2015 must get 0. For example, if the first element of Col2 in a group is 2017, it gets label 1 in Col3, the next higher element gets label 2, and so on.
The following code
Data_Frame <- Data_Frame %>% group_by(Col1) %>% mutate(Col3 = match(Col2, unique(Col2)))
labels everything in the groups as shown by the below snapshot:
And, using the below code
Data_Frame <- Data_Frame %>% group_by(Col1) %>% mutate(Col3 = ifelse(Col2 <=2015, 0, match(Col2, unique(Col2))))
does place a zero for values <= 2015 as shown below, but does not label the other entries with 1, 2, etc. as expected in the result above:
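The effect can be reproduced for a single group in base R. This is a minimal sketch for group A3 only; the names col2_a3, pos_all and labels_a3 are introduced here for illustration and are not part of the original code.

```r
# Col2 values of group A3 from the data frame above
col2_a3 <- c(2002, 2009, 2020, 2020, 2021)

# match() against all unique values numbers every distinct year,
# including the ones <= 2015
pos_all <- match(col2_a3, unique(col2_a3))
print(pos_all)    # 1 2 3 3 4

# ifelse() only overwrites the early years with 0; the later years
# keep the positions they had in the full list of unique values
labels_a3 <- ifelse(col2_a3 <= 2015, 0, pos_all)
print(labels_a3)  # 0 0 3 3 4
```

So the labels for 2020 and 2021 start at 3 instead of 1, because 2002 and 2009 already occupy positions 1 and 2.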
What is going wrong in the code?
Upvotes: 0
Views: 56
Reputation: 123
Is this what you want?
Data_Frame <- Data_Frame %>%
  group_by(Col1) %>%
  mutate(Col3 = ifelse(Col2 <= 2015, 0, match(Col2, unique(Col2[Col2 > 2015]))))
In your ifelse, the unique(Col2) still refers to all entries in Col2, including those <= 2015, so match() also counts those values when assigning positions. Thus, you should subset it to the data you still want to include.
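To check the subsetting idea without dplyr, the same per-group logic can be sketched in base R. This is only an illustration: ave() stands in for group_by() plus mutate(), and the helper name label_after_2015 is hypothetical, introduced here for the example.

```r
Data_Frame <- data.frame(
  Col1 = c("A1", "A1", "A1", "A1", "A1", "A1", "A1",
           "A2", "A2", "A2",
           "A3", "A3", "A3", "A3", "A3"),
  Col2 = c(2010, 2016, 2019, 2019, 2020, 2020, 2020,
           2017, 2018, 2021,
           2002, 2009, 2020, 2020, 2021)
)

# Hypothetical helper: 0 for years <= 2015, otherwise the position of
# the year among the unique years > 2015 of the same group
label_after_2015 <- function(x) {
  ifelse(x <= 2015, 0, match(x, unique(x[x > 2015])))
}

# ave() applies the helper separately within each Col1 group
Data_Frame$Col3 <- ave(Data_Frame$Col2, Data_Frame$Col1,
                       FUN = label_after_2015)
print(Data_Frame$Col3)
# 0 1 2 2 3 3 3 1 2 3 0 0 1 1 2
```

Because the years <= 2015 are removed before unique() is called, the first year after 2015 in each group is matched at position 1.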
Upvotes: 1