Ray

Reputation: 351

Label conditionally in a group in a data frame R

I am trying to create a new variable, Col3, that labels the entries with Col2 >= 2016 in the data frame below:

Data_Frame <- data.frame(Col1 = c("A1", "A1", "A1", "A1", "A1", "A1", "A1", "A2", "A2", "A2", "A3", "A3", "A3", "A3", "A3"),
                         Col2 = c(2010, 2016, 2019, 2019, 2020, 2020, 2020, 2017, 2018, 2021, 2002, 2009, 2020, 2020, 2021))

The expected result is shown in the table below (as an image). Col3 is calculated within each group of Col1: every entry with Col2 >= 2016 must be labeled in Col3 starting from 1, and every entry with Col2 <= 2015 must get 0. For example, if the starting element of Col2 in a group is 2017, it gets label 1 in Col3, the next higher element gets label 2, and so on:

[image: table of the expected result]

The following code

Data_Frame <- Data_Frame %>% group_by(Col1) %>% mutate(Col3 = match(Col2, unique(Col2)))

labels everything in the groups as shown by the below snapshot:

[image: snapshot of the result with every entry labeled]

And, using the below code

Data_Frame <- Data_Frame %>% group_by(Col1) %>% mutate(Col3 = ifelse(Col2 <=2015, 0, match(Col2, unique(Col2))))

does place a zero for values <= 2015, as shown below, but does not label the other entries with 1, 2, etc. as expected (see the expected result above):

[image: snapshot of the result with zeros for values <= 2015]

What is going wrong in the code?

Upvotes: 0

Views: 56

Answers (1)

Timon

Reputation: 123

Is this what you want?

library(dplyr)  # provides %>%, group_by() and mutate()

Data_Frame <- Data_Frame %>%
  group_by(Col1) %>%
  mutate(Col3 = ifelse(Col2 <= 2015, 0, match(Col2, unique(Col2[Col2 > 2015]))))

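In your ifelse, unique(Col2) still refers to all entries of Col2 in the group, including those <= 2015. match() therefore counts positions from the first value in the group rather than from the first value above 2015, so the labels start at 2 or higher whenever a group contains earlier years. Subset Col2 inside unique() to the values you actually want to label.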
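To see the difference on a single group, here is a small base R sketch using only the Col2 values of group A1 from the question. The variable names col2, original and fixed are just for illustration, and no packages are needed:

```r
# Col2 values of group A1, copied from the question's data frame
col2 <- c(2010, 2016, 2019, 2019, 2020, 2020, 2020)

# Original attempt: positions are looked up among ALL unique values,
# so 2010 takes position 1 and 2016 ends up with label 2
original <- ifelse(col2 <= 2015, 0, match(col2, unique(col2)))

# Fix: positions are looked up only among the unique values above 2015
fixed <- ifelse(col2 <= 2015, 0, match(col2, unique(col2[col2 > 2015])))

print(original)  # 0 2 3 3 4 4 4
print(fixed)     # 0 1 2 2 3 3 3
```

Note that match() against unique() numbers values by order of first appearance, so this relies on Col2 being sorted in ascending order within each group, as it is in the question's data.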

Upvotes: 1
