Reputation: 455
I am currently trying to rank column values within groups of IDs in the order that they occur. My dataset currently looks like this:
ID Value Date
1 1 a 1/1/2019
2 1 b 2/5/2018
3 1 a 3/3/2019
4 2 a 6/12/1975
5 2 b 5/4/2017
6 2 b 12/3/2016
7 3 c 1/3/2015
8 3 a 2/1/2015
9 4 a 1/1/1991
and I would like to add another column so that it looks like this:
ID Value Date Occurence
1 1 a 1/1/2019 1
2 1 b 2/5/2018 1
3 1 a 3/3/2019 2
4 2 a 6/12/1975 1
5 2 b 5/4/2017 2
6 2 b 12/3/2016 1
7 3 c 1/3/2015 1
8 3 a 2/1/2015 1
9 4 a 1/1/1991 1
So we can see, where ID
is equal to 1
, a
occurs twice and is ranked in order from oldest to newest and b
is only ranked once because it only occurs once for ID
1
.
I have come across code to rank things in order of occurrence for the whole data set, but I would like to rank things in order of occurrence within IDs. There is also code to help me rank all dates within an ID...but I want to be able to look at individual values.
How would I do this? Thank you!
Upvotes: 0
Views: 413
Reputation: 2141
For the updated question with sorting by date:
library(lubridate)
df %>%
mutate_at("Date",dmy) %>% #convert to date-object
arrange(Date) %>% #sort by date
group_by(ID,Value) %>% #group by ID-Value pairs
mutate(Occurence = row_number()) #mutate new column with occurence
df
ID Value Date Occurence
<int> <chr> <date> <int>
1 2 a 1975-12-06 1
2 4 a 1991-01-01 1
3 3 a 2015-01-02 1
4 3 c 2015-03-01 1
5 2 b 2016-03-12 1
6 2 b 2017-04-05 2
7 1 b 2018-05-02 1
8 1 a 2019-01-01 1
9 1 a 2019-03-03 2
Upvotes: 2