Reputation: 850
This questions is related to the following questions posted by other users:
How to number/label data-table by group-number from group_by?
Parts of my approach are 'inspired' by krlmlr answers to this thread: https://github.com/tidyverse/dplyr/issues/1185
I have a data.frame
similar to the following:
db <- data.frame(ID = c(rep(1, 5), rep(2, 5)),
date = as.Date(c(
rep("2001-01-01", 3),
"2001-01-03",
"2001-01-03",
rep("2011-01-01", 2),
rep("2010-03-12", 2),
"2001-01-01"
)))
db
# ID date
# 1 1 2001-01-01
# 2 1 2001-01-01
# 3 1 2001-01-01
# 4 1 2001-01-03
# 5 1 2001-01-03
# 6 2 2011-01-01
# 7 2 2011-01-01
# 8 2 2010-03-12
# 9 2 2010-03-12
# 10 2 2001-01-01
My goal is to group by ID and date and create a new column in db, indicating an order of the dates within each ID. The solution would be a new column to db with the values c(1, 1, 1, 2, 2, 3, 3, 2, 2, 1)
The two approaches I tried will rank the dates across all IDs but not within each ID (see below).
What can I do?
Thank you very much.
db %>%
group_by(ID, date) %>%
{ mutate(ungroup(.), rank = group_indices(.)) }
## A tibble: 10 x 3
# ID date rank
# <dbl> <date> <int>
# 1 1 2001-01-01 1
# 2 1 2001-01-01 1
# 3 1 2001-01-01 1
# 4 1 2001-01-03 2
# 5 1 2001-01-03 2
# 6 2 2011-01-01 5
# 7 2 2011-01-01 5
# 8 2 2010-03-12 4
# 9 2 2010-03-12 4
#10 2 2001-01-01 3
db %>%
mutate(label = group_indices(., ID, date))
# ID date label
#1 1 2001-01-01 1
#2 1 2001-01-01 1
#3 1 2001-01-01 1
#4 1 2001-01-03 2
#5 1 2001-01-03 2
#6 2 2011-01-01 5
#7 2 2011-01-01 5
#8 2 2010-03-12 4
#9 2 2010-03-12 4
#10 2 2001-01-01 3
Upvotes: 0
Views: 307
Reputation: 39154
We can use dense_rank
.
library(dplyr)
db2 <- db %>%
group_by(ID) %>%
mutate(rank = dense_rank(date)) %>%
ungroup()
db2
# # A tibble: 10 x 3
# ID date rank
# <dbl> <date> <int>
# 1 1. 2001-01-01 1
# 2 1. 2001-01-01 1
# 3 1. 2001-01-01 1
# 4 1. 2001-01-03 2
# 5 1. 2001-01-03 2
# 6 2. 2011-01-01 3
# 7 2. 2011-01-01 3
# 8 2. 2010-03-12 2
# 9 2. 2010-03-12 2
# 10 2. 2001-01-01 1
Upvotes: 1