Reputation: 515
I have a column with date values. I am trying to mark the most recent date as 1 and the most recent-before as 2 and so on.
Here I am using dplyr
to achive this, but after executing the code I am getting constant outcome as '1'
date = c("2020-01-20","2020-01-21","2020-01-22","2020-01-23","2020-01-24")
df = as.data.frame(date)
df$date= as.Date(df$date)
max_date=max(df$date)
min_date =min(df$date)
ll <- seq(min_date, max_date, by = "day")
df$day_no = case_when(
( max(df$date) ==as.Date(ll[5])) ~ as.integer(1),
( max(df$date)-1 ==as.Date(ll[4])) ~ as.integer(2),
( max(df$date)-2 ==as.Date(ll[3])) ~ as.integer(3),
( max(df$date)-3 ==as.Date(ll[2])) ~ as.integer(4),
( max(df$date)-4 ==as.Date(ll[1])) ~ as.integer(5),
TRUE ~ as.integer(NA)
)
The output I am looking for is as below:
day_no = c(5,4,3,2,1)
day_no = as.integer(day_no)
df = data.frame(date,day_no)
Thanks in advance.
Upvotes: 0
Views: 1209
Reputation: 1688
Another way is using xxx_rank
in dplyr or rank
in base r.
df %>%
mutate(day_no = dense_rank(desc(date)))
just pick the one works for your dataset.
Upvotes: 2
Reputation: 21400
In base R, use rank
:
df$daterank <- rank(df$date)
df
date daterank
1 2020-01-20 1
2 2020-01-21 2
3 2020-01-22 3
4 2020-01-23 4
5 2020-01-24 5
Alternatively, if you require inversed ranking, add -
and convert dates to numeric:
df$daterank <- rank(- as.numeric(df$date))
df
date daterank
1 2020-01-20 5
2 2020-01-21 4
3 2020-01-22 3
4 2020-01-23 2
5 2020-01-24 1
Upvotes: 1
Reputation: 76402
In order to solve the problem, convert the dates to their integer representation, then subtract the maximum. To subtract 1L
adjusts for the values of the expected output. Then convert to absolute values.
df$day_no <- abs(as.integer(df$date) - max(as.integer(df$date)) - 1L)
df
# date day_no
#1 2020-01-20 5
#2 2020-01-21 4
#3 2020-01-22 3
#4 2020-01-23 2
#5 2020-01-24 1
Upvotes: 0