Reputation: 3057
I am looking for something like dense_rank that would disregard the order of the ranked column.
# some data
df <- data.frame(
  cat = c("A", "A", "B", "C", "A"),
  date = seq.Date(from = as.Date("2020-01-01"), length.out = 5, by = "days")
)
# showing the intended order
df$custom_order <- c(1,1,2,3,4)
The intended result is this. The second A is considered as part of the first A. The fifth A is a "new" A because the preceding cat is not an A.
  cat       date custom_order
1   A 2020-01-01            1
2   A 2020-01-02            1
3   B 2020-01-03            2
4   C 2020-01-04            3
5   A 2020-01-05            4
Does a function like this exist? I am aware it can be achieved with some lag() magic, but I was hoping there might be an easier way.
Upvotes: 1
Views: 318
Reputation: 887741
We can use rleid from data.table, which increments the index whenever the current element does not match the previous one:
library(data.table)
library(dplyr)
df %>%
  mutate(custom_order = rleid(cat))
#   cat       date custom_order
# 1   A 2020-01-01            1
# 2   A 2020-01-02            1
# 3   B 2020-01-03            2
# 4   C 2020-01-04            3
# 5   A 2020-01-05            4
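If loading data.table only for rleid feels heavy, a dplyr-only variant may be possible. The sketch below assumes dplyr 1.1.0 or later, where consecutive_id() is available; the name res is only illustrative.

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which provides consecutive_id()

df <- data.frame(
  cat  = c("A", "A", "B", "C", "A"),
  date = seq.Date(from = as.Date("2020-01-01"), length.out = 5, by = "days")
)

# consecutive_id() starts a new id each time the value differs from the previous row
res <- df %>%
  mutate(custom_order = consecutive_id(cat))
```

With grouped data, the ids restart within each group, so group_by() placement matters.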
In base R, this can be achieved with rle:
df$custom_order <- with(rle(as.character(df$cat)), rep(seq_along(values), lengths))
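For comparison, the "lag" idea from the question can also be written in base R by comparing each element with its predecessor and taking a cumulative sum of the change points. This is only a sketch: run_id() is a hypothetical helper name, and it assumes the column has no missing values.

```r
# run_id() is a hypothetical helper, not part of base R or any package.
# Assumption: x contains no NA values (an NA would propagate through cumsum()).
run_id <- function(x) {
  x <- as.character(x)
  if (length(x) == 0L) return(integer(0))
  # TRUE wherever a value differs from the one before it;
  # the first element always starts a new run
  starts <- c(TRUE, x[-1] != x[-length(x)])
  cumsum(starts)
}

df <- data.frame(
  cat  = c("A", "A", "B", "C", "A"),
  date = seq.Date(from = as.Date("2020-01-01"), length.out = 5, by = "days")
)
df$custom_order <- run_id(df$cat)
```

The result should match the rle version above; the cumsum form just makes the "new run starts here" logic explicit.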
Upvotes: 4