Jakub.Novotny

Reputation: 3057

R dense_rank without ordering

I am looking for something like dense_rank that would disregard the order of the ranked column.

# some data
df <- data.frame(
  cat = c("A", "A", "B", "C", "A"),
  date = seq.Date(from = as.Date("2020-01-01"), length.out = 5, by = "days")
)
# showing the intended order
df$custom_order <- c(1,1,2,3,4)

The intended result is shown below. The second A belongs to the same group as the first A. The fifth A starts a "new" group because the preceding cat is not an A.

  cat       date custom_order
1   A 2020-01-01            1
2   A 2020-01-02            1
3   B 2020-01-03            2
4   C 2020-01-04            3
5   A 2020-01-05            4

Does a function like this exist? I am aware it can be achieved with some lag() magic, but I was hoping there might be an easier way.

Upvotes: 1

Views: 318

Answers (1)

akrun

Reputation: 887741

We can use rleid from data.table, which increments the index whenever the current element does not match the previous one:

library(data.table)
library(dplyr)
df %>%
    mutate(custom_order = rleid(cat))
#    cat       date custom_order
#1   A 2020-01-01            1
#2   A 2020-01-02            1
#3   B 2020-01-03            2
#4   C 2020-01-04            3
#5   A 2020-01-05            4

In base R, the same result can be obtained with rle:

df$custom_order <- with(rle(as.character(df$cat)), rep(seq_along(values), lengths))
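
For comparison, here is a rough sketch of the "compare with the previous element" approach that the question alludes to, written in base R without any packages. The helper name run_id is made up for this illustration and is not part of any library. It assumes the column contains no NA values.

```r
# Same data as in the question
df <- data.frame(
  cat = c("A", "A", "B", "C", "A"),
  date = seq.Date(from = as.Date("2020-01-01"), length.out = 5, by = "days")
)

# run_id() is a hypothetical helper name used only for this sketch.
# Assumption: x contains no NA values.
run_id <- function(x) {
  x <- as.character(x)
  if (length(x) == 0) return(integer(0))
  # TRUE wherever an element differs from its predecessor;
  # the first element always starts a new run
  starts <- c(TRUE, x[-1] != x[-length(x)])
  # the running count of run starts is the run index
  cumsum(starts)
}

df$custom_order <- run_id(df$cat)
df$custom_order
# [1] 1 1 2 3 4
```

This should give the same index as rleid and the rle one-liner on this data; the cumsum over "changed from previous" flags is simply what those functions do under the hood, spelled out.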

Upvotes: 4
