Determine overall rank of items grouped by date in tidy data frame

Question

The goal is to determine which company ranks highest by rate over the time period of the data. One way to do this would be to rank the companies within each date, then add up the ranks for each company. The company with the lowest sum wins.

library(dplyr)

df <- tibble(
  comp = rep(letters[1:3], 4),
  rate = c(1, 1.1, 1.2, 0.9, 1, 1.2, 1, 1.2, 1.4, 1.5, 1.1, 1),
  date = c(rep(Sys.Date() - 3, 3), rep(Sys.Date() - 2, 3),
           rep(Sys.Date() - 1, 3), rep(Sys.Date(), 3))
)

Clearly I'm missing something about group_by: the result is sorted by rate across the whole table, not within each date.

df |> group_by(date) |> 
  arrange(rate)

# A tibble: 12 × 3
# Groups:   date [4]
   comp   rate date      
   <chr> <dbl> <date>    
 1 a       0.9 2023-02-04
 2 a       1   2023-02-03
 3 b       1   2023-02-04
 4 a       1   2023-02-05
 5 c       1   2023-02-06
 6 b       1.1 2023-02-03
 7 b       1.1 2023-02-06
 8 c       1.2 2023-02-03
 9 c       1.2 2023-02-04
10 b       1.2 2023-02-05
11 c       1.4 2023-02-05
12 a       1.5 2023-02-06

M-- · Accepted Answer

library(dplyr)

df %>% 
  group_by(date) %>% 
  arrange(date, desc(rate)) %>%   # sort by date, then highest rate first
  mutate(rnk = rank(-rate))       # within each date, rank 1 = highest rate

#> # A tibble: 12 x 4
#> # Groups:   date [4]
#>    comp   rate date         rnk
#>    <chr> <dbl> <date>     <dbl>
#>  1 c       1.2 2023-02-03     1
#>  2 b       1.1 2023-02-03     2
#>  3 a       1   2023-02-03     3
#>  4 c       1.2 2023-02-04     1
#>  5 b       1   2023-02-04     2
#>  6 a       0.9 2023-02-04     3
#>  7 c       1.4 2023-02-05     1
#>  8 b       1.2 2023-02-05     2
#>  9 a       1   2023-02-05     3
#> 10 a       1.5 2023-02-06     1
#> 11 b       1.1 2023-02-06     2
#> 12 c       1   2023-02-06     3
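As for why the attempt in the question did not sort within each date: as far as I know, arrange() ignores grouping unless you pass .by_group = TRUE. A minimal sketch of that, using fixed dates instead of Sys.Date() so the result is reproducible (the name sorted is only for illustration):

```r
library(dplyr)

# Same shape as the question's data, but with fixed dates (an assumption
# made here for reproducibility)
df <- tibble(
  comp = rep(letters[1:3], 4),
  rate = c(1, 1.1, 1.2, 0.9, 1, 1.2, 1, 1.2, 1.4, 1.5, 1.1, 1),
  date = rep(as.Date("2023-02-03") + 0:3, each = 3)
)

# .by_group = TRUE sorts by the grouping variable first, then by rate
sorted <- df %>%
  group_by(date) %>%
  arrange(rate, .by_group = TRUE)

sorted
```

This only changes the row order. The ranking itself comes from mutate() on the grouped data, which does respect the groups.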

You can then summarise the ranks however you like, for example by summing them per company:

df %>% 
  group_by(date) %>% 
  arrange(date, desc(rate)) %>% 
  mutate(rnk = rank(-rate)) %>%      # rank 1 = highest rate on each date
  group_by(comp) %>% 
  summarise(full_rank = sum(rnk))    # lowest total = best overall
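To go from the per-company totals to a single overall ordering, one option is to sort the totals and rank them once more. This is a sketch under two assumptions: rank 1 goes to the highest rate on each date, so the lowest total wins as the question describes, and ties should share the lower rank, which is what min_rank() does. The column name overall and the fixed dates are illustrative, not from the question.

```r
library(dplyr)

# Same shape as the question's data, but with fixed dates (an assumption
# made here for reproducibility)
df <- tibble(
  comp = rep(letters[1:3], 4),
  rate = c(1, 1.1, 1.2, 0.9, 1, 1.2, 1, 1.2, 1.4, 1.5, 1.1, 1),
  date = rep(as.Date("2023-02-03") + 0:3, each = 3)
)

overall <- df %>%
  group_by(date) %>%
  mutate(rnk = min_rank(desc(rate))) %>%   # 1 = highest rate on that date
  group_by(comp) %>%
  summarise(full_rank = sum(rnk)) %>%      # total of the daily ranks
  arrange(full_rank) %>%                   # best (lowest total) first
  mutate(overall = min_rank(full_rank))    # final position, ties share a rank

overall
```

With this data, company c comes first with a total of 6, followed by b with 8 and a with 10. If tied rates should be averaged instead, base rank() with its default ties.method would be the alternative.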
