ixodid

Reputation: 2400

Determine overall rank of items grouped by date in tidy data frame

The goal is to determine which company ranks highest by rate over the time period of the data. I thought one way to do this would be to arrange the companies within each date and assign a rank. Then add up the ranks for each company. The company with the lowest value of that sum wins.

library(dplyr)

df <- tibble(
  comp = rep(letters[1:3], 4),
  rate = c(1, 1.1, 1.2, 0.9, 1, 1.2, 1, 1.2, 1.4, 1.5, 1.1, 1),
  date = c(rep(Sys.Date() - 3, 3), rep(Sys.Date() - 2, 3),
           rep(Sys.Date() - 1, 3), rep(Sys.Date(), 3))
)

Clearly I'm missing something about group_by, because arrange() sorts the whole table by rate instead of sorting within each date:

df |> group_by(date) |> 
  arrange(rate)

# A tibble: 12 × 3
# Groups:   date [4]
   comp   rate date      
   <chr> <dbl> <date>    
 1 a       0.9 2023-02-04
 2 a       1   2023-02-03
 3 b       1   2023-02-04
 4 a       1   2023-02-05
 5 c       1   2023-02-06
 6 b       1.1 2023-02-03
 7 b       1.1 2023-02-06
 8 c       1.2 2023-02-03
 9 c       1.2 2023-02-04
10 b       1.2 2023-02-05
11 c       1.4 2023-02-05
12 a       1.5 2023-02-06

Upvotes: 1

Views: 47

Answers (2)

asaei

Reputation: 521

Is this what you want?

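library(dplyr)

df %>%
  group_by(date) %>%
  mutate(rank = dense_rank(rate)) %>%  # rank within each date; lowest rate gets 1
  ungroup() %>%
  group_by(comp) %>%
  summarise(rank = sum(rank))          # total of the per-date ranks for each company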

Output:

  comp   rank
  <chr> <int>
1 a         6
2 b         8
3 c        10
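
The sample data has no tied rates within a date, so the choice of ranking function does not change the sums above. If ties can occur in real data, the functions differ. A minimal sketch, assuming a made-up vector `x` with a tie for the lowest value:

```r
library(dplyr)

# Hypothetical rates for one date, with a tie for the lowest value
x <- c(1.0, 1.0, 1.2)

dense <- dense_rank(x)  # ties share a rank, no gaps afterwards: 1 1 2
minr  <- min_rank(x)    # ties share the lowest rank, gap afterwards: 1 1 3
avg   <- rank(x)        # base R default, ties get the average rank: 1.5 1.5 3
```

Which one is appropriate depends on how a tie should count toward the summed score, so treat this as an illustration rather than a recommendation.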

Upvotes: 0

M--

Reputation: 28955

library(dplyr)

df %>%
  group_by(date) %>%
  arrange(date, desc(rate)) %>%  # sort by date, then highest rate first
  mutate(rnk = rank(-rate))      # rank within each date; highest rate gets 1

#> # A tibble: 12 x 4
#> # Groups:   date [4]
#>    comp   rate date         rnk
#>    <chr> <dbl> <date>     <dbl>
#>  1 c       1.2 2023-02-03     1
#>  2 b       1.1 2023-02-03     2
#>  3 a       1   2023-02-03     3
#>  4 c       1.2 2023-02-04     1
#>  5 b       1   2023-02-04     2
#>  6 a       0.9 2023-02-04     3
#>  7 c       1.4 2023-02-05     1
#>  8 b       1.2 2023-02-05     2
#>  9 a       1   2023-02-05     3
#> 10 a       1.5 2023-02-06     1
#> 11 b       1.1 2023-02-06     2
#> 12 c       1   2023-02-06     3

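You can then summarise this however you like, for example by summing the ranks per company. Note that this version ranks in ascending order of rate, so the lowest rate gets rank 1; use rank(-rate) instead to match the output above: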

df %>% 
  group_by(date) %>% 
  arrange(date, rate) %>% 
  mutate(rnk = rank(rate)) %>% 
  group_by(comp) %>% 
  summarise(full_rank = sum(rnk))
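
On the group_by confusion in the question: arrange() ignores grouping unless `.by_group = TRUE` is passed, which is why the question's output is sorted across all dates. The sketch below puts the pieces together, assuming that a higher rate is better (the question does not say so explicitly) and using `min_rank()` as one possible tie rule. The names `sorted` and `overall` are only illustrative.

```r
library(dplyr)

df <- tibble(
  comp = rep(letters[1:3], 4),
  rate = c(1, 1.1, 1.2, 0.9, 1, 1.2, 1, 1.2, 1.4, 1.5, 1.1, 1),
  date = rep(Sys.Date() - 3:0, each = 3)
)

# Sort within each date: .by_group = TRUE makes arrange() order by the
# grouping variable first, then by rate (highest first) inside each group.
sorted <- df %>%
  group_by(date) %>%
  arrange(desc(rate), .by_group = TRUE)

# Rank within each date (assumption: highest rate gets rank 1),
# then sum the ranks per company; the smallest total comes first.
overall <- df %>%
  group_by(date) %>%
  mutate(rnk = min_rank(desc(rate))) %>%
  group_by(comp) %>%
  summarise(full_rank = sum(rnk)) %>%
  arrange(full_rank)
```

With the sample data the first row of `overall` is company c. If a lower rate is better, replace `desc(rate)` with `rate` in the `mutate()` step.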

Upvotes: 1
