Lisa
Lisa

Reputation: 271

R: Difference between the subsequent ranks of a item group by date

I would like to derive a ranking of ID given the number of records under a grouped date and find the difference between the sequential ranking of the same ID.

I have managed to derive the ranking

data_new <- data_new %>%
  group_by(Date) %>%
  mutate(my_ranks = order(order(Date, NumRecords, decreasing=FALSE)))

Results

Date  NumRecords ID my_ranks    
2019-03 3   L99887  59     
2019-04 1   L99887  25      
2019-05 1   L99887  22      
2019-06 2   L99887  46  
2019-02 6   L98849  52  
2019-04 19  L98849  108 
2019-05 18  L98849  126 
2019-06 18  L98849  116 
2019-07 17  L98849  136 
2019-08 3   L98849  30 

Not sure how to add another column representing the difference of the ID ranking with Date.

Date  NumRecords ID my_ranks  rank_diff   
2019-03 3   L99887  59          0
2019-04 1   L99887  25          34   
2019-05 1   L99887  22          3
2019-06 2   L99887  46         -24
2019-02 6   L98849  52          0
2019-04 19  L98849  108        -56  
2019-05 18  L98849  126        -18
2019-06 18  L98849  116        10
2019-07 17  L98849  136         -26
2019-08 3   L98849  30          106

Upvotes: 2

Views: 47

Answers (1)

akrun
akrun

Reputation: 887291

An option is to do a group by 'ID' and then use the diff

library(dplyr)
data_new %>%
     group_by(ID) %>%
     mutate(Rank_diff = -1 *c(0, diff(my_ranks))) 
# A tibble: 10 x 5
# Groups:   ID [2]
#   Date    NumRecords ID     my_ranks Rank_diff
#   <chr>        <int> <chr>     <int>     <dbl>
# 1 2019-03          3 L99887       59         0
# 2 2019-04          1 L99887       25        34
# 3 2019-05          1 L99887       22         3
# 4 2019-06          2 L99887       46       -24
# 5 2019-02          6 L98849       52         0
# 6 2019-04         19 L98849      108       -56
# 7 2019-05         18 L98849      126       -18
# 8 2019-06         18 L98849      116        10
# 9 2019-07         17 L98849      136       -20
#10 2019-08          3 L98849       30       106

data

data_new <- structure(list(Date = c("2019-03", "2019-04", "2019-05", "2019-06", 
"2019-02", "2019-04", "2019-05", "2019-06", "2019-07", "2019-08"
), NumRecords = c(3L, 1L, 1L, 2L, 6L, 19L, 18L, 18L, 17L, 3L), 
    ID = c("L99887", "L99887", "L99887", "L99887", "L98849", 
    "L98849", "L98849", "L98849", "L98849", "L98849"), my_ranks = c(59L, 
    25L, 22L, 46L, 52L, 108L, 126L, 116L, 136L, 30L)),
       class = "data.frame", row.names = c(NA, 
-10L))

Upvotes: 2

Related Questions