crumblycloth
crumblycloth

Reputation: 75

How to add a column that ranks values?

I asked a similar question some time ago but then subsequently realized that my problem was in fact more complex. Apologies for asking again.

df <- data.frame(
  comp_name = c("A","A","B","B","A","A","B","B","C","C","D","D","C","C","D","D"),
  country = c("US","US","US","US","US","US","US","US","France","France","France","France","France","France","France","France"),
  year = c("2018","2018","2018","2018","2019","2019","2019","2019","2018","2018","2018","2018","2019","2019","2019","2019"),
  type = c("profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue"),
  value = c(10,20,30,40,20,30,40,50,140,150,120,130,100,110,80,90)
)

df:

   comp_name country year    type value
1          A      US 2018  profit    10
2          A      US 2018 revenue    20
3          B      US 2018  profit    30
4          B      US 2018 revenue    40
5          A      US 2019  profit    20
6          A      US 2019 revenue    30
7          B      US 2019  profit    40
8          B      US 2019 revenue    50
9          C  France 2018  profit   140
10         C  France 2018 revenue   150
11         D  France 2018  profit   120
12         D  France 2018 revenue   130
13         C  France 2019  profit   100
14         C  France 2019 revenue   110
15         D  France 2019  profit    80
16         D  France 2019 revenue    90

I want to add a rank column like so:

   comp_name country year    type value rank
1          A      US 2018  profit    10     
2          A      US 2018 revenue    20     
3          B      US 2018  profit    30     
4          B      US 2018 revenue    40     
5          A      US 2019  profit    20    2
6          A      US 2019 revenue    30     
7          B      US 2019  profit    40    1
8          B      US 2019 revenue    50     
9          C  France 2018  profit   140     
10         C  France 2018 revenue   150     
11         D  France 2018  profit   120     
12         D  France 2018 revenue   130     
13         C  France 2019  profit   100    1
14         C  France 2019 revenue   110     
15         D  France 2019  profit    80    2
16         D  France 2019 revenue    90     

I want to consider only profits in 2019 and rank the companies by their profits in each country.

When I previously asked the question, @KarthikS provided the following solution:

library(dplyr)
df %>% group_by(country) %>% mutate(rank = rank(desc(value)))

However, I've now added more variables (year and type), which I also want to consider.

Please do let me know if the question is unclear. I am new to R, and any help would be greatly appreciated. Thank you!

Upvotes: 0

Views: 53

Answers (1)

r2evans
r2evans

Reputation: 160447

Calculate the ranks for all years, all types, all years, and then remove the values you don't need. (Or keep them.)

library(dplyr)
df %>%
  group_by(country, year, type) %>%
  mutate(rank = rank(desc(value))) %>%
  ungroup() %>%
  mutate(rank = if_else(year == 2019 & type == "profit", rank, NA_real_))
# # A tibble: 16 x 6
#    comp_name country year  type    value  rank
#    <chr>     <chr>   <chr> <chr>   <dbl> <dbl>
#  1 A         US      2018  profit     10    NA
#  2 A         US      2018  revenue    20    NA
#  3 B         US      2018  profit     30    NA
#  4 B         US      2018  revenue    40    NA
#  5 A         US      2019  profit     20     2
#  6 A         US      2019  revenue    30    NA
#  7 B         US      2019  profit     40     1
#  8 B         US      2019  revenue    50    NA
#  9 C         France  2018  profit    140    NA
# 10 C         France  2018  revenue   150    NA
# 11 D         France  2018  profit    120    NA
# 12 D         France  2018  revenue   130    NA
# 13 C         France  2019  profit    100     1
# 14 C         France  2019  revenue   110    NA
# 15 D         France  2019  profit     80     2
# 16 D         France  2019  revenue    90    NA

Upvotes: 1

Related Questions