Reputation: 75
I asked a similar question some time ago but then subsequently realized that my problem was in fact more complex. Apologies for asking again.
df <- data.frame(
comp_name = c("A","A","B","B","A","A","B","B","C","C","D","D","C","C","D","D"),
country = c("US","US","US","US","US","US","US","US","France","France","France","France","France","France","France","France"),
year = c("2018","2018","2018","2018","2019","2019","2019","2019","2018","2018","2018","2018","2019","2019","2019","2019"),
type = c("profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue"),
value = c(10,20,30,40,20,30,40,50,140,150,120,130,100,110,80,90)
)
df:
comp_name country year type value
1 A US 2018 profit 10
2 A US 2018 revenue 20
3 B US 2018 profit 30
4 B US 2018 revenue 40
5 A US 2019 profit 20
6 A US 2019 revenue 30
7 B US 2019 profit 40
8 B US 2019 revenue 50
9 C France 2018 profit 140
10 C France 2018 revenue 150
11 D France 2018 profit 120
12 D France 2018 revenue 130
13 C France 2019 profit 100
14 C France 2019 revenue 110
15 D France 2019 profit 80
16 D France 2019 revenue 90
I want to add a rank column like so:
comp_name country year type value rank
1 A US 2018 profit 10
2 A US 2018 revenue 20
3 B US 2018 profit 30
4 B US 2018 revenue 40
5 A US 2019 profit 20 2
6 A US 2019 revenue 30
7 B US 2019 profit 40 1
8 B US 2019 revenue 50
9 C France 2018 profit 140
10 C France 2018 revenue 150
11 D France 2018 profit 120
12 D France 2018 revenue 130
13 C France 2019 profit 100 1
14 C France 2019 revenue 110
15 D France 2019 profit 80 2
16 D France 2019 revenue 90
I want to consider only profits in 2019 and rank the companies by their profits in each country.
When I previously asked the question, @KarthikS provided the following solution:
library(dplyr)
df %>% group_by(country) %>% mutate(rank = rank(desc(value)))
However, I've now added more variables (year and type), which I also want to consider.
Please do let me know if the question is unclear. I am new to R, and any help would be greatly appreciated. Thank you!
Upvotes: 0
Views: 53
Reputation: 160447
Calculate the ranks for all years, all types, all years, and then remove the values you don't need. (Or keep them.)
library(dplyr)
df %>%
group_by(country, year, type) %>%
mutate(rank = rank(desc(value))) %>%
ungroup() %>%
mutate(rank = if_else(year == 2019 & type == "profit", rank, NA_real_))
# # A tibble: 16 x 6
# comp_name country year type value rank
# <chr> <chr> <chr> <chr> <dbl> <dbl>
# 1 A US 2018 profit 10 NA
# 2 A US 2018 revenue 20 NA
# 3 B US 2018 profit 30 NA
# 4 B US 2018 revenue 40 NA
# 5 A US 2019 profit 20 2
# 6 A US 2019 revenue 30 NA
# 7 B US 2019 profit 40 1
# 8 B US 2019 revenue 50 NA
# 9 C France 2018 profit 140 NA
# 10 C France 2018 revenue 150 NA
# 11 D France 2018 profit 120 NA
# 12 D France 2018 revenue 130 NA
# 13 C France 2019 profit 100 1
# 14 C France 2019 revenue 110 NA
# 15 D France 2019 profit 80 2
# 16 D France 2019 revenue 90 NA
Upvotes: 1