Reputation: 2867
df <- data.frame(
company = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "o", "p"),
EUR = c(1000, 700, 200, 90, 120, 200, 90, 150, 120, 210, 100, 120, 200, 50, 70)
)
df <- df %>%
mutate(company = as.character(company)) %>%
mutate(company = ifelse(row_number() > 10, "others", company)) %>%
mutate(company = as.factor(company)) %>%
group_by(company) %>%
summarise(EUR = sum(EUR, na.rm = TRUE)) %>%
arrange(desc(EUR))
df
# A tibble: 11 x 2
company EUR
<fct> <dbl>
1 a 1000
2 b 700
3 others 540
4 j 210
5 c 200
6 f 200
7 h 150
8 e 120
9 i 120
10 d 90
11 g 90
I have this pretty common task. I want to get the top10 company by spendings and summarise the other companies to "other". I know there is an option to manually reorder the rows by changing it to a factor variable and then reordering the levels, but this doesn't work because others can always be in a different place and I have to do this operation for many markets in many different countries. So "Others" should always be at the last position, no matter which line the category is in. How do I do that?
Upvotes: 3
Views: 263
Reputation: 40171
You can also try:
df %>%
arrange(company == "others", desc(EUR))
company EUR
<fct> <dbl>
1 a 1000
2 b 700
3 j 210
4 c 200
5 f 200
6 h 150
7 e 120
8 i 120
9 d 90
10 g 90
11 others 540
Upvotes: 4
Reputation: 389325
You could find out the row number of "others"
and rearrange using slice
library(dplyr)
df %>%
slice({i <- which.max(company == "others"); c(setdiff(seq_len(n()), i), i)})
# company EUR
# <fct> <dbl>
# 1 a 1000
# 2 b 700
# 3 j 210
# 4 c 200
# 5 f 200
# 6 h 150
# 7 e 120
# 8 i 120
# 9 d 90
#10 g 90
#11 others 540
Same logic in base R would be
i <- which.max(df$company == "others")
df[c(setdiff(seq_len(nrow(df)), i), i), ]
Upvotes: 1