Reputation: 903
I'm trying to get the first 2 names per group, in alphabetical order. I expected top_n()
to select them after a group_by(), but that does not seem to be the case. This code shows the problem:
library(dplyr)

df <- data.frame(Group = c(0, 0, 0, 1, 1, 1),
                 Name = c("a", "c", "b", "e", "d", "f"))
df <- df %>%
  arrange(Name, Group) %>%
  group_by(Group) %>%
  top_n(2)
df
# A tibble: 2 x 2
# Groups: Group [1]
Group Name
<dbl> <chr>
1 1 e
2 1 f
Expected output would be:
Group Name
1 0 a
2 0 b
3 1 d
4 1 e
Or something similar. Thanks.
Upvotes: 0
Views: 577
Reputation: 886938
We can sort by Group and Name, then keep the first two rows of each group:
library(dplyr)
df %>%
  arrange(Group, Name) %>%
  group_by(Group) %>%
  filter(row_number() < 3)
Upvotes: 0
Reputation: 388817
top_n() selects the top n max values. You seem to need the top n min values. You can pass a negative n to get that. Additionally, you don't need to arrange the data when using top_n().
library(dplyr)
df %>% group_by(Group) %>% top_n(-2, Name)
# Group Name
# <dbl> <chr>
#1 0 a
#2 0 b
#3 1 e
#4 1 d
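To see why the code in the question returned the largest names, it may help to compare the two signs side by side. This is only a sketch, reusing the df from the question; the names largest and smallest are mine, and stringsAsFactors = FALSE is added so that Name is a character column on older R versions too. When wt is omitted, as in the question, top_n() falls back to the last column (here Name).

```r
library(dplyr)

df <- data.frame(Group = c(0, 0, 0, 1, 1, 1),
                 Name  = c("a", "c", "b", "e", "d", "f"),
                 stringsAsFactors = FALSE)

# Positive n keeps the n largest values of wt within each group.
largest <- df %>% group_by(Group) %>% top_n(2, Name) %>% ungroup()

# Negative n keeps the n smallest values of wt within each group.
smallest <- df %>% group_by(Group) %>% top_n(-2, Name) %>% ungroup()

sort(largest$Name)   # "b" "c" "e" "f"
sort(smallest$Name)  # "a" "b" "d" "e"
```

Note that top_n() keeps the rows in their original order, which is why e appears before d in the output above.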
Another way is to arrange the data and select the first two rows in each group.
df %>% arrange(Group, Name) %>% group_by(Group) %>% slice(1:2)
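If you are on dplyr 1.0.0 or later, top_n() is marked as superseded in favour of slice_min() and slice_max(). A minimal sketch of the same selection with slice_min(), assuming that version is installed and reusing the df from the question; with_ties = FALSE is my addition, so that exactly two rows per group come back even if a name is repeated.

```r
library(dplyr)

df <- data.frame(Group = c(0, 0, 0, 1, 1, 1),
                 Name  = c("a", "c", "b", "e", "d", "f"),
                 stringsAsFactors = FALSE)

# Keep the two alphabetically smallest names in each group.
first_two <- df %>%
  group_by(Group) %>%
  slice_min(Name, n = 2, with_ties = FALSE) %>%
  ungroup()

first_two
```

Unlike top_n(), this needs no negative n, which makes the intent a little easier to read.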
Upvotes: 1