Reputation: 79
I have a data frame that I am trying to filter so that some of the rows are removed. The df looks like this:
Event  Name  Team  Rank
1      Mike  B     1
1      Joe   A     2
1      Tom   C     3
1      Bill  B     4
2      Joe   A     1
2      Tom   C     2
...
I am trying to filter the data so I only have 3 events per person (by their best rank) and 18 people per team.
I was able to get 3 events per person using:
df <- df %>%
  group_by(Name) %>%
  top_n(-3, Rank)
but the 18 people per team is tripping me up. Do I need to group_by Team and Name? If so, how? Nothing I've tried has worked.
Also, I would prefer not to have ties, but that is minor right now.
Edit: this is a large df but here is the structure:
structure(list(event = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), name = structure(c(22L,
16L, 28L, 27L, 17L, 21L, 3L, 2L, 8L, 13L, 15L, 28L, 5L, 16L,
17L, 2L, 22L, 3L, 10L, 21L, 5L, 15L, 24L, 29L, 1L, 2L, 18L, 25L,
7L, 21L, 29L, 19L, 25L, 18L, 9L, 23L, 14L, 4L, 29L, 6L, 29L,
19L, 9L, 26L, 25L, 14L, 4L, 11L, 20L, 12L), .Label = c("Andreas",
"Andrej", "Blaise", "Brendan", "Coleman", "Colton", "Cooper",
"Corben", "Eric", "Giovanni", "Graham", "Hayden", "Ian", "Jack",
"Jacob", "Justin", "Kanoa", "Lane", "Marcelo", "Matthew", "Miles",
"Nyls", "Robby", "Rodrigo", "Sadler", "T.C.", "Thomas", "Will",
"Zach"), class = "factor"), team = structure(c(1L, 1L, 2L, 3L,
2L, 4L, 5L, 6L, 7L, 3L, 1L, 2L, 1L, 1L, 2L, 6L, 1L, 5L, 1L, 4L,
1L, 1L, 7L, 9L, 1L, 6L, 3L, 9L, 8L, 4L, 9L, 6L, 9L, 3L, 1L, 8L,
1L, 8L, 6L, 7L, 9L, 6L, 1L, 6L, 9L, 1L, 8L, 6L, 8L, 6L), .Label = c("A",
"B", "C", "D", "E", "F", "G", "H", "J"), class = "factor"), rank = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 8L, 10L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L)), class = "data.frame", row.names = c(NA, -50L
))
Upvotes: 0
Views: 295
Reputation: 13309
This?
library(tidyverse)
df %>%
  arrange(team, desc(rank)) %>%
  group_by(event, team) %>%
  top_n(3, rank)
Current Output:
   event name     team   rank
   <int> <fct>    <fct> <int>
 1     2 Giovanni A         9
 2     2 Nyls     A         7
 3     4 Jack     A         7
 4     6 Jack     A         6
 5     3 Andreas  A         5
 6     4 Eric     A         5
 7     2 Justin   A         4
 8     6 Eric     A         3
 9     1 Justin   A         2
10     3 Jacob    A         2
Test:
df %>%
  arrange(team, desc(rank)) %>%
  group_by(name, team) %>%
  top_n(3, rank) %>%
  filter(name == "Justin")
  event name   team   rank
  <int> <fct>  <fct> <int>
1     2 Justin A         4
2     1 Justin A         2
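One caveat on the snippets above: top_n(3, rank) keeps the three largest rank values, whereas the question asks for each person's best (lowest) ranks and would prefer no ties. A minimal sketch of one way to do that, assuming dplyr 1.0.0 or later, where slice_min() is available. The `toy` data frame is hypothetical and only there to show the tie handling:

```r
library(dplyr)

# Hypothetical toy data (not the asker's df): one person, three results tied at rank 4.
toy <- data.frame(
  event = c(1L, 2L, 3L, 4L),
  name  = c("Justin", "Justin", "Justin", "Justin"),
  team  = c("A", "A", "A", "A"),
  rank  = c(2L, 4L, 4L, 4L)
)

# Keep each person's three lowest ranks; with_ties = FALSE caps the
# result at exactly n rows per group even when ranks are tied.
best3 <- toy %>%
  group_by(name) %>%
  slice_min(rank, n = 3, with_ties = FALSE) %>%
  ungroup()
```

With ties kept (the default, and what top_n(-3, rank) does), all four rows would come back here because three results share rank 4.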
Upvotes: 0
Reputation: 2101
Something like this should work
df %>%
  group_by(name, team) %>%
  filter(row_number() <= 18)
@NelsonGon's comment advised grouping by both at once, which appears to give the same results more concisely.
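Note that filter(row_number() <= 18) inside group_by(name, team) caps the rows per name/team pair, not the number of distinct people per team. If the goal is a cap on people, one possible approach is to pick the people first and then keep their rows. This is only a sketch under assumptions: the question does not say how the 18 people should be chosen, so here they are ranked by their single best rank, and `toy`, `max_people`, and `keep` are hypothetical names:

```r
library(dplyr)

# Hypothetical toy data; column names follow the dput() in the question.
toy <- data.frame(
  event = c(1L, 1L, 1L, 2L, 2L, 2L),
  name  = c("Ann", "Bob", "Cy", "Ann", "Bob", "Cy"),
  team  = c("A", "A", "A", "A", "A", "A"),
  rank  = c(1L, 2L, 3L, 3L, 1L, 2L)
)

max_people <- 2  # the question uses 18; 2 keeps the toy example small

# Step 1: one row per person per team, holding that person's best rank.
# Step 2: within each team, keep the max_people best-placed people
#         (assumption: "best" means lowest single rank; ties are dropped).
keep <- toy %>%
  group_by(team, name) %>%
  summarise(best = min(rank), .groups = "drop") %>%
  group_by(team) %>%
  slice_min(best, n = max_people, with_ties = FALSE) %>%
  ungroup()

# Step 3: keep only the rows that belong to the retained people.
result <- toy %>%
  semi_join(keep, by = c("team", "name"))
```

The per-person limit of three events can be applied before or after this step with a separate group_by(name) call.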
Upvotes: 1