Jacob
Jacob

Reputation: 329

R Count instances of victory by group

I have a dataframe 'df' where I want to summarize how many times each 'user' has a higher 'total' value for each head-to-head 'game'. My data frame looks like this:

game user total
1 L 55
1 J 60
2 L 64
2 J 77
3 L 90
3 J 67
4 L 98
4 J 88
5 L 71
5 J 92

The summary would state that L had a higer total in 2 games and J had a higher total in 3 games.

Thank you!

Upvotes: 1

Views: 94

Answers (4)

jared_mamrot
jared_mamrot

Reputation: 26695

Is this the type of output you want?

library(tidyverse)

df <- structure(list(game = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L),
                     user = c("L", "J", "L", "J", "L", "J", "L", "J", "L", "J"), 
                     total = c(55L, 60L, 64L, 77L, 90L, 67L, 98L, 88L, 71L, 92L)
                     ), class = "data.frame", row.names = c(NA, -10L))
df %>%
  group_by(game) %>%
  slice_max(order_by = total,
            n = 1,
            with_ties = TRUE) %>%
  group_by(user) %>%
  summarise(wins = n())
#> # A tibble: 2 × 2
#>   user  wins
#>   <chr> <int>
#> 1 J         3
#> 2 L         2

Created on 2022-01-20 by the reprex package (v2.0.1)

Edit

If you have a draw, then the above method counts that as a 'win' for both users. To count a draw as 'no winner' for both users (e.g. shown in game 1, below), you could use:

library(tidyverse)

df <- structure(list(game = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L),
                     user = c("L", "J", "L", "J", "L", "J", "L", "J", "L", "J"), 
                     total = c(55L, 55L, 64L, 77L, 90L, 67L, 98L, 88L, 71L, 92L)
                     ), class = "data.frame", row.names = c(NA, -10L))

df
#>    game user total
#> 1     1    L    55
#> 2     1    J    55
#> 3     2    L    64
#> 4     2    J    77
#> 5     3    L    90
#> 6     3    J    67
#> 7     4    L    98
#> 8     4    J    88
#> 9     5    L    71
#> 10    5    J    92

df %>%
  group_by(game) %>%
  distinct(total, .keep_all = TRUE) %>%
  filter(n() >= 2) %>%
  slice_max(order_by = total,
            n = 1,
            with_ties = FALSE) %>%
  group_by(user) %>%
  summarise(win = n())
#> # A tibble: 2 × 2
#>   user    win
#>   <chr> <int>
#> 1 J         2
#> 2 L         2

Created on 2022-01-20 by the reprex package (v2.0.1)

Upvotes: 1

jpdugo17
jpdugo17

Reputation: 7116

We can group the data by game, slice_max and then count the resulting data.

library(tidyverse)

df %>% group_by(game) %>%
    slice_max(total) %>% 
    ungroup() %>% 
    count(user)
#> # A tibble: 2 × 2
#>   user      n
#>   <chr> <int>
#> 1 J         3
#> 2 L         2

Created on 2022-01-20 by the reprex package (v2.0.1)

Note that if there's a tie, it will add one to both teams:

library(tidyverse)
df <- 
read_table('game    user    total
1   L   60
1   J   60
2   L   64
2   J   77
3   L   90
3   J   67
4   L   98
4   J   88
5   L   71
5   J   92')

df %>% group_by(game) %>%
    slice_max(total) %>% 
    ungroup() %>% 
    count(user)
#> # A tibble: 2 × 2
#>   user      n
#>   <chr> <int>
#> 1 J         3
#> 2 L         3

Created on 2022-01-20 by the reprex package (v2.0.1)

Upvotes: 1

IceCreamToucan
IceCreamToucan

Reputation: 28705

Same approach as Vinay, using data.table

library(data.table)
setDT(df)

df[order(total), tail(.SD, 1), game][, .N, user]
#>      user     N
#>    <char> <int>
#> 1:      J     3
#> 2:      L     2

Created on 2022-01-19 by the reprex package (v2.0.1)

Data used:

df <- structure(list(game = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L
), user = c("L", "J", "L", "J", "L", "J", "L", "J", "L", "J"), 
    total = c(55L, 60L, 64L, 77L, 90L, 67L, 98L, 88L, 71L, 92L
    )), row.names = c(NA, -10L), class = "data.frame")

Upvotes: 2

Vinay
Vinay

Reputation: 253

Assuming df is your dataframe the following should give you the long form summary.

df %>% 
  arrange(game,desc(total)) %>%  #we sort descending to ensure winner row is first.
  group_by(game) %>% # we group the rows per game, this allows for winner row to be first in each group
  slice_head(n=1)%>% #get first row in each group i.e winner row
  ungroup()

Output:

# A tibble: 5 × 3
   game user  total
  <int> <chr> <int>
1     1 J        60
2     2 J        77
3     3 L        90
4     4 L        98
5     5 J        92

If you just want the user wise summary add count to the code as follows:

df %>% 
  arrange(game,desc(total)) %>%  #we sort descending to ensure winner row is first.
  group_by(game) %>% # we group the rows per game, this allows for winner row to be first in each group
  slice_head(n=1) %>%  #get first row in each group i.e winner row
  ungroup() %>% 
  count(user)

Output:

# A tibble: 2 × 2
  user      n
  <chr> <int>
1 J         3
2 L         2

Upvotes: 1

Related Questions