layabout
layabout

Reputation: 123

R sequentially count appearance of value based on multiple conditions

I have a dataframe of sequential chess games:

n_game winner  winner_id white_id. black_id  
1      black   id_2      id_1      id_2
2      white   id_3      id_3      id_4
3      draw    draw      id_4      id_1
4      black   id_2      id_7      id_2
5      draw    draw      id_9      id_10
6      white   id_6      id_6      id_45

I want to add several columns that sequentially count the number of wins, losses, and draws of both players labeled by the color so I get each player's performance so far (before the game) as each color. So for the first row I'd want

n_game winner  winner_id white_id. black_id  n_wins_white  n_wins_white_as_black  n_draws_white ...
1      black   id_2      id_1      id_2       0               0                       0

and for the next game with id_2 as a player later on I would want something like

n_game winner  winner_id white_id. black_id  n_wins_white  n_wins_white_as_black  n_draws_white ...
4      black   id_2      id_7      id_2        1                0                     0

I'm trying to do this in a "neat" fashion, preferably with dplyr. I've tried variations of grouping by the winner_id and the winner color with the black and white ids and getting a sequential count with seq_along() but for example I end up getting the count of each time the white or black id has appeared in the winner column rather than the specific condition. I've also tried with ifelse but that seems to break the sequential count. Any help would be great.

Upvotes: 2

Views: 117

Answers (1)

CSJCampbell
CSJCampbell

Reputation: 2115

lag lets you count to before each round. You'll need to count white and black wins as separate operations. You need to compare the current group with all previous rows, so the whole table must be passed into the mutate, and find the relevant rows with cur_group. That then gives you all the rows, so you need to subset only the rows corresponding to this group with cur_group_rows. It looks a bit neater if we abstract that out into a separate function.

library(dplyr)
get_history <- function(colour = "white", isdraw = FALSE) {
    colour_win <- colour
    if (isdraw) { colour_win <- "draw" }
    lag(cumsum(
        tab$winner == colour_win & 
        tab[[paste0(colour, "_id")]] == cur_group()[[1L]]), default = 0)[seq_len(nrow(tab)) %in% cur_group_rows()]
}
tab %>%  
    group_by(white_id) %>% 
    mutate(
        n_wins_white_as_white = get_history(),
        n_draws_white_as_white = get_history(isdraw = TRUE),
        n_wins_white_as_black = get_history("black")) %>% 
    group_by(black_id) %>% 
    mutate(
        n_wins_black_as_black = get_history("black"),
        n_draws_black_as_black = get_history("black", isdraw = TRUE),
        n_wins_black_as_white = get_history())
# # A tibble: 8 x 11
# # Groups:   black_id [5]
#   n_game winner winner_id white_id black_id n_wins_white_as_whi… n_draws_white_as_wh… n_wins_white_as_bla… n_wins_black_as_bl… n_draws_black_as_b… n_wins_black_as_wh…
#    <int> <chr>  <chr>     <chr>    <chr>                   <dbl>                <dbl>                <dbl>               <dbl>               <dbl>               <dbl>
# 1      1 black  id_2      id_1     id_2                        0                    0                    0                   0                   0                   0
# 2      2 white  id_3      id_3     id_4                        0                    0                    0                   0                   0                   0
# 3      3 draw   draw      id_4     id_1                        0                    0                    0                   0                   0                   0
# 4      4 black  id_2      id_7     id_2                        0                    0                    0                   1                   0                   0
# 5      5 white  id_3      id_3     id_2                        1                    0                    0                   2                   0                   0
# 6      6 white  id_2      id_2     id_4                        0                    0                    2                   0                   0                   0
# 7      7 black  id_3      id_1     id_3                        0                    0                    0                   0                   0                   2
# 8      8 draw   draw      id_4     id_7                        0                    1                    0                   0                   0                   0

Upvotes: 1

Related Questions