Reputation: 123
I have a dataframe of sequential chess games:
n_game winner winner_id white_id. black_id
1 black id_2 id_1 id_2
2 white id_3 id_3 id_4
3 draw draw id_4 id_1
4 black id_2 id_7 id_2
5 draw draw id_9 id_10
6 white id_6 id_6 id_45
I want to add several columns that sequentially count the number of wins, losses, and draws of both players labeled by the color so I get each player's performance so far (before the game) as each color. So for the first row I'd want
n_game winner winner_id white_id. black_id n_wins_white n_wins_white_as_black n_draws_white ...
1 black id_2 id_1 id_2 0 0 0
and for the next game with id_2 as a player later on I would want something like
n_game winner winner_id white_id. black_id n_wins_white n_wins_white_as_black n_draws_white ...
4 black id_2 id_7 id_2 1 0 0
I'm trying to do this in a "neat" fashion, preferably with dplyr. I've tried variations of grouping by the winner_id and the winner color with the black and white ids and getting a sequential count with seq_along() but for example I end up getting the count of each time the white or black id has appeared in the winner column rather than the specific condition. I've also tried with ifelse but that seems to break the sequential count. Any help would be great.
Upvotes: 2
Views: 117
Reputation: 2115
lag
lets you count to before each round. You'll need to count white and black wins as separate operations. You need to compare the current group with all previous rows, so the whole table must be passed into the mutate
, and find the relevant rows with cur_group
. That then gives you all the rows, so you need to subset only the rows corresponding to this group with cur_group_rows
. It looks a bit neater if we abstract that out into a separate function.
library(dplyr)
get_history <- function(colour = "white", isdraw = FALSE) {
colour_win <- colour
if (isdraw) { colour_win <- "draw" }
lag(cumsum(
tab$winner == colour_win &
tab[[paste0(colour, "_id")]] == cur_group()[[1L]]), default = 0)[seq_len(nrow(tab)) %in% cur_group_rows()]
}
tab %>%
group_by(white_id) %>%
mutate(
n_wins_white_as_white = get_history(),
n_draws_white_as_white = get_history(isdraw = TRUE),
n_wins_white_as_black = get_history("black")) %>%
group_by(black_id) %>%
mutate(
n_wins_black_as_black = get_history("black"),
n_draws_black_as_black = get_history("black", isdraw = TRUE),
n_wins_black_as_white = get_history())
# # A tibble: 8 x 11
# # Groups: black_id [5]
# n_game winner winner_id white_id black_id n_wins_white_as_whi… n_draws_white_as_wh… n_wins_white_as_bla… n_wins_black_as_bl… n_draws_black_as_b… n_wins_black_as_wh…
# <int> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 black id_2 id_1 id_2 0 0 0 0 0 0
# 2 2 white id_3 id_3 id_4 0 0 0 0 0 0
# 3 3 draw draw id_4 id_1 0 0 0 0 0 0
# 4 4 black id_2 id_7 id_2 0 0 0 1 0 0
# 5 5 white id_3 id_3 id_2 1 0 0 2 0 0
# 6 6 white id_2 id_2 id_4 0 0 2 0 0 0
# 7 7 black id_3 id_1 id_3 0 0 0 0 0 2
# 8 8 draw draw id_4 id_7 0 1 0 0 0 0
Upvotes: 1