Reputation: 520
I need a subtotal for each player showing how many strikeouts he had over his career, plus a grand total.
I tried the code below, but it did not produce subtotals.
player <- c('acostma01', 'acostma01', 'acostma01', 'adkinjo01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01')
year <- c(2010,2011,2012,2007,1985,1986,1987,1988,1989)
games <- c(41,44,45,1,21,28,18,11,36)
strikeouts <- c(42,46,46,0,74,104,77,16,80)
bb_data <- data.frame(player, year, games, strikeouts, stringsAsFactors = FALSE)
Here is the code that did not work:
mets <- select(bb_data, player, year, games, strikeouts) %>%
group_by(player, year) %>%
colSums(SO)
Here is the output I would like to get:
player games strikeouts
acostma01 130 134
adkinjo01 1 0
aguilri01 0 351
Grand Total 485
Here is what I was getting (tail of data):
player team year games strikouts
<chr> <chr> <int> <int> <int>
swarzan01 NYN 2018 29 31
syndeno01 NYN 2018 25 155
vargaja01 NYN 2018 20 84
wahlbo01 NYN 2018 7 7
wheelza01 NYN 2018 29 179
zamorda01 NYN 2018 16 16
Upvotes: 0
Views: 47
Reputation: 744
If you don't mind the year column being summed as well, you can do this:
library(data.table)
data <- setDT(bb_data)[, c(lapply(.SD, sum), .N), by = player]
.N counts the number of rows per player (here, the number of years).
Then you can order the result (with a leading - to sort in decreasing order):
data[order(-data$strikeouts)]
You get this result:
1: aguilri01 9935 114 351 5
2: acostma01 6033 130 134 3
3: adkinjo01 2007 1 0 1
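If the summed year column gets in the way, one option is to restrict the sum with .SDcols. This is only a sketch under a couple of assumptions: the column name seasons is my own label for the row count, and I use as.data.table() instead of setDT() so that bb_data itself is not modified.

```r
library(data.table)

# Sample data from the question
bb_data <- data.frame(
  player = c('acostma01', 'acostma01', 'acostma01', 'adkinjo01',
             'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01'),
  year = c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989),
  games = c(41, 44, 45, 1, 21, 28, 18, 11, 36),
  strikeouts = c(42, 46, 46, 0, 74, 104, 77, 16, 80),
  stringsAsFactors = FALSE
)

# as.data.table() makes a copy, so the original data frame is left untouched
dt <- as.data.table(bb_data)

# Sum only games and strikeouts per player; "seasons" (a hypothetical name)
# holds the number of rows for that player
totals <- dt[, c(lapply(.SD, sum), list(seasons = .N)),
             by = player,
             .SDcols = c("games", "strikeouts")]

# Highest strikeout total first
totals <- totals[order(-strikeouts)]
print(totals)
```

The year column is left out of the result entirely, so there is no meaningless sum of years to ignore.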
Upvotes: 1
Reputation: 14764
You could do:
library(tidyverse)
bb_data %>%
group_by(player) %>%
summarise_at(vars(games, strikeouts), sum) %>%
add_row(player = 'Grand Total', games = NA, strikeouts = sum(.$strikeouts))
This would give you:
# A tibble: 4 x 3
player games strikeouts
<chr> <dbl> <dbl>
1 acostma01 130 134
2 adkinjo01 1 0
3 aguilri01 114 351
4 Grand Total NA 485
This is consistent with your expected output for every value except games for aguilri01 (114 here versus 0 in your table). I presume that is a typo, but let me know if it is not.
For descending order, you could do:
bb_data %>%
group_by(player) %>%
summarise_at(vars(games, strikeouts), sum) %>%
arrange(-strikeouts) %>%
add_row(player = 'Grand Total', games = NA, strikeouts = sum(.$strikeouts))
Output:
# A tibble: 4 x 3
player games strikeouts
<chr> <dbl> <dbl>
1 aguilri01 114 351
2 acostma01 130 134
3 adkinjo01 1 0
4 Grand Total NA 485
To also include the seasons played, you can try:
bb_data %>%
group_by(player) %>%
mutate(seasons_played = n_distinct(year)) %>%
group_by(player, seasons_played) %>%
summarise_at(vars(games, strikeouts), sum) %>%
arrange(-strikeouts) %>%
ungroup() %>%
add_row(player = 'Grand Total', games = NA, seasons_played = NA, strikeouts = sum(.$strikeouts))
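For what it's worth, the same result can probably be had with across(), which replaces the superseded summarise_at() in dplyr 1.0.0 and later. Treat this as a sketch rather than the one right way: the object names per_player, grand_total and result are just illustrative, and I build the total row with bind_rows() instead of add_row() so that missing columns are filled with NA automatically.

```r
library(dplyr)

# Sample data from the question
bb_data <- data.frame(
  player = c('acostma01', 'acostma01', 'acostma01', 'adkinjo01',
             'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01'),
  year = c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989),
  games = c(41, 44, 45, 1, 21, 28, 18, 11, 36),
  strikeouts = c(42, 46, 46, 0, 74, 104, 77, 16, 80),
  stringsAsFactors = FALSE
)

# One row per player: distinct seasons plus summed games and strikeouts
per_player <- bb_data %>%
  group_by(player) %>%
  summarise(
    seasons_played = n_distinct(year),
    across(c(games, strikeouts), sum)
  ) %>%
  arrange(desc(strikeouts))

# Total row; columns not given here become NA when the rows are bound
grand_total <- tibble(
  player = 'Grand Total',
  strikeouts = sum(per_player$strikeouts)
)

result <- bind_rows(per_player, grand_total)
print(result)
```

Computing the total from per_player before binding avoids the .$strikeouts idiom, which only works with the magrittr pipe.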
Upvotes: 2