Reputation: 29
I have the following data set:
Name Make_Miss Half
Player A 1 1
Player B 1 1
Player A 0 2
Player A 0 1
Player A 1 1
Player B 0 2
Where Name is the player's name, Make_Miss is whether or not the player made that shot, and Half is which half the shot took place. I am currently using the following code in order to compute the first half made shots count.
Code:
dt[ , Player_First_Made := .N, by = list(dt$Name == "Player A" & dt$Half == 1 & dt$Make_Miss == 1)]
The output:
Name Make_Miss Half Player_First_Made
Player A 1 1 2
Player B 1 1 4
Player A 0 2 4
Player A 0 1 4
Player A 1 1 2
Player B 0 2 4
What is happening here is that wherever Player A has the input of 0 in the Make_Miss column, then the respective row in the Player_First_Made column gets assigned the value of the count of shots that do not match criteria in the list (i.e. Name != Player A or Half != 1 or Make_Miss != 1); however, my desire is the following:
Name Make_Miss Half Player_First_Made
Player A 1 1 2
Player B 1 1 4
Player A 0 2 2
Player A 0 1 2
Player A 1 1 2
Player B 0 2 4
I want the rows that match Name = Player A to always have the value of however many shots they made in the first half. Is there some sort of syntax for data.table that I can specify this assignment?
Upvotes: 0
Views: 789
Reputation: 76
A data.table way to do this would be:
dat[Half == 1, .(Player_First_Made = sum(Make_Miss)), .(Name)
][dat, on = c('Name')]
Where the first line counts the number of times (sum(Make_Miss)
) each player (.(Name)
) made a shot in the first half (Half == 1
).
The second line joins the resulting aggregated table from the step above back into the original dataset.
Here's the sample data I used:
dat <-
data.table(
Name = c('A', 'B'),
Make_Miss = round(runif(30, 0, 1)),
Half = round(runif(30, 1, 2))
)
Upvotes: 1
Reputation: 2510
As @chinsoon12 points out, the data you have provided don't really make sense. However, here is a method using dplyr which I think will give you what you want...
library(dplyr)
# Make some data
DATA <- data.frame(Name = c("Player A", "Player B", "Player C",
"Player A", "Player A", "Player B"), Make_Miss = c(1,1,0,0,1,0),
Half = c(1,1,2,1,2,2))
# Use dplyr to calculate the sums of 'Half' for each player
OUT <- DATA %>% group_by(Name) %>% mutate(Player_First_Made = sum(Half))
# Check the output
> OUT
# A tibble: 6 x 4
# Groups: Name [3]
Name Make_Miss Half Player_First_Made
<fct> <dbl> <dbl> <dbl>
1 Player A 1 1 4
2 Player B 1 1 3
3 Player C 0 2 2
4 Player A 0 1 4
5 Player A 1 2 4
6 Player B 0 2 3
If this isn't what you are looking for, then please edit your question to make it clearer.
Upvotes: 1