Reputation: 981
I have a dataset that essentially shows baseball players and their positions:
*Player Position*
John Smith P
Fred Smith 1B
Al Johnson 2B
And so on. I would like to randomly sample 9 players at a time from this dataset, which on its own is easy with R's sample() function. However, each sample should contain exactly ONE player per position, i.e., 1 P, 1 1B, 1 2B, and so on.
How could I do this?
Thanks.
Upvotes: 0
Views: 46
Reputation: 99371
You haven't given us much information here, but I would group the data by position and then take a sample of 1 from each group. data.table would be my go-to for this:
library(data.table)
setDT(data)  # convert the data.frame to a data.table by reference
# one randomly chosen player per position
data[, sample(Player, 1), by = Position]
However, in baseball, outfielders are typically all grouped together as one position, "OF". In this case you would have to sample "OF" 3 times and all the others just 1. You could use an if() statement in the size argument for this scenario.
data[, sample(Player, if(Position == "OF") 3 else 1), by = Position]
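For anyone who wants to try this without the original data, here is a minimal sketch on a made-up roster. The `roster` table, the player names, and the seed are all assumptions for illustration; only the grouping call mirrors the line above.

```r
library(data.table)

set.seed(42)  # only to make the sketch reproducible

# Hypothetical roster: two candidates for each of six positions, plus seven outfielders
roster <- data.table(
  Player   = paste("Player", 1:19),
  Position = c(rep(c("P", "C", "1B", "2B", "3B", "SS"), each = 2), rep("OF", 7))
)

# Within each group, Position is a length-1 value, so if() can branch on it:
# draw 3 outfielders and 1 player for every other position
lineup <- roster[, .(Player = sample(Player, if (Position == "OF") 3 else 1)),
                 by = Position]

print(lineup)
```

Note that sample() will stop with an error if a group has fewer players than the requested size, for example fewer than 3 rows with Position "OF".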
Upvotes: 1
Reputation: 3736
Here is a solution using dplyr, assuming df is your dataframe:
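library(dplyr)
df %>%
  group_by(Position) %>%  # one group per position
  sample_n(1) %>%         # pick one random player in each group
  ungroup() %>%
  sample_n(9)             # keep 9 of those picks (errors if fewer than 9 positions)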
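The same idea can be sketched with slice_sample(), which superseded sample_n() in dplyr 1.0.0. This is a swapped-in variant rather than the code above, and the data frame below is made up for illustration, assuming exactly nine distinct positions.

```r
library(dplyr)

set.seed(1)  # only to make the sketch reproducible

# Hypothetical data: nine positions with three candidates each
positions <- c("P", "C", "1B", "2B", "3B", "SS", "LF", "CF", "RF")
df <- data.frame(
  Player   = paste("Player", 1:27),
  Position = rep(positions, each = 3)
)

# One random player per position; with nine positions this is already a full lineup
lineup <- df %>%
  group_by(Position) %>%
  slice_sample(n = 1) %>%
  ungroup()

print(lineup)
```

Running the pipeline again (without resetting the seed) gives a fresh lineup each time, which covers the "9 players at a time" part of the question.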
Upvotes: 2