Thomas Moore

Reputation: 981

R How To Perform "Specific" Sampling

I have a dataset that essentially shows baseball players and their positions:

*Player       Position*
John Smith       P
Fred Smith      1B
Al Johnson      2B

And so on. I would like to randomly sample 9 players at a time from this dataset, which on its own would just be a call to R's sample() function. However, in each sample I would like exactly ONE player per position, i.e., 1 P, 1 1B, 1 2B, and so on.

How could I do this?

Thanks.

Upvotes: 0

Views: 46

Answers (2)

Rich Scriven

Reputation: 99371

You haven't given us much information here, but I would group the data by position and then take a sample of 1 from each group. data.table would be my go-to for this:

library(data.table)
setDT(data)

data[, sample(Player, 1), by = Position]

However, in baseball, outfielders are typically all grouped together as one position, "OF". In that case you would have to sample 3 players from "OF" and 1 from each of the other positions. You could use an if() expression in the size argument of sample() for this scenario:

data[, sample(Player, if(Position == "OF") 3 else 1), by = Position]
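To make the quota idea concrete, here is a minimal sketch on a made-up roster. The player names, the set of positions, and the group sizes are all assumptions for illustration, not the asker's data. It relies on the fact that, inside j, a by column such as Position has length 1 for each group.

```r
library(data.table)

# Hypothetical roster: 7 position labels, 4 made-up players each.
set.seed(1)
positions <- c("P", "C", "1B", "2B", "3B", "SS", "OF")
data <- data.table(
  Player   = paste0("Player", 1:28),
  Position = rep(positions, each = 4)
)

# Per-position quota: 3 outfielders, 1 of everything else.
# Position is a single value per group because it is the by column.
lineup <- data[, .(Player = sample(Player, if (Position == "OF") 3 else 1)),
               by = Position]

nrow(lineup)                 # 6 single positions + 3 outfielders
lineup[, .N, by = Position]  # count drawn per position
```

Note that sample() would raise an error for any position with fewer players than its quota, so a real roster needs at least 3 rows labelled "OF".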

Upvotes: 1

dave-edison

Reputation: 3736

Here is a solution using dplyr, assuming df is your dataframe:

library(dplyr)

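df %>%
    group_by(Position) %>%   # one group per position
    sample_n(1) %>%          # draw one random player from each group
    ungroup() %>%
    sample_n(9)              # keep 9 of those rows (needs at least 9 distinct positions)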
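As a side note, sample_n() has been superseded by slice_sample() since dplyr 1.0.0. Below is a sketch of the same grouping idea with the newer verb, run on a made-up data frame (the names and the nine position labels are assumptions). With exactly nine positions, the final sample_n(9) step is not needed, because one row per position already gives nine players.

```r
library(dplyr)

# Hypothetical data: 9 positions, 3 made-up players each.
set.seed(42)
positions <- c("P", "C", "1B", "2B", "3B", "SS", "LF", "CF", "RF")
df <- data.frame(
  Player   = paste("Player", 1:27),
  Position = rep(positions, times = 3)
)

lineup <- df %>%
  group_by(Position) %>%
  slice_sample(n = 1) %>%   # one random player per position
  ungroup()

nrow(lineup)                 # one row per position
n_distinct(lineup$Position)  # every position appears once
```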

Upvotes: 2
