Reputation: 145
I have a dataset with 400K observations and 250 features, and I would like to perform stratified sampling on it.
I have looked at many links, but their examples all stratify on only one or two variables, including the target.
Can anybody please help me with how to perform stratified sampling using R or Python?
Thanks in advance!
Upvotes: -1
Views: 429
Reputation: 27762
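If you first group your data.frame by the variable(s) that define the strata, you can sample each group using dplyr's sample_n():
library(dplyr)
# ID stands in for your stratification column(s); this draws 10 rows from each group
sample.df <- df %>% group_by( ID ) %>% sample_n( 10 )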
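Since the question also asks about Python, here is a minimal sketch of the same group-then-sample idea using only the standard library. The function name `stratified_sample`, its parameters, and the toy columns (`target`, `region`, `x`) are made up for illustration and are not from the question. It assumes the rows fit in memory as a list of dicts and that you want the same fraction from every stratum.

```python
import random
from collections import defaultdict


def stratified_sample(rows, strata_keys, frac, seed=0):
    """Draw the same fraction of rows from every stratum.

    rows        -- list of dicts, one per observation
    strata_keys -- columns whose combined values define a stratum
    frac        -- fraction of each stratum to keep (0 < frac <= 1)
    seed        -- seed for the random generator, for reproducibility
    """
    rng = random.Random(seed)

    # Bucket the rows by the tuple of their stratification values.
    strata = defaultdict(list)
    for row in rows:
        strata[tuple(row[k] for k in strata_keys)].append(row)

    sample = []
    for members in strata.values():
        # Keep at least one row so that small strata are not lost.
        n = max(1, round(len(members) * frac))
        sample.extend(rng.sample(members, n))
    return sample


# Toy data (hypothetical): 80 rows with target 0, 20 with target 1,
# split evenly across two regions.
rows = [
    {"target": 0 if i < 80 else 1,
     "region": "north" if i % 2 == 0 else "south",
     "x": i}
    for i in range(100)
]

# Stratify on two columns at once and keep 10% of each stratum.
picked = stratified_sample(rows, ["target", "region"], frac=0.1)
```

Grouping on a tuple of columns is what lets this go beyond the one-variable case: pass as many column names in `strata_keys` as you need. Stratifying on many of the 250 features at once will produce lots of tiny strata, though, so in practice you would probably pick a handful of them. If you already have the data in a pandas DataFrame, `df.groupby(cols).sample(frac=0.1)` (pandas 1.1 or later) does a comparable per-group draw.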
Upvotes: 0