Reputation: 19
I have a large healthcare dataset called oct:
Providers  ID    date  ICD
Billy      4504  9/11  f.11
Billy      5090  9/10  r.05
Max        4430  9/01  k.11
Mindy      0812  9/30  f.11
etc.
I want a random sample of ID numbers for each provider. I have tried:
review <- oct %>% group_by(Providers) %>% do (sample(oct$ID, size = 5, replace= FALSE, prob = NULL))
Upvotes: 1
Views: 1934
Reputation: 5405
Here is an example using dplyr::sample_n, which draws a fixed number of rows from each group:
library(dplyr)
set.seed(1)
mtcars %>% group_by(cyl) %>% sample_n(3)
# A tibble: 9 x 11
# Groups:   cyl [3]
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
2  32.4     4  78.7    66  4.08   2.2  19.5     1     1     4     1
3  33.9     4  71.1    65  4.22  1.84  19.9     1     1     4     1
4  19.7     6   145   175  3.62  2.77  15.5     0     1     5     6
5    21     6   160   110   3.9  2.88  17.0     0     1     4     4
6  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
7    15     8   301   335  3.54  3.57  14.6     0     1     5     8
8  15.5     8   318   150  2.76  3.52  16.9     0     0     3     2
9  14.7     8   440   230  3.23  5.34  17.4     0     0     3     4
If you'd like to return just one variable (ID in your question), add pull() at the end. Here mpg stands in for ID:
set.seed(1)
mtcars %>%
  group_by(cyl) %>%
  sample_n(3) %>%
  pull(mpg)
[1] 22.8 32.4 33.9 19.7 21.0 19.2 15.0 15.5 14.7
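Applied to the data in the question, the same idea might look like the sketch below. It assumes dplyr 1.0.0 or later, where slice_sample() supersedes sample_n(). The small oct data frame is a made-up stand-in (only the first rows per provider come from the question), and the sample size of 2 is chosen only so the toy data is large enough.

```r
library(dplyr)

# Hypothetical stand-in for `oct`; IDs 1111 and 2222 are invented for illustration.
oct <- data.frame(
  Providers = c("Billy", "Billy", "Billy", "Max", "Max", "Mindy"),
  ID        = c("4504", "5090", "1111", "4430", "2222", "0812"),
  date      = c("9/11", "9/10", "9/12", "9/01", "9/02", "9/30"),
  ICD       = c("f.11", "r.05", "k.11", "k.11", "f.11", "f.11"),
  stringsAsFactors = FALSE
)

set.seed(1)  # make the random draw reproducible

review <- oct %>%
  group_by(Providers) %>%
  slice_sample(n = 2) %>%    # at most 2 rows per provider, without replacement
  ungroup() %>%
  select(Providers, ID)      # keep only the provider and the sampled ID

print(review)
```

One difference worth knowing: sample_n() raises an error when a group has fewer rows than requested, whereas slice_sample(n = ...) silently returns all rows of such a group (Mindy has a single row here, so she contributes one ID).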
Upvotes: 4