R-Randomize column according to IDs in another column

Question

I have a data frame with several columns. Say it looks like the following:

 ID percent region
 1    5       1
 1    8       2
 1    10      3
 1    100     4
 2    20      1
 2    6       2
 2    9       3
 2    1       4
 3    9       1
 3    78      2
 3    56      3
 3    99      4
 4    1       1
 4    1       2
 4    8       3

I need to randomize the "percent" column of the dataset, but the values (and the order of the values) need to stay the same within each individual's block (given by ID). The "region" column and any other columns remain as they are; only "percent" should be randomized. For example:

 ID percent region
 2    20      1
 2    6       2
 2    9       3
 2    1       4
 4    1       1
 4    1       2
 4    8       3
 1    5       1
 1    8       2
 1    10      3
 1    100     4
 3    9       1
 3    78      2
 3    56      3
 3    99      4

Note that the order of values within IDs of "percent" remains the same.

akrun · Accepted Answer

We can get the distinct 'ID' values, sample them, extract the subset of the dataset matching each sampled 'ID', and bind the subsets together (map_df).

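library(tidyverse)
df1 %>%
  distinct(ID) %>%                     # one row per ID
  pull(ID) %>%                         # extract the IDs as a plain vector
  sample %>%                           # shuffle the IDs
  map_df(~ df1 %>% filter(ID == .x))   # stack each ID's rows in the shuffled order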

Or a faster option would be to split by 'ID', rearrange the list elements by sampling the names of the list, and bind the rows (bind_rows).

df1 %>%
   split(.$ID) %>%
   .[sample(names(.))] %>%
   bind_rows

Or we can use base R with the same approach as above:

lst <- split(df1, df1$ID)
df2 <- do.call(rbind, lst[sample(names(lst))])
row.names(df2) <- NULL
df2
#    ID percent region
#1   4       1      1
#2   4       1      2
#3   4       8      3
#4   3       9      1
#5   3      78      2
#6   3      56      3
#7   3      99      4
#8   2      20      1
#9   2       6      2
#10  2       9      3
#11  2       1      4
#12  1       5      1
#13  1       8      2
#14  1      10      3
#15  1     100      4
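
The snippets above assume a data frame named df1 that is never defined in the thread. As a minimal, self-contained sketch, the block below rebuilds it from the question's table and shuffles the ID blocks with a different base R technique (order() on match()) instead of split(). The seed and the names shuffled_ids and df2 are only illustrative choices, and the result will differ from the output printed above because the shuffle is random.

```r
# Rebuild the example data from the question (df1 is the name the answer assumes).
df1 <- data.frame(
  ID      = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4),
  percent = c(5, 8, 10, 100, 20, 6, 9, 1, 9, 78, 56, 99, 1, 1, 8),
  region  = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3)
)

set.seed(42)  # arbitrary seed, only to make the shuffle repeatable

# Draw a random order of the distinct IDs, then rank each row by where its ID
# falls in that order. order() leaves ties in their original order, so the
# rows inside each ID block keep their sequence.
shuffled_ids <- sample(unique(df1$ID))
df2 <- df1[order(match(df1$ID, shuffled_ids)), ]
row.names(df2) <- NULL

# Sanity checks: blocks follow the sampled ID order, and each block is unchanged.
stopifnot(identical(unique(df2$ID), shuffled_ids))
for (id in unique(df1$ID)) {
  stopifnot(identical(df2$percent[df2$ID == id], df1$percent[df1$ID == id]))
}
```

One caveat with any of these approaches: if the data contains only a single numeric ID, sample() treats that number as an upper bound rather than as a one-element vector, so that case may need separate handling.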
