chazmatazz

Reputation: 133

Shuffling a dataframe in R with no repeats

Would anyone happen to know how to script a shuffle of a dataset in R, such that if I have 25 numbers (5 rows x 5 columns) in a dataframe, and I shuffle 25 separate times, each number appears in each location exactly one time?

Thus it's not entirely random, at least not after the first shuffle, as the potential locations of any number decrease with each shuffle.

Thank you!

Upvotes: 0

Views: 367

Answers (2)

Iaroslav Domin

Reputation: 2718

I'll demonstrate the solution on a 3 by 3 dataset. The first thing I would do is convert the data.frame to a matrix, so that permutations are easy to apply.

Let's say we have a 3x3 matrix:

set.seed(1)
m <- matrix(sample(1:100, 9), nrow = 3)
m
#>      [,1] [,2] [,3]
#> [1,]   68   34   14
#> [2,]   39   87   82
#> [3,]    1   43   59

Then each shuffle can be defined by a permutation of numbers 1 to 9.

shuffle <- c(9, 4, 7, 1, 8, 3, 2, 5, 6)
matrix(m[shuffle], nrow = 3)
#>      [,1] [,2] [,3]
#> [1,]   59   68   39
#> [2,]   34   82   87
#> [3,]   14    1   43

So our task is to generate 9 such permutations in which each number occurs in each position exactly once. E.g. if the first shuffle is c(9, 4, 7, 1, 8, 3, 2, 5, 6), we can't then use c(9, 2, 7, 3, 8, 5, 4, 6, 1), because 9 has already been in the first place, 7 in the third and 8 in the fifth.

Basically what we need is a 9 by 9 Latin square. Fortunately, there is a package for such things:

library(magic)
#> Loading required package: abind
set.seed(1)
shuffles_matrix <- rlatin(9)
shuffles_matrix
#>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#>  [1,]    6    5    4    2    3    9    8    1    7
#>  [2,]    4    2    7    6    9    8    1    3    5
#>  [3,]    8    3    1    5    2    7    9    4    6
#>  [4,]    5    1    9    7    6    2    4    8    3
#>  [5,]    3    6    5    1    8    4    7    9    2
#>  [6,]    9    7    8    3    1    6    5    2    4
#>  [7,]    7    9    3    4    5    1    2    6    8
#>  [8,]    2    8    6    9    4    5    3    7    1
#>  [9,]    1    4    2    8    7    3    6    5    9
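
A quick way to sanity-check a square like this is sketched below in base R. The helper name is_latin_square is my own (hypothetical, not part of magic), and the two small test squares are made up for illustration; the same call should work on shuffles_matrix.

```r
# Hypothetical helper: TRUE if every row and every column of `sq`
# is a permutation of 1..n, where n is the number of rows.
is_latin_square <- function(sq) {
  n <- nrow(sq)
  if (ncol(sq) != n) return(FALSE)
  is_perm <- function(v) identical(sort(as.integer(v)), seq_len(n))
  all(apply(sq, 1, is_perm)) && all(apply(sq, 2, is_perm))
}

ok  <- rbind(c(1, 2, 3), c(2, 3, 1), c(3, 1, 2))
bad <- rbind(c(1, 2, 3), c(1, 3, 2), c(3, 1, 2))  # 1 appears twice in column 1

is_latin_square(ok)
#> [1] TRUE
is_latin_square(bad)
#> [1] FALSE
```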

Now we can treat each row of this square as a shuffle of our original 3x3 matrix:

shuffles <- split(shuffles_matrix, 1:9)
shuffles
#> $`1`
#> [1] 6 5 4 2 3 9 8 1 7
#> 
#> $`2`
#> [1] 4 2 7 6 9 8 1 3 5
#> 
#> $`3`
#> [1] 8 3 1 5 2 7 9 4 6
#> 
#> $`4`
#> [1] 5 1 9 7 6 2 4 8 3
#> 
#> $`5`
#> [1] 3 6 5 1 8 4 7 9 2
#> 
#> $`6`
#> [1] 9 7 8 3 1 6 5 2 4
#> 
#> $`7`
#> [1] 7 9 3 4 5 1 2 6 8
#> 
#> $`8`
#> [1] 2 8 6 9 4 5 3 7 1
#> 
#> $`9`
#> [1] 1 4 2 8 7 3 6 5 9

And this is how we apply these shuffles to the matrix:

library(purrr)
shuffles %>% 
  map(~matrix(m[.], nrow = 3))
#> $`1`
#>      [,1] [,2] [,3]
#> [1,]   43   39   82
#> [2,]   87    1   68
#> [3,]   34   59   14
#> 
#> $`2`
#>      [,1] [,2] [,3]
#> [1,]   34   43   68
#> [2,]   39   59    1
#> [3,]   14   82   87
#> 
#> $`3`
#>      [,1] [,2] [,3]
#> [1,]   82   87   59
#> [2,]    1   39   34
#> [3,]   68   14   43
#> 
#> $`4`
#>      [,1] [,2] [,3]
#> [1,]   87   14   34
#> [2,]   68   43   82
#> [3,]   59   39    1
#> 
#> $`5`
#>      [,1] [,2] [,3]
#> [1,]    1   68   14
#> [2,]   43   82   59
#> [3,]   87   34   39
#> 
#> $`6`
#>      [,1] [,2] [,3]
#> [1,]   59    1   87
#> [2,]   14   68   39
#> [3,]   82   43   34
#> 
#> $`7`
#>      [,1] [,2] [,3]
#> [1,]   14   34   39
#> [2,]   59   87   43
#> [3,]    1   68   82
#> 
#> $`8`
#>      [,1] [,2] [,3]
#> [1,]   39   59    1
#> [2,]   82   34   14
#> [3,]   43   87   68
#> 
#> $`9`
#>      [,1] [,2] [,3]
#> [1,]   68   82   43
#> [2,]   34   14   87
#> [3,]   39    1   59
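
To confirm that the result has the requested property, the sketch below follows each of the 9 positions across all shuffles. It uses base R only, and as an assumption it swaps in a simple cyclic Latin square for magic::rlatin(9) so that it runs without extra packages; the names sq, shuffled, cells and each_once are mine.

```r
set.seed(1)
m <- matrix(sample(1:100, 9), nrow = 3)

# Assumption: a cyclic Latin square stands in for magic::rlatin(9)
sq <- outer(0:8, 0:8, function(i, j) (i + j) %% 9 + 1)

# Row k of the square is the k-th shuffle (base R equivalent of the map() call)
shuffled <- lapply(seq_len(nrow(sq)), function(k) matrix(m[sq[k, ]], nrow = 3))

# 9 positions x 9 shuffles: row p lists what sat in position p in each shuffle
cells <- sapply(shuffled, as.vector)

# The values in m are distinct, so 9 unique values per row means
# every value visited that position exactly once
each_once <- all(apply(cells, 1, function(v) length(unique(v)) == length(m)))
each_once
#> [1] TRUE
```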

Upvotes: 2

MrFlick

Reputation: 206167

I think Iaroslav's answer is excellent. I used some different functions to do basically the same thing, so I thought I would share some other code. I also built a Latin-square-like arrangement, though I didn't realize that was its name. I did that with:

roll <- function(x, i) {
  if (i==0) return(x)
  c(x[-(1:i)], x[1:i])
}
m <- sapply(0:24, function(i) roll(1:25, i))

Here I just use the numbers 1:25. This creates a 25x25 matrix in which each row or column is a set of indices that can be used to permute your values. If it looks too orderly, you can also shuffle the rows and columns of the matrix with another helper function:

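shuffle_mat <- function(x, N=50, margin=c(1,2)) {
  # Pick, for each of the N swaps, whether it acts on rows (1) or columns (2).
  # Indexing into margin avoids sample() treating a single number n as 1:n.
  mg <- margin[sample.int(length(margin), N, replace=TRUE)]
  n_row_swap <- sum(mg==1)
  # Each column of sr holds a pair of distinct row indices to swap
  sr <- replicate(n_row_swap, sample.int(nrow(x), 2))
  for(i in seq_len(n_row_swap)) {
    x[sr[,i],]<-x[rev(sr[,i]),]
  }
  n_col_swap <- sum(mg==2)
  sc <- replicate(n_col_swap, sample.int(ncol(x), 2))
  for(i in seq_len(n_col_swap)) {
    x[,sc[,i]]<-x[,rev(sc[,i])]
  }
  x
}
rr <- shuffle_mat(m)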
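
As a shorter alternative to repeated random swaps (not what shuffle_mat does, just a different technique), one could permute all rows and all columns in a single indexing step. Reordering whole rows or columns keeps every row and column a permutation of 1:25, which the check at the end confirms. The names rr2 and is_perm are mine.

```r
roll <- function(x, i) {
  if (i == 0) return(x)
  c(x[-(1:i)], x[1:i])
}
m <- sapply(0:24, function(i) roll(1:25, i))

set.seed(42)
# Reorder all rows and all columns at once
rr2 <- m[sample(nrow(m)), sample(ncol(m))]

# Every row and every column should still contain 1..25 exactly once
is_perm <- function(v) identical(sort(v), 1:25)
still_latin <- all(apply(rr2, 1, is_perm), apply(rr2, 2, is_perm))
still_latin
#> [1] TRUE
```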

Then again you can take each of those rows/columns and shape them into a 5x5 matrix.
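
That last step might look like the sketch below. The data frame df is a made-up stand-in for the real 5x5 data, and idx rebuilds the cyclic index matrix with modular arithmetic so the block runs on its own; a shuffled matrix such as rr could be used in its place.

```r
# Hypothetical 5x5 data frame standing in for the asker's data
df <- as.data.frame(matrix(101:125, nrow = 5))
vals <- as.matrix(df)

# Cyclic index square: column i + 1 is 1:25 rotated by i positions
idx <- sapply(0:24, function(i) (seq_len(25) + i - 1) %% 25 + 1)

# One shuffled data frame per row of the index square
shuffled <- lapply(seq_len(nrow(idx)), function(k) {
  out <- as.data.frame(matrix(vals[idx[k, ]], nrow = 5))
  names(out) <- names(df)   # keep the original column names
  out
})

length(shuffled)   # 25 shuffles
shuffled[[2]]      # inspect one of them
```

Because every column of idx is also a permutation of 1:25, each of the 25 values lands in each of the 25 cells exactly once across the list.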

Upvotes: 1
