Geek On Acid

Reputation: 6410

Constrained randomization of column order in a data.frame

I am trying to duplicate each column of a data frame and move it to a randomly chosen position within 1-3 columns of where it started, and to do this for every column in the data frame. I want each column to move AT LEAST one space to the left or right. Of course sample(dat) reorders columns randomly, but my attempts to put it in a loop are embarrassingly bad (I admit I skipped the majority of my linear algebra classes, damn...). Below is some example data:

dat <- read.table(textConnection(
"-515.5718  94.33423 939.6324 -502.9918 -75.14629 946.6926
-515.2283  96.10239 939.5687 -503.1425 -73.39015 946.6360
-515.0044  97.68119 939.4177 -503.4021 -71.79252 946.6909
-514.7430  99.59141 939.3976 -503.6645 -70.08514 946.6887
-514.4449 101.08511 939.2342 -503.9207 -68.48133 946.7183
-514.2769 102.29453 939.0013 -504.2665 -67.04509 946.7809
-513.9294 104.02753 938.9436 -504.4703 -65.34361 946.7899
-513.5900 105.49624 938.7684 -504.7405 -63.75965 946.7991"
), header = FALSE, as.is = TRUE)
sample(dat)  # random column order

Upvotes: 3

Views: 368

Answers (2)

Josh O'Brien

Reputation: 162311

I should probably wait to post this until I have time to comment it up, and discuss some of the ambiguities in the problem as currently specified in the comments above. But since I won't be able to do that, possibly for a while, I thought I'd give you code for a solution that you can examine yourself.

# Create a function that generates acceptable permutations of the data
getPermutation <- function(blockSize,     # number of columns/block
                           nBlock,        # number of blocks of data
                           fromBlocks) {  # indices of blocks to be moved
    X <- unique(as.vector(outer(fromBlocks, c(-2,-1,1,2), "+")))
    # To remove nonsensical indices like 0 or -1
    X <- X[X %in% seq.int(nBlock)]  

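    # Resample until every selected block moves 1 or 2 positions and the
    # swap is a valid permutation (no block duplicated or dropped, which
    # can happen when toBlocks overlaps fromBlocks)
    repeat {
        toBlocks <- X[sample.int(length(X), size = length(fromBlocks))]
        d <- abs(toBlocks - fromBlocks)
        A <- seq.int(nBlock)
        A[toBlocks] <- fromBlocks
        A[fromBlocks] <- toBlocks
        if (max(d) <= 2 && min(d) >= 1 && !anyDuplicated(A)) break
    }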

    blockColIndices <- 
        lapply(seq.int(nBlock) - 1,
               function(X) {
                   seq(from = X * blockSize + 1, 
                       by = 1, 
                       length.out = blockSize)
               })    
    unlist(blockColIndices[A])
}

# Create an example dataset, a 90 column data.frame
dat <- as.data.frame(matrix(seq.int(90*4), ncol=90))

# Call the function for a data frame with 30 3-column blocks
# within which you want to move blocks 2, 14, and 15.
index <- getPermutation(3, 30, c(2, 14, 15))
newdat <- dat[index]
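
To sanity-check an index vector like the one above, it can help to recover the block order and see how far each block moved. The following is only a sketch: blockDisplacement is a hypothetical helper name, not part of the answer's code, and it assumes the columns inside each block stay contiguous and in order.

```r
# Hypothetical helper (not from the original answer): given a column index
# vector and the block size, report how many positions each block moved.
# Assumes each block's columns are kept together and in their original order.
blockDisplacement <- function(index, blockSize) {
    # The first column of each block is enough to identify the block
    firstCols <- index[seq(1, length(index), by = blockSize)]
    blockOrder <- (firstCols - 1) %/% blockSize + 1
    # New position of each original block; NA means the block was dropped
    newPos <- match(seq_along(blockOrder), blockOrder)
    newPos - seq_along(blockOrder)
}

# Hand-built example: 4 blocks of 2 columns, with blocks 2 and 3 swapped
index <- c(1, 2, 5, 6, 3, 4, 7, 8)
blockDisplacement(index, blockSize = 2)
```

A zero means the block stayed put, and an NA in the result would flag an index that is not a true permutation of the blocks.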

Upvotes: 1

Josh O'Brien

Reputation: 162311

How about this brute-force but plenty-fast solution?

It tries out different permutations of the columns until it finds one in which each column is moved at least 1 and at most 3 columns to the left or right. When it finds such a permutation, the test in the final line of the while() call evaluates to FALSE, terminating the loop and leaving the variable x containing the acceptable permutation.

n <- ncol(dat)
while({x <- sample(n)   # Proposed new column positions
       y <- seq_len(n)  # Original column positions
       max(abs(x - y)) > 3 || min(abs(x - y)) == 0
       }) NULL
dat[x]
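
If you want to reuse the same rejection-sampling idea, one option is to wrap it in a function. This is just a sketch: constrainedShuffle, maxShift and maxTries are names I made up for illustration, and the cap on attempts is an added safeguard (an assumption, not part of the loop above) so the call fails loudly instead of spinning forever when no acceptable permutation exists, e.g. for a single column.

```r
# Hypothetical wrapper around the same brute-force idea.
# n        : number of columns
# maxShift : largest allowed move (3 in the question)
# maxTries : assumed safety cap on the number of proposals
constrainedShuffle <- function(n, maxShift = 3, maxTries = 1e5) {
    y <- seq_len(n)                      # original column positions
    for (i in seq_len(maxTries)) {
        x <- sample(n)                   # proposed new column order
        d <- abs(x - y)                  # how far each column moved
        if (max(d) <= maxShift && min(d) >= 1) return(x)
    }
    stop("no acceptable permutation found in ", maxTries, " tries")
}

set.seed(1)                              # only to make the example repeatable
x <- constrainedShuffle(6)
# Every column moved between 1 and 3 positions
stopifnot(all(abs(x - seq_len(6)) >= 1), all(abs(x - seq_len(6)) <= 3))
```

With the example data this would be used as dat[constrainedShuffle(ncol(dat))].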

Upvotes: 3
