user3381331

Reputation: 79

How to randomize a subset of the column order

I have a data frame and I would like to randomly shuffle the order of certain columns. Here is an example representing what I have:

V1  V2  V3  V4  V5  V6  V7  V8  V9  V10 V11 V12 V13 V14
ch1 12  A   9   10  6   5   6   3   4   6   5   7   5
ch2 13  T   7   10  1   7   3   3   3   1   7   8   6
ch3 14  T   5   7   7   2   6   8   1   1   8   1   5
ch4 15  G   8   9   2   7   9   7   7   3   10  10  4
ch5 16  T   8   2   8   2   4   7   1   8   10  3   2

I would like to keep the first three columns the same (V1-V3), then randomly shuffle the order of the remaining columns (V4-V14). For example,

V1  V2  V3  V6  V5  V11 V4  V14 V10 V8  V7  V13 V12 V9
ch1 12  A   6   10  6   9   5   4   6   5   7   5   3
ch2 13  T   1   10  1   7   6   3   3   7   8   7   3
ch3 14  T   7   7   1   5   5   1   6   2   1   8   8
ch4 15  G   2   9   3   8   4   7   9   7   10  10  7
ch5 16  T   8   2   8   8   2   1   4   2   3   10  7

I've found a number of methods for permuting rows within a column, but haven't come across any method for shuffling the order of the columns themselves. Any help would be appreciated.

Upvotes: 3

Views: 97

Answers (1)

Rich Scriven

Reputation: 99391

You can leave the first three columns in place while shuffling all the others with:

df[c(1:3, sample(4:ncol(df)))]
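
As a minimal sketch of that one-liner applied to the data in the question: the block below rebuilds the example table with `read.table()` and shuffles columns V4 to V14. The seed value and the name `shuffled` are assumptions added here for illustration; the resulting column order depends on the seed and on your R version, so no particular order is claimed.

```r
# Rebuild the example data from the question (values copied from the post).
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
V1  V2  V3  V4  V5  V6  V7  V8  V9  V10 V11 V12 V13 V14
ch1 12  A   9   10  6   5   6   3   4   6   5   7   5
ch2 13  T   7   10  1   7   3   3   3   1   7   8   6
ch3 14  T   5   7   7   2   6   8   1   1   8   1   5
ch4 15  G   8   9   2   7   9   7   7   3   10  10  4
ch5 16  T   8   2   8   2   4   7   1   8   10  3   2
")

set.seed(42)  # assumption: any seed works; it only makes the run repeatable

# Keep columns 1-3 where they are, then append columns 4..14 in random order.
shuffled <- df[c(1:3, sample(4:ncol(df)))]

names(shuffled)  # V1, V2, V3 first, followed by a permutation of V4..V14
```

Only the column order changes: each column keeps its own values, so `shuffled$V6` is still identical to `df$V6`.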

Let's have a look at how this works with mtcars, by returning just the column names from seven separate runs.

replicate(7, names(mtcars[c(1:3, sample(4:ncol(mtcars)))]))
#       [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]  
#  [1,] "mpg"  "mpg"  "mpg"  "mpg"  "mpg"  "mpg"  "mpg" 
#  [2,] "cyl"  "cyl"  "cyl"  "cyl"  "cyl"  "cyl"  "cyl" 
#  [3,] "disp" "disp" "disp" "disp" "disp" "disp" "disp"
#  [4,] "carb" "am"   "hp"   "qsec" "am"   "am"   "am"  
#  [5,] "wt"   "hp"   "drat" "carb" "carb" "wt"   "carb"
#  [6,] "hp"   "gear" "vs"   "drat" "qsec" "gear" "qsec"
#  [7,] "drat" "carb" "am"   "vs"   "wt"   "vs"   "hp"  
#  [8,] "qsec" "vs"   "carb" "wt"   "gear" "hp"   "gear"
#  [9,] "vs"   "drat" "gear" "am"   "drat" "drat" "drat"
# [10,] "am"   "wt"   "qsec" "gear" "vs"   "qsec" "wt"  
# [11,] "gear" "qsec" "wt"   "hp"   "hp"   "carb" "vs"  

We can see that the first three column names (shown as rows here) remain the same for each run, while the others vary.
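
One edge case may be worth guarding against: `sample(x)` treats a single number `x >= 1` as `1:x`, so on a data frame with exactly four columns `sample(4:ncol(df))` becomes `sample(4)` and would return a permutation of 1 to 4 rather than just column 4. The sketch below wraps the same idea in a helper that avoids this by permuting positions with `sample.int()`. The function name `shuffle_cols` and its `keep` argument are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper (not part of base R): leave the first `keep` columns
# in place and randomly reorder all remaining columns.
shuffle_cols <- function(df, keep = 3) {
  n <- ncol(df)
  if (keep >= n) return(df)  # nothing left to shuffle
  rest <- seq(keep + 1, n)   # indices of the columns to shuffle
  # sample.int() permutes positions 1..length(rest), so a single remaining
  # column is returned as-is instead of being expanded to 1:x.
  df[c(seq_len(keep), rest[sample.int(length(rest))])]
}

# Usage: the first three names stay fixed, the rest come back in random order.
names(shuffle_cols(mtcars))

# With only one column after the kept ones, the order is unchanged.
names(shuffle_cols(mtcars[1:4]))
```

Indexing with a single vector, as in `df[...]`, selects columns and always returns a data frame, which is why no comma is needed.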

Upvotes: 5
