Reputation: 79
I have a data frame and I would like to randomly shuffle the order of certain columns. For instance, here is an example to represent what I have:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
ch1 12 A 9 10 6 5 6 3 4 6 5 7 5
ch2 13 T 7 10 1 7 3 3 3 1 7 8 6
ch3 14 T 5 7 7 2 6 8 1 1 8 1 5
ch4 15 G 8 9 2 7 9 7 7 3 10 10 4
ch5 16 T 8 2 8 2 4 7 1 8 10 3 2
I would like to keep the first three columns the same (V1-V3), then randomly shuffle the order of the remaining columns (V4-V14). For example,
V1 V2 V3 V6 V5 V11 V4 V14 V10 V8 V7 V13 V12 V9
ch1 12 A 6 10 6 9 5 4 6 5 7 5 3
ch2 13 T 1 10 1 7 6 3 3 7 8 7 3
ch3 14 T 7 7 1 5 5 1 6 2 1 8 8
ch4 15 G 2 9 3 8 4 7 9 7 10 10 7
ch5 16 T 8 2 8 8 2 1 4 2 3 10 7
I've found a number of methods for permuting rows within a column, but haven't come across any method for shuffling the order of columns. Any help would be appreciated.
Upvotes: 3
Views: 97
Reputation: 99391
You can leave the first three columns alone while shuffling all the others with
df[c(1:3, sample(4:ncol(df)))]
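As a quick check against the data from the question, here is a minimal sketch that rebuilds the example data frame with read.table() (the name df and the use of read.table() are assumptions for illustration) and applies the same indexing. Each column moves as a whole, so its values stay attached to its name.

```r
# Rebuild the example data from the question (the object name `df` is assumed).
df <- read.table(header = TRUE, text = "
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
ch1 12 A 9 10 6 5 6 3 4 6 5 7 5
ch2 13 T 7 10 1 7 3 3 3 1 7 8 6
ch3 14 T 5 7 7 2 6 8 1 1 8 1 5
ch4 15 G 8 9 2 7 9 7 7 3 10 10 4
ch5 16 T 8 2 8 2 4 7 1 8 10 3 2
")

# Keep columns 1-3 in place and put columns 4..ncol(df) in random order.
shuffled <- df[c(1:3, sample(4:ncol(df)))]

# Inspect the new column order; it will differ from run to run.
names(shuffled)
```

Only the column order changes: the result still has all 14 columns and all 5 rows.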
Let's have a look at how this works with mtcars, by returning just the column names.
replicate(7, names(mtcars[c(1:3, sample(4:ncol(mtcars)))]))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] "mpg" "mpg" "mpg" "mpg" "mpg" "mpg" "mpg"
# [2,] "cyl" "cyl" "cyl" "cyl" "cyl" "cyl" "cyl"
# [3,] "disp" "disp" "disp" "disp" "disp" "disp" "disp"
# [4,] "carb" "am" "hp" "qsec" "am" "am" "am"
# [5,] "wt" "hp" "drat" "carb" "carb" "wt" "carb"
# [6,] "hp" "gear" "vs" "drat" "qsec" "gear" "qsec"
# [7,] "drat" "carb" "am" "vs" "wt" "vs" "hp"
# [8,] "qsec" "vs" "carb" "wt" "gear" "hp" "gear"
# [9,] "vs" "drat" "gear" "am" "drat" "drat" "drat"
# [10,] "am" "wt" "qsec" "gear" "vs" "qsec" "wt"
# [11,] "gear" "qsec" "wt" "hp" "hp" "carb" "vs"
We can see that the first three column names (shown as rows here) remain the same for each run, while the others vary.
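One edge case may be worth guarding against: when only a single column is left to shuffle, sample(4:ncol(df)) becomes sample(4), which R treats as a permutation of 1:4 rather than of the single value 4. The sketch below wraps the same idea in a hypothetical helper, shuffle_cols() (the name and the keep argument are not part of the answer above), that permutes positions with sample.int() instead. The set.seed() call, with an arbitrary seed, is only there to make the result repeatable.

```r
# Hypothetical helper (not from the answer above): keep the first `keep`
# columns in place and shuffle the order of the remaining ones.
shuffle_cols <- function(df, keep = 3) {
  fixed <- seq_len(keep)
  rest  <- setdiff(seq_len(ncol(df)), fixed)
  # Permute positions within `rest` via sample.int(), so a single leftover
  # column is not misread by sample() as "sample from 1:n".
  df[c(fixed, rest[sample.int(length(rest))])]
}

set.seed(42)  # arbitrary seed, used only for reproducibility
shuffled <- shuffle_cols(mtcars, keep = 3)
names(shuffled)

# With exactly one column left to shuffle, the column order is unchanged.
names(shuffle_cols(mtcars[1:4], keep = 3))
```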
Upvotes: 5