Reputation: 55
I have a list of 5x5 matrices. From each matrix I want to randomly select 2 rows and, within each selected row, randomly select 3 elements that do not necessarily come from the same columns.
So I generated three data sets and put them in a list. I was able to select 2 rows at random, but I am having difficulty selecting 3 elements per row at random without selecting whole columns.
Here is my code.
### Generate three data sets
dat1 <- (matrix(rnorm(25), ncol=5))
dat2 <- (matrix(rnorm(25), ncol=5))
dat3 <- (matrix(rnorm(25), ncol=5))
all.dat <- list(dat1=dat1, dat2=dat2, dat3=dat3)
all.dat
#$`dat1`
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1.4394742 0.7064418 -1.3472468 0.52847179 -0.7642337
#[2,] 0.2490570 0.7510308 -0.7028238 -0.09730666 -0.6340773
#[3,] 0.8981850 0.7592610 0.9139721 -0.45700647 -0.2727481
#[4,] -1.0467119 0.2147032 -3.2104254 -0.17797056 0.8897180
#[5,] -0.5437118 0.5803862 -0.1814992 1.93316139 -1.3708932
#$dat2
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1.0442187 -1.4156893 0.5606035101 -1.350030718 0.1538721
#[2,] 0.2080905 -1.7748005 0.8620324724 -0.169071336 -1.7537700
#[3,] 0.9153835 -0.9884572 -1.7279901136 -1.334516414 0.5773021
#[4,] 0.1359423 -1.5107088 -1.4289650078 -0.002001498 -0.4712699
#[5,] 0.1695023 -0.7315209 -0.0003996577 -1.043326258 1.2939485
#$dat3
# [,1] [,2] [,3] [,4] [,5]
#[1,] -1.4994878 -0.59179084 0.998017255 1.4021344 0.5929842
#[2,] 0.3424003 1.33568858 2.214968765 -0.2434351 1.3588000
#[3,] -1.0117892 0.91065720 -0.761932994 -0.8117838 -0.4094731
#[4,] -0.1694781 -0.02937177 -0.826337270 0.2178774 -0.6427046
#[5,] 0.3413101 -0.56911900 0.001363063 0.5579126 -0.9373204
### Select rows and columns.
all.dat.sel.1 <-
  lapply(all.dat, function(x) {
    x[sample(nrow(x), size = 2), sample(ncol(x), size = 3)]
  })
all.dat.sel.1
#$`dat1`
# [,1] [,2] [,3]
#[1,] -0.4570065 0.8981850 -0.2727481
#[2,] 1.9331614 -0.5437118 -1.3708932
#$dat2
# [,1] [,2] [,3]
#[1,] -0.0003996577 -1.043326258 1.2939485
#[2,] -1.4289650078 -0.002001498 -0.4712699
#$dat3
# [,1] [,2] [,3]
#[1,] -1.4994878 1.4021344 0.9980173
#[2,] -0.1694781 0.2178774 -0.8263373
With this, the rows were selected at random, but the elements in every row came from the same columns. For example, in dat3 the values -1.4994878 (row 1) and -0.1694781 (row 2) both came from column 1.
What I would like to have is something like this:
#$dat3
# [,1] [,2] [,3]
#[1,] -1.4994878 0.998017255 0.5929842
#[4,] 0.2178774 -0.02937177 -0.826337270
There is an example of this (https://stackoverflow.com/questions/53095050/sample-random-column-for-each-row-in-data-frame). However, it applies to a data frame, not to a list of matrices.
Upvotes: 3
Views: 433
Reputation: 76402
Take advantage of the fact that a matrix is a folded vector, that is, a vector with a dim attribute, and sample 2*3 vector elements directly.
lapply(all.dat, function(x){
  matrix(sample(x, 2*3), nrow = 2)
})
#$dat1
# [,1] [,2] [,3]
#[1,] 0.5060559 -0.5644520 -0.83717168
#[2,] -0.6937202 -0.4771927 0.06445882
#
#$dat2
# [,1] [,2] [,3]
#[1,] -0.709440 -1.340993 0.5747557
#[2,] -1.068643 1.449496 1.1022975
#
#$dat3
# [,1] [,2] [,3]
#[1,] 0.6482866 0.5630558 -0.007604756
#[2,] 0.6565885 1.3295648 -0.669633580
Note: I have started the script with the call set.seed(1234).
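To make the "folded vector" point concrete, here is a minimal sketch. The matrix `m` is made up for illustration (it is not the question's data); its entries encode their own position so that the origin of each sampled value can be read off. It suggests that sampling the matrix directly draws from all 25 cells at once, so the six values are not tied to two particular rows.

```r
# Hypothetical 5x5 matrix: the value in row i, column j is 10 * i + j.
m <- outer(1:5, 1:5, function(i, j) 10 * i + j)

# A matrix is a plain vector that carries a dim attribute.
attributes(m)
identical(as.vector(m), m[1:25])   # linear indexing walks down the columns

# sample() ignores dim and draws from all 25 cells.
set.seed(1)
picked <- matrix(sample(m, 2 * 3), nrow = 2)
picked

# The tens digit of each value is the row of m it came from.
source_rows <- picked %/% 10
source_rows
```

If `source_rows` shows more than two distinct row numbers, the sampled elements came from more than two rows of the original matrix.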
Edit.
After reading the comment by user @Ronak Shah, and the question again, the code below might be what the OP is looking for. It is similar to, but not the same as, Ronak's solution. Once again, the RNG seed was set to 1234 before the data creation code.
lapply(all.dat, function(x){
  t(apply(x[sample(nrow(x), 2), ], 1, sample, size = 3))
})
#$dat1
# [,1] [,2] [,3]
#[1,] -0.4771927 -1.207066 0.5060559
#[2,] -0.4405479 1.084441 -0.9111954
#
#$dat2
# [,1] [,2] [,3]
#[1,] 1.1022975 -0.9685143 1.449496
#[2,] -0.2942939 -0.5012581 -0.280623
#
#$dat3
# [,1] [,2] [,3]
#[1,] -0.3665239 -0.773353424 1.367827
#[2,] 0.3364728 -0.007604756 2.070271
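The t() call in the code above is easy to overlook, so here is a small sketch of why it is there. It again uses a made-up matrix `m` (an assumption for illustration, not the question's data) whose entries encode their row and column.

```r
# Hypothetical 5x5 matrix: the value in row i, column j is 10 * i + j.
m <- outer(1:5, 1:5, function(i, j) 10 * i + j)

set.seed(1)
two_rows <- m[sample(nrow(m), 2), ]   # 2 x 5: two randomly chosen rows

# apply() over rows stores each result as a COLUMN, giving a 3 x 2 matrix.
raw <- apply(two_rows, 1, sample, size = 3)
dim(raw)

# t() restores the layout with one sampled row per output row (2 x 3).
res <- t(raw)
res

# Within an output row, every value shares the same tens digit (same source
# row), while the units digits (source columns) differ between elements.
res %/% 10
res %% 10
```

Because each row is sampled separately, the columns picked for the first row are independent of those picked for the second.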
Upvotes: 2
Reputation: 388862
I think what you are trying to do is
row_const <- 2
col_const <- 3
lapply(all.dat, function(x) {
  rand_rows <- sample(nrow(x), size = row_const)
  t(sapply(rand_rows, function(y) sample(x[y, ], col_const)))
})
#$dat1
# [,1] [,2] [,3]
#[1,] 0.07050839 -0.6868529 0.7013559
#[2,] 0.40077145 -1.0260044 -1.9666172
#$dat2
# [,1] [,2] [,3]
#[1,] -0.3059627 -1.138137 2.1689560
#[2,] -0.2950715 0.837787 0.5539177
#$dat3
# [,1] [,2] [,3]
#[1,] 0.3796395 -0.4910312 0.2533185
#[2,] 0.9222675 0.1238542 -1.0185754
It first selects two random rows from each matrix and then selects 3 random elements from each of those rows.
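The question's desired output labels each result row with the row it came from (for example [1,] and [4,]). As a possible variation on the two steps just described, the sketch below keeps the sampled row indices as row names. The helper name `sample_cells` and its arguments are hypothetical, introduced only for this illustration; the data setup repeats the one given under "data".

```r
# Same data setup as in the "data" section of this answer.
set.seed(123)
dat1 <- matrix(rnorm(25), ncol = 5)
dat2 <- matrix(rnorm(25), ncol = 5)
dat3 <- matrix(rnorm(25), ncol = 5)
all.dat <- list(dat1 = dat1, dat2 = dat2, dat3 = dat3)

# Hypothetical helper: pick n_rows rows, then n_cols cells within each row,
# and remember which source row each output row came from.
sample_cells <- function(x, n_rows = 2, n_cols = 3) {
  rows <- sample(nrow(x), n_rows)
  # Sample column indices separately for every chosen row.
  vals <- lapply(rows, function(r) x[r, sample.int(ncol(x), n_cols)])
  out <- matrix(unlist(vals), nrow = n_rows, byrow = TRUE)
  rownames(out) <- rows   # source row index as the row label
  out
}

res <- lapply(all.dat, sample_cells)
res
```

Sampling column indices with sample.int(), rather than sampling the row's values directly, is a design choice here: it would also make it straightforward to record the chosen columns if they are needed later.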
data
set.seed(123)
dat1 <- (matrix(rnorm(25), ncol=5))
dat2 <- (matrix(rnorm(25), ncol=5))
dat3 <- (matrix(rnorm(25), ncol=5))
all.dat <- list(dat1=dat1, dat2=dat2, dat3=dat3)
Upvotes: 1