JC3019

Reputation: 383

Create subset matrix according to criteria / Extract key rows according to criteria

I want to split the rows of my original matrix into two separate matrices. I set up the problem as follows:

set.seed(2)
Mat1 <- data.frame(matrix(nrow = 4, ncol =10, data = rnorm(40,0,1)))
keep.rows = matrix(nrow =2, ncol =4)
keep.rows[,1] = c(1,2)
keep.rows[,2] = c(2,3)
keep.rows[,3] = c(2,3)
keep.rows[,4] = c(1,2)

Mat1
          X1         X2         X3        X4           X5         X6         X7         X8        X9        X10
1  0.9959846 -2.2079198 -0.3869496 -1.183606  1.959357077  1.0744594 -0.8621983 -0.4213736 0.4718595  1.2309537
2 -1.6957649  1.8221225  0.3866950 -1.358457  0.007645872  0.2605978  2.0480403 -0.3508344 1.3589398  1.1471368
3 -0.5333721 -0.6533934  1.6003909 -1.512671 -0.842615198 -0.3142720  0.9399201 -1.0273806 0.5641686  0.1065980
4 -1.3722695 -0.2846812  1.6811550 -1.253105 -0.601160105 -0.7496301  2.0086871 -0.2505191 0.4559801 -0.7833167

Mat1 is my original matrix. From the keep.rows matrix, I want to create two output matrices. The first (Output1) should store the rows of Mat1 whose numbers are listed in keep.rows. The second (Output2) should store all remaining rows. In my actual application the matrices are very large, so they cannot be split by hand as I do here.

I need: 1) a function that does this simply over large matrices; 2) ideally, one where I can change how many columns of Mat1 each column of keep.rows applies to. Here each column of keep.rows covers 3 columns of Mat1 (the last one covers only the single column that is left). If keep.rows were 2x2 instead, I might want each of its columns to cover five columns of Mat1.

Results should be of the form:

Output1 <- data.frame(matrix(nrow = 2, ncol =10))
Output1[1:2,1:3] <- Mat1[c(1,2), 1:3]
Output1[1:2,4:6] <- Mat1[c(2,3), 4:6]
Output1[1:2,7:9] <- Mat1[c(2,3), 7:9]
Output1[1:2,10]  <- Mat1[c(1,2), 10]

Output2 <- data.frame(matrix(nrow = 2, ncol =10))
Output2[1:2,1:3] <- Mat1[c(3,4), 1:3]
Output2[1:2,4:6] <- Mat1[c(1,4), 4:6]
Output2[1:2,7:9] <- Mat1[c(1,4), 7:9]
Output2[1:2,10]  <- Mat1[c(3,4), 10]



IMPORTANT: In the answer, I need Output2 to be specified in a way that keeps all remaining rows. In my application, keep.rows is the same size as here, but Mat1 contains 1000+ rows.

Upvotes: 1

Views: 47

Answers (1)

GKi

Reputation: 39667

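You can use sapply to iterate over the columns of Mat1 with seq_along(Mat1) and subset each column using the matching column of keep.rows. The expression (i+2) %/% 3 maps columns 1-3 of Mat1 to column 1 of keep.rows, columns 4-6 to column 2, and so on. With simplify = FALSE, sapply returns a list of one-column data.frames, which do.call(cbind, ...) combines into a single data.frame. To get the remaining rows, simply place a - before keep.rows.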

Output1 <- do.call(cbind, sapply(seq_along(Mat1), function(i) Mat1[keep.rows[,(i+2) %/% 3], i, drop = FALSE], simplify = FALSE))
Output2 <- do.call(cbind, sapply(seq_along(Mat1), function(i) Mat1[-keep.rows[,(i+2) %/% 3], i, drop = FALSE], simplify = FALSE))
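
If the number of Mat1 columns covered by each column of keep.rows needs to vary, the two lines above could be wrapped in a function. The following is only a sketch: split_rows and its width argument are names made up here, not part of base R, and it assumes that each column of keep.rows applies to width consecutive columns of the data. Row names are reset on every piece, since the rows picked differ from column to column and the original row names would no longer be meaningful.

```r
# Hypothetical helper (illustration only): split_rows() and `width`
# are names chosen for this sketch.
split_rows <- function(dat, keep, width = 3) {
  # Map each column of `dat` to the column of `keep` that governs it:
  # columns 1..width -> 1, the next `width` columns -> 2, and so on.
  grp <- (seq_along(dat) - 1) %/% width + 1
  stopifnot(max(grp) <= ncol(keep))

  # Extract one column of `dat` for the given rows as a one-column
  # data.frame, dropping the now-ambiguous row names.
  pick <- function(i, rows) {
    out <- dat[rows, i, drop = FALSE]
    rownames(out) <- NULL
    out
  }

  kept <- lapply(seq_along(dat), function(i) pick(i, keep[, grp[i]]))
  # Negative indices drop the listed rows, leaving all remaining rows.
  rest <- lapply(seq_along(dat), function(i) pick(i, -keep[, grp[i]]))

  list(kept = do.call(cbind, kept), rest = do.call(cbind, rest))
}

# Usage with the data from the question
set.seed(2)
Mat1 <- data.frame(matrix(nrow = 4, ncol = 10, data = rnorm(40, 0, 1)))
keep.rows <- cbind(c(1, 2), c(2, 3), c(2, 3), c(1, 2))

res <- split_rows(Mat1, keep.rows, width = 3)
Output1 <- res$kept
Output2 <- res$rest
```

With a 2x2 keep.rows and the same 10-column Mat1, calling split_rows(Mat1, keep.rows, width = 5) would apply each column of keep.rows to five columns. Because the remaining rows are selected with negative indices, Output2 grows with the number of rows in Mat1 without any further change.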

Upvotes: 1
