Reputation: 703
I have an NxM data.frame MATRIX_1 in R containing a series of values. In addition, I have another NxM data.frame MATRIX_2 that maps 1:1 onto the first, but instead of numerical values it holds booleans indicating whether each data point falls more than 2 standard deviations from the mean of its column.
I want to remove every row from MATRIX_1 for which any corresponding [row, col] entry in MATRIX_2 is TRUE.
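For context, a flag table like MATRIX_2 could be built along these lines. This is only a sketch under assumptions: the toy values and the helper name is_outlier are hypothetical and not taken from the real diabetes data, and "outside 2 standard deviations" is read as an absolute deviation from the column mean greater than 2 * sd.

```r
# Hypothetical toy data standing in for MATRIX_1 (not the real data set)
MATRIX_1 <- data.frame(
  AGE = c(50, 51, 49, 50, 52, 48, 50, 51, 49, 120),
  BMI = c(25, 26, 60, 25, 27, 23, 25, 26, 24, 25)
)

# Hypothetical helper: TRUE where a value lies more than 2 standard
# deviations from the mean of its own column
is_outlier <- function(x) abs(x - mean(x)) > 2 * sd(x)

# Apply the helper column by column to get a logical data.frame
MATRIX_2 <- as.data.frame(lapply(MATRIX_1, is_outlier))

# Keep only the rows that carry no TRUE flag in any column
clean <- MATRIX_1[rowSums(as.matrix(MATRIX_2)) == 0, , drop = FALSE]
```

The as.matrix() call makes the conversion of the logical data.frame explicit before counting TRUE values per row.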
MATRIX_2
       AGE  SEX   BMI    BP    S1    S2    S3    S4    S5    S6     Y PROGRESSION
[1,] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE       FALSE
Above, the SEX column has a TRUE value. Therefore, this entire row should be removed from MATRIX_1, where MATRIX_1 looks something like the following:
MATRIX_1
  AGE SEX  BMI     BP  S1   S2   S3   S4     S5 S6   Y PROGRESSION
1  59   2 32.1 101.00 157 93.2 38.0 4.00 4.8598 87 151           1
I've seen approaches that use the %in% operator, such as df1[!(df1$name %in% df2$name),], but that targets a single column of the frame, whereas I want this to apply automatically to all columns.
I've come close using subset, but this only looks at the first column:
subset(diabetes2, boolean_diabetes2[,1] == TRUE)
Upvotes: 1
Views: 46
Reputation: 54237
To select all rows from MATRIX_1 where the corresponding row in MATRIX_2 contains only FALSE values, you could do:
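# sample data: a 4x3 logical matrix of flags and a 4x3 data.frame of values
set.seed(1)
MATRIX_2 <- matrix(sample(c(TRUE, FALSE), 3*4, replace = TRUE, prob = c(.3, .7)), ncol = 3)
MATRIX_1 <- as.data.frame(matrix(runif(3*4), ncol = 3))
# subsetting: rowSums() counts the TRUE values per row,
# so !rowSums() is TRUE only for rows that are all FALSE
MATRIX_1[!rowSums(MATRIX_2), ]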
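Since the question describes MATRIX_2 as a data.frame rather than a matrix, a variant that tests each row with any() may read more directly. The following is a sketch only: the names values, flags, has_flag and kept are hypothetical, as are the toy numbers, which are laid out 4x3 like the sample data above.

```r
# Hypothetical 4x3 inputs: numeric values plus a matching table of logical flags
values <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8), c = c(9, 10, 11, 12))
flags  <- data.frame(a = c(FALSE, TRUE, FALSE, FALSE),
                     b = c(FALSE, FALSE, FALSE, FALSE),
                     c = c(FALSE, FALSE, FALSE, TRUE))

# TRUE for every row that has at least one TRUE flag
has_flag <- apply(as.matrix(flags), 1, any)

# Drop the flagged rows; drop = FALSE keeps the result a data.frame
# even if only one column is present
kept <- values[!has_flag, , drop = FALSE]
```

With these toy flags, rows 2 and 4 are dropped and rows 1 and 3 remain. The rowSums() form is typically faster on large inputs, while apply() with any() states the intent more literally.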
Upvotes: 2