DaniCee

Reputation: 3217

R: create boolean matrix based on data matrix and thresholds data frame

Say I have a data matrix like this one:

> data(mtcars)
> my_mat <- as.matrix(mtcars[,1:7])
> head(my_mat)
                   mpg cyl disp  hp drat    wt  qsec
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02
Datsun 710        22.8   4  108  93 3.85 2.320 18.61
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02
Valiant           18.1   6  225 105 2.76 3.460 20.22

... and a thresholds data frame like the following one, with threshold values for some of the columns in the matrix above:

> threshold_df <- data.frame(marker=colnames(my_mat)[c(3,4,6)], threshold=apply(my_mat[,c(3,4,6)], 2, quantile, 0.75))
> threshold_df
     marker threshold
disp   disp    326.00
hp       hp    180.00
wt       wt      3.61

With this information, I want to end up with a matrix of 0s and 1s with the same dimensions and names as my_mat, where every value is 0 except those that are higher than the threshold for their column. Columns without a threshold should stay all 0.

So far I have the all-0 matrix, but I am not sure how to populate the 1s based on the above information. Any clue? Thanks!

> zero_mat <- matrix(0, nrow = nrow(my_mat), ncol = ncol(my_mat))
> colnames(zero_mat) <- colnames(my_mat)
> rownames(zero_mat) <- rownames(my_mat)
> head(zero_mat)
                  mpg cyl disp hp drat wt qsec
Mazda RX4           0   0    0  0    0  0    0
Mazda RX4 Wag       0   0    0  0    0  0    0
Datsun 710          0   0    0  0    0  0    0
Hornet 4 Drive      0   0    0  0    0  0    0
Hornet Sportabout   0   0    0  0    0  0    0
Valiant             0   0    0  0    0  0    0

Upvotes: 2

Views: 248

Answers (2)

ThomasIsCoding

Reputation: 102920

Another base R option

u <- with(threshold_df, mapply(`>`, as.data.frame(my_mat[, marker]), threshold))
my_mat <- 0 * my_mat
my_mat[, colnames(u)] <- u

gives

> my_mat
                    mpg cyl disp hp drat wt qsec
Mazda RX4             0   0    0  0    0  0    0
Mazda RX4 Wag         0   0    0  0    0  0    0
Datsun 710            0   0    0  0    0  0    0
Hornet 4 Drive        0   0    0  0    0  0    0
Hornet Sportabout     0   0    1  0    0  0    0
Valiant               0   0    0  0    0  0    0
Duster 360            0   0    1  1    0  0    0
Merc 240D             0   0    0  0    0  0    0
Merc 230              0   0    0  0    0  0    0
Merc 280              0   0    0  0    0  0    0
Merc 280C             0   0    0  0    0  0    0
Merc 450SE            0   0    0  0    0  1    0
Merc 450SL            0   0    0  0    0  1    0
Merc 450SLC           0   0    0  0    0  1    0
Cadillac Fleetwood    0   0    1  1    0  1    0
Lincoln Continental   0   0    1  1    0  1    0
Chrysler Imperial     0   0    1  1    0  1    0
Fiat 128              0   0    0  0    0  0    0
Honda Civic           0   0    0  0    0  0    0
Toyota Corolla        0   0    0  0    0  0    0
Toyota Corona         0   0    0  0    0  0    0
Dodge Challenger      0   0    0  0    0  0    0
AMC Javelin           0   0    0  0    0  0    0
Camaro Z28            0   0    1  1    0  1    0
Pontiac Firebird      0   0    1  0    0  1    0
Fiat X1-9             0   0    0  0    0  0    0
Porsche 914-2         0   0    0  0    0  0    0
Lotus Europa          0   0    0  0    0  0    0
Ford Pantera L        0   0    1  1    0  0    0
Ferrari Dino          0   0    0  0    0  0    0
Maserati Bora         0   0    0  1    0  0    0
Volvo 142E            0   0    0  0    0  0    0
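For anyone who wants to run this from a clean session, here is a minimal self-contained sketch of the same mapply idea. The names cols and bool_mat are only illustrative and not part of the answer above, and the as.character() call is an added assumption: in R versions before 4.0, data.frame() turns marker into a factor by default, and indexing a matrix with a factor uses its integer codes rather than its labels.

```r
# Rebuild the inputs from the question so the sketch runs on its own.
data(mtcars)
my_mat <- as.matrix(mtcars[, 1:7])
threshold_df <- data.frame(
  marker    = colnames(my_mat)[c(3, 4, 6)],
  threshold = apply(my_mat[, c(3, 4, 6)], 2, quantile, 0.75)
)

# Guard against `marker` being a factor (illustrative safety step).
cols <- as.character(threshold_df$marker)

# mapply() walks the selected columns and the thresholds in parallel,
# returning one logical column per marker.
u <- mapply(`>`, as.data.frame(my_mat[, cols]), threshold_df$threshold)

bool_mat <- 0 * my_mat   # keeps the dimnames, sets every value to 0
bool_mat[, cols] <- u    # logical values are coerced to 0/1 on assignment
```

Writing into a separate bool_mat instead of overwriting my_mat keeps the original data around for checking the result.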

Upvotes: 2

Ronak Shah

Reputation: 389355

You can use sweep on selected columns.

zero_mat[,threshold_df$marker] <- +(sweep(my_mat[, threshold_df$marker], 2, threshold_df$threshold, `>`))

zero_mat

#                    mpg cyl disp hp drat wt qsec
#Mazda RX4             0   0    0  0    0  0    0
#Mazda RX4 Wag         0   0    0  0    0  0    0
#Datsun 710            0   0    0  0    0  0    0
#Hornet 4 Drive        0   0    0  0    0  0    0
#Hornet Sportabout     0   0    1  0    0  0    0
#Valiant               0   0    0  0    0  0    0
#Duster 360            0   0    1  1    0  0    0
#Merc 240D             0   0    0  0    0  0    0
#Merc 230              0   0    0  0    0  0    0
#...
#...

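The same logic also works with a double transpose. After the first t(), each marker is a row, so the threshold vector is recycled down the columns and each value is compared with the threshold for its own marker.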

zero_mat[,threshold_df$marker] <- +(t(t(my_mat[, threshold_df$marker]) > threshold_df$threshold))

The + at the beginning converts the logical values (TRUE/FALSE) to integer values (1/0).
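As a quick sanity check, the sketch below compares the two variants side by side and shows what the unary + does. It rebuilds the inputs so it runs on its own; cols, thr, via_sweep and via_t are illustrative names, not objects from the question.

```r
data(mtcars)
my_mat <- as.matrix(mtcars[, 1:7])
cols <- c("disp", "hp", "wt")
thr  <- apply(my_mat[, cols], 2, quantile, 0.75)

# Variant 1: sweep() applies `>` column by column (MARGIN = 2).
via_sweep <- +(sweep(my_mat[, cols], 2, thr, `>`))

# Variant 2: transpose so that recycling lines up with the markers,
# compare, then transpose back.
via_t <- +(t(t(my_mat[, cols]) > thr))

# Both variants should flag exactly the same cells.
all(via_sweep == via_t)

# Unary + turns a logical into an integer.
is.integer(+TRUE)
```

Either result can then be assigned into zero_mat[, cols] as shown above.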

Upvotes: 2
