How to only show values with certain column names in matrix or heatmap

Question

I have a matrix whose rows and columns are matched samples. I have produced a nice heatmap from this matrix, but for the sake of clarity I want to show only those values where the column name matches the row name according to a grep. The data:

    structure(c(447, 439, 365, 359, 342, 341, 382, 356, 364, 309, 
295, 527, 410, 415, 323, 291, 292, 266, 323, 337, 309, 366, 331, 
284, 414, 425, 316, 419, 420, 350, 301, 319, 293, 335, 360, 312, 
341, 313, 284, 328, 327, 312, 344, 361, 324, 323, 328, 357, 426, 
412, 338, 309, 291, 493, 406, 427, 328, 307, 283, 555, 317, 302, 
501, 312, 289, 480, 423, 419, 336, 299, 274, 241, 286, 262, 221, 
418, 415, 352, 310, 291, 572, 445, 428, 361, 320, 317, 543, 324, 
291, 479, 320, 301, 538, 327, 308, 553, 423, 433, 332, 347, 345, 
324, 347, 345, 324, 410, 413, 351, 340, 322, 558, 331, 321, 581, 
306, 285, 525, 407, 397, 327, 325, 310, 565, 309, 314, 461, 339, 
335, 304, 369, 342, 349, 292, 301, 477, 321, 302, 450, 325, 314, 
582, 275, 252, 456, 296, 291, 526, 401, 414, 326, 335, 316, 582
), .Dim = c(3L, 51L), .Dimnames = list(c("AH_026T", "AH_058T", 
"AH_084T"), c("AH_026C", "AH_026C.1", "AH_026C.2", "AH_084C", 
"AH_086C", "AH_086C.1", "AH_086C.2", "AH_086C.3", "AH_088C", 
"AH_094C", "AH_094C.1", "AH_094C.2", "AH_094C.3", "AH_094C.4", 
"AH_094C.5", "AH_094C.6", "AH_094C.7", "AH_096C", "AH_100C", 
"AH_100C.1", "AH_127C", "AH_133C", "ED_008C", "ED_008C.1", "ED_008C.2", 
"ED_008C.3", "ED_016C", "ED_031C", "ED_036C", "GS_001C", "QE_062C", 
"RS_010C", "RS_027C", "RS_027C.1", "RS_027C.2", "SH_051C", "ST_014C", 
"ST_014C.1", "ST_020C", "ST_024C", "ST_033C", "ST_034C", "ST_034C.1", 
"ST_034C.2", "ST_035C", "ST_036C", "ST_040C", "WG_002C", "WG_005C", 
"WG_006C", "WG_019C")))

The grep should match on the first 6 characters only. If the column name and row name don't match, the value can just be NA or zero.

I tried:

tryout<- subset(res_matrixMatchRes,colnames(res_matrixMatchRes)==rownames(res_matrixMatchRes))

but it doesn't work. I think I probably need to use lapply to iterate through the columns and check whether each one matches each row, but then I'm stumped.

LyzandeR · Accepted Answer

One way using sapply:

# df is the matrix from the question.
# For each row name, compare its first 6 characters against the first 6
# characters of every column name. sapply() returns one column of
# TRUE/FALSE per row name, so t() is needed to get the same shape as df.
new_mat <- t(sapply(row.names(df),
                    function(x) substr(x, 1, 6) == substr(colnames(df), 1, 6)))

# Set every cell whose row and column names do not match to 0
df[!new_mat] <- 0

Output:

> df
        AH_026C AH_026C.1 AH_026C.2 AH_084C AH_086C AH_086C.1 AH_086C.2 AH_086C.3 AH_088C AH_094C AH_094C.1 AH_094C.2 AH_094C.3
AH_026T     447       359       382       0       0         0         0         0       0       0         0         0         0
AH_058T       0         0         0       0       0         0         0         0       0       0         0         0         0
AH_084T       0         0         0     527       0         0         0         0       0       0         0         0         0
#result truncated...
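
Since the question allows NA as well as zero, here is a minimal sketch of the same idea using `outer()` instead of `sapply()`, filling non-matching cells with NA. The small matrix `m` and the names `keep` and `masked` are made up for illustration and are not the asker's data.

```r
# Small hypothetical matrix, used only to keep the example short
m <- matrix(1:12, nrow = 3,
            dimnames = list(c("AH_026T", "AH_058T", "AH_084T"),
                            c("AH_026C", "AH_026C.1", "AH_084C", "ST_014C")))

# TRUE where the first 6 characters of the row name equal
# the first 6 characters of the column name
keep <- outer(substr(rownames(m), 1, 6),
              substr(colnames(m), 1, 6),
              "==")

# Work on a copy so the original matrix is left untouched
masked <- m
masked[!keep] <- NA   # use 0 here instead if zeros are preferred

masked
```

Whether NA or 0 is the better fill probably depends on the heatmap function: many of them draw NA cells in a separate colour, whereas 0 is treated as a real value on the colour scale.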
