Reputation: 6874
I have a matrix with matched samples between the rows and columns. I have produced a nice heatmap from this matrix, but for the sake of clarity I want to show only those values where the column name matches the row name according to a grep. The data:
structure(c(447, 439, 365, 359, 342, 341, 382, 356, 364, 309,
295, 527, 410, 415, 323, 291, 292, 266, 323, 337, 309, 366, 331,
284, 414, 425, 316, 419, 420, 350, 301, 319, 293, 335, 360, 312,
341, 313, 284, 328, 327, 312, 344, 361, 324, 323, 328, 357, 426,
412, 338, 309, 291, 493, 406, 427, 328, 307, 283, 555, 317, 302,
501, 312, 289, 480, 423, 419, 336, 299, 274, 241, 286, 262, 221,
418, 415, 352, 310, 291, 572, 445, 428, 361, 320, 317, 543, 324,
291, 479, 320, 301, 538, 327, 308, 553, 423, 433, 332, 347, 345,
324, 347, 345, 324, 410, 413, 351, 340, 322, 558, 331, 321, 581,
306, 285, 525, 407, 397, 327, 325, 310, 565, 309, 314, 461, 339,
335, 304, 369, 342, 349, 292, 301, 477, 321, 302, 450, 325, 314,
582, 275, 252, 456, 296, 291, 526, 401, 414, 326, 335, 316, 582
), .Dim = c(3L, 51L), .Dimnames = list(c("AH_026T", "AH_058T",
"AH_084T"), c("AH_026C", "AH_026C.1", "AH_026C.2", "AH_084C",
"AH_086C", "AH_086C.1", "AH_086C.2", "AH_086C.3", "AH_088C",
"AH_094C", "AH_094C.1", "AH_094C.2", "AH_094C.3", "AH_094C.4",
"AH_094C.5", "AH_094C.6", "AH_094C.7", "AH_096C", "AH_100C",
"AH_100C.1", "AH_127C", "AH_133C", "ED_008C", "ED_008C.1", "ED_008C.2",
"ED_008C.3", "ED_016C", "ED_031C", "ED_036C", "GS_001C", "QE_062C",
"RS_010C", "RS_027C", "RS_027C.1", "RS_027C.2", "SH_051C", "ST_014C",
"ST_014C.1", "ST_020C", "ST_024C", "ST_033C", "ST_034C", "ST_034C.1",
"ST_034C.2", "ST_035C", "ST_036C", "ST_040C", "WG_002C", "WG_005C",
"WG_006C", "WG_019C")))
The grep should match on the first 6 characters only. If the column name and row name don't match, the value can just be NA or zero.
I tried:
tryout<- subset(res_matrixMatchRes,colnames(res_matrixMatchRes)==rownames(res_matrixMatchRes))
but it doesn't work. I think I probably need to use lapply to iterate through the columns and check whether each one matches each row, but then I'm stumped.
Upvotes: 1
Views: 159
Reputation: 37879
One way, using sapply:
# Here df is the matrix from the question.
# Use sapply to compare names: for each row name, test whether its first
# 6 characters equal the first 6 characters of each column name.
# After transposing, the result is a logical matrix with the same shape as df:
# TRUE where the prefixes match, FALSE otherwise.
new_mat <- t(sapply(row.names(df),
                    function(x) substr(x, 1, 6) == substr(colnames(df), 1, 6)))
# Then set every non-matching cell of the original matrix to 0
df[!new_mat] <- 0
Output:
> df
AH_026C AH_026C.1 AH_026C.2 AH_084C AH_086C AH_086C.1 AH_086C.2 AH_086C.3 AH_088C AH_094C AH_094C.1 AH_094C.2 AH_094C.3
AH_026T 447 359 382 0 0 0 0 0 0 0 0 0 0
AH_058T 0 0 0 0 0 0 0 0 0 0 0 0 0
AH_084T 0 0 0 527 0 0 0 0 0 0 0 0 0
#result truncated...
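Since the question also allows NA for non-matching cells, here is a minimal sketch of the same idea using outer() instead of sapply. The small matrix m and the names keep and masked are made up for illustration (m holds three columns taken from the question's data); treat this as one possible variant rather than the only way to do it.

```r
# Hypothetical 3 x 3 subset of the question's matrix, for illustration only
m <- matrix(c(447, 439, 365,
              359, 342, 341,
              309, 295, 527),
            nrow = 3,
            dimnames = list(c("AH_026T", "AH_058T", "AH_084T"),
                            c("AH_026C", "AH_026C.1", "AH_084C")))

# outer() compares every row prefix with every column prefix and returns
# a logical matrix with the same shape as m
keep <- outer(substr(rownames(m), 1, 6), substr(colnames(m), 1, 6), "==")

# Work on a copy so the original matrix is left untouched,
# and mask non-matching cells with NA instead of 0
masked <- m
masked[!keep] <- NA
masked
```

Using NA rather than 0 keeps "no match" distinct from a genuine zero value, which may matter for the colour scale of the heatmap. How NA cells are drawn depends on the plotting function you use.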
Upvotes: 1