Reputation: 148
I have three forms of data.
A data frame, info.data, defined as:
id.num <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 20, 21, 22, 23, 25, 30, 31, 32, 33, 34, 35)
id.name <- c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fifteen", "twenty", "tyone", "tytwo", "tythre","tyfive", "thrty", "thrtyone", "thrtytwo", "thrtythree", "thrtyfour", "thrtyfiv")
info.data <- data.frame(id.num, id.name)
row.names(info.data)<- c("x1","x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9", "x10", "x11", "x12", "x13", "x15", "x20", "x21", "x22", "x23", "x25","x30", "x31","x32", "x33", "x34","x35")
A matrix, mat, which shares some of its row names with info.data:
# random 0/1 matrix, made symmetric with a zero diagonal
mat <- matrix(sample(0:1, 100, replace = TRUE), nrow = 10, ncol = 10)
diag(mat) <- 0
mat[lower.tri(mat)] <- t(mat)[lower.tri(mat)]
rownames(mat) <- paste("x", 3:12, sep = "")
colnames(mat) <- paste("x", 3:12, sep = "")
And a list, req.l, whose elements hold some of the id.name values from info.data:
req.l<- list(L1=info.data$id.name[2:8],LL1=(info.data$id.name[1:5]),LLL1=(info.data$id.name[8:21]))
I want to choose one element of the list, say LL1, and extract the corresponding sub-matrix of mat (for whichever of its values are present in mat), so that the result has the list values as row and column names, like this:
      three four five
three     0    0    1
four      0    0    0
five      1    0    0
I tried using %in% across several lines, but the code became lengthy. I also have to change the list name and so on every time, which gets confusing.
Is there a neat way to do this? Can grep be used in this situation?
Upvotes: 0
Views: 96
Reputation: 206401
There are a few steps to jump through here, but let's break it down.
First, we need to find the rows in info.data for the values in the list we choose. We can do that with
info.data$id.name %in% req.l[["L1"]]
Now we need to find the row names those values correspond to, because those are the names used in the matrix.
rownames(info.data)[info.data$id.name %in% req.l[["L1"]]]
does that. Now we want only those names that are also in the matrix, so we'll just take the overlapping values
intersect(
rownames(info.data)[info.data$id.name %in% req.l[["L1"]]],
colnames(mat)
)
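For the question's data, the intermediate values of these three steps can be checked directly. The sketch below rebuilds only what it needs: because just the dimnames of mat matter at this point, the matrix is filled with zeros, which is a simplification assumed here and not part of the original data.

```r
# Rebuild the lookup table and list from the question.
id.num  <- c(1:13, 15, 20:23, 25, 30:35)
id.name <- c("one", "two", "three", "four", "five", "six", "seven", "eight",
             "nine", "ten", "eleven", "twelve", "thirteen", "fifteen",
             "twenty", "tyone", "tytwo", "tythre", "tyfive", "thrty",
             "thrtyone", "thrtytwo", "thrtythree", "thrtyfour", "thrtyfiv")
info.data <- data.frame(id.num, id.name, stringsAsFactors = FALSE,
                        row.names = paste0("x", id.num))
req.l <- list(L1   = info.data$id.name[2:8],
              LL1  = info.data$id.name[1:5],
              LLL1 = info.data$id.name[8:21])

# Assumption: a zero-filled stand-in for mat; only its names are used here.
nms <- paste0("x", 3:12)
mat <- matrix(0, nrow = 10, ncol = 10, dimnames = list(nms, nms))

# Step 1: logical vector marking the rows of info.data named in the list.
in.list <- info.data$id.name %in% req.l[["L1"]]

# Step 2: the row names of those rows ("x2" through "x8" for L1).
candidates <- rownames(info.data)[in.list]

# Step 3: keep only the names that also exist in the matrix.
mc <- intersect(candidates, colnames(mat))
mc
# [1] "x3" "x4" "x5" "x6" "x7" "x8"
```

Here "x2" drops out because mat has no row or column with that name.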
This is finally the list of rows/cols we want from mat. Now we can subset
mc <- intersect(
rownames(info.data)[info.data$id.name %in% req.l[["L1"]]],
colnames(mat)
)
mat[mc,mc]
And then we need to rename the dimensions, so we go back to the data.frame to get the names.
out <- mat[mc,mc]
dimnames(out) <- replicate(2, info.data[mc,"id.name"], simplify=FALSE)
out
And since this was all based on the string "L1", you can easily replace that value with whatever you want, or with a variable.
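One way to avoid editing the list name by hand is to wrap the steps in a function. The following is only a sketch: the name subset_by_list and its argument list are hypothetical, set.seed(1) is added purely so the example is repeatable, and drop = FALSE is an addition so that a single matching name still yields a matrix.

```r
# Example data, as in the question (set.seed is an assumption for repeatability).
set.seed(1)
id.num  <- c(1:13, 15, 20:23, 25, 30:35)
id.name <- c("one", "two", "three", "four", "five", "six", "seven", "eight",
             "nine", "ten", "eleven", "twelve", "thirteen", "fifteen",
             "twenty", "tyone", "tytwo", "tythre", "tyfive", "thrty",
             "thrtyone", "thrtytwo", "thrtythree", "thrtyfour", "thrtyfiv")
info.data <- data.frame(id.num, id.name, stringsAsFactors = FALSE,
                        row.names = paste0("x", id.num))
mat <- matrix(sample(0:1, 100, replace = TRUE), nrow = 10, ncol = 10)
diag(mat) <- 0
mat[lower.tri(mat)] <- t(mat)[lower.tri(mat)]
dimnames(mat) <- list(paste0("x", 3:12), paste0("x", 3:12))
req.l <- list(L1   = info.data$id.name[2:8],
              LL1  = info.data$id.name[1:5],
              LLL1 = info.data$id.name[8:21])

# Hypothetical helper: same steps as above, parameterised by the list name.
subset_by_list <- function(list.name, mat, info.data, req.l) {
  # rows of info.data whose id.name is in the chosen list element
  wanted <- info.data$id.name %in% req.l[[list.name]]
  # row names of those rows that also exist in the matrix
  mc <- intersect(rownames(info.data)[wanted], colnames(mat))
  # drop = FALSE keeps a matrix even when only one name matches
  out <- mat[mc, mc, drop = FALSE]
  # relabel both dimensions with the corresponding id.name values
  dimnames(out) <- replicate(2, as.character(info.data[mc, "id.name"]),
                             simplify = FALSE)
  out
}

# One list element at a time ...
res <- subset_by_list("LL1", mat, info.data, req.l)

# ... or every element of req.l in one go.
all.out <- lapply(names(req.l), subset_by_list,
                  mat = mat, info.data = info.data, req.l = req.l)
names(all.out) <- names(req.l)
```

With this, switching lists means changing only the first argument, for example subset_by_list("LLL1", mat, info.data, req.l).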
Upvotes: 1
Reputation: 13122
There have to be better ways, but this seems valid too:
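lapply(req.l,
       function(X) {
         # row names of info.data for the values in this list element
         tmp = rownames(info.data)[match(X, info.data$id.name)]
         # new dimnames: the list values that are present in mat
         dmnms = replicate(2, as.character(X[tmp %in% unique(unlist(dimnames(mat)))]), simplify = FALSE)
         # index mat by the matching row and column positions
         ret = do.call("[", c(list(mat),
                              lapply(dimnames(mat),
                                     function(x)
                                       na.omit(match(tmp, x)))))
         dimnames(ret) = dmnms
         ret
       })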
#$L1
# three four five six seven eight
#three 0 0 0 0 0 1
#four 0 0 1 0 0 0
#five 0 1 0 1 1 0
#six 0 0 1 0 1 1
#seven 0 0 1 1 0 0
#eight 1 0 0 1 0 0
#
#$LL1
# three four five
#three 0 0 0
#four 0 0 1
#five 0 1 0
#
#$LLL1
# eight nine ten eleven twelve
#eight 0 0 0 1 0
#nine 0 0 1 0 1
#ten 0 1 0 1 1
#eleven 1 0 1 0 1
#twelve 0 1 1 1 0
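One difference between the two approaches is worth noting: indexing with %in% returns names in the order they appear in info.data, while match() returns them in the order of the list element. For the lists in the question the two coincide, because each list element is a contiguous slice of id.name. The sketch below uses a small, hypothetical reordered vector (wanted) to show where they would differ.

```r
# Toy lookup (a shortened stand-in for info.data, assumed for illustration).
id.name <- c("one", "two", "three", "four", "five")
row.ids <- paste0("x", 1:5)

# Hypothetical list element whose order differs from id.name.
wanted <- c("five", "three", "four")

# %in% keeps the order of the lookup table.
by.in <- row.ids[id.name %in% wanted]
by.in
# [1] "x3" "x4" "x5"

# match() keeps the order of the requested values.
by.match <- row.ids[match(wanted, id.name)]
by.match
# [1] "x5" "x3" "x4"
```

If the rows and columns of the result should follow the order of the list, the match() form is the one to use; values missing from id.name would give NA there and need to be dropped.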
Upvotes: 2