Reputation: 148

Subsetting while working with list, dataframe and matrix

I have three forms of data.

A data frame, info.data:

id.num <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 20, 21, 22, 23, 25, 30, 31, 32, 33, 34, 35) 
id.name <- c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fifteen", "twenty", "tyone", "tytwo", "tythre","tyfive", "thrty", "thrtyone", "thrtytwo", "thrtythree", "thrtyfour", "thrtyfiv") 
info.data <- data.frame(id.num, id.name) 
row.names(info.data)<- c("x1","x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9", "x10", "x11", "x12", "x13", "x15", "x20", "x21", "x22", "x23", "x25","x30", "x31","x32", "x33", "x34","x35")

A matrix, mat, that shares some row names with info.data:

# random 0/1 matrix, made symmetric with a zero diagonal
mat <- matrix(sample(0:1, 100, replace=TRUE), nrow=10, ncol=10)
diag(mat) <- 0
mat[lower.tri(mat)] <- t(mat)[lower.tri(mat)]
# label rows and columns x3 ... x12
row.names(mat) <- paste0("x", 3:12)
colnames(mat) <- paste0("x", 3:12)

And a list, req.l, containing some of the id.name values from info.data:

req.l<- list(L1=info.data$id.name[2:8],LL1=(info.data$id.name[1:5]),LLL1=(info.data$id.name[8:21]))

I want to choose one list element, say LL1, and extract the corresponding submatrix of mat (for whichever values are present), with the list values as its row and column names. The output would look like this:

      three four five
three     0    0    1
four      0    0    0
five      1    0    0

I tried using %in% over a couple of lines, but the code is getting lengthy. I also have to change the list name and so on every time, which is confusing and error-prone.
Is there a neat way to do this? Can grep be used in this situation?

Upvotes: 0

Answers (2)

MrFlick

Reputation: 206401

There are a few steps to go through here, so let's break it down.

First, we need to find the rows in info.data for the values in the list we choose. We can do that with

info.data$id.name %in% req.l[["L1"]]

Now we need to find the row names those values correspond to, because those are the names used in the matrix.

rownames(info.data)[info.data$id.name %in% req.l[["L1"]]]

does that. Now we want only those names that are also in the matrix, so we'll just take the overlapping values

intersect(
    rownames(info.data)[info.data$id.name %in% req.l[["L1"]]], 
    colnames(mat)
)

This is the final set of row/column names we want from mat. Now we can subset:

mc <- intersect(
    rownames(info.data)[info.data$id.name %in% req.l[["L1"]]], 
    colnames(mat)
)
mat[mc,mc]

And then we need to rename the dimensions, so here we go back to the data.frame to get the names:

out <- mat[mc,mc]
dimnames(out) <- replicate(2, info.data[mc,"id.name"], simplify=FALSE)
out

And since this was all based on the string "L1", you can easily replace that value with whatever you want, or with a variable.
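
The steps above can be bundled into a function so that the list name is the only thing that changes between calls. This is a minimal sketch, not part of the original answer: `subset_by_list` is a hypothetical helper name, and the toy data is made up so the result is reproducible (the question's `mat` is random).

```r
# Toy data in the same shape as the question's objects (values are made up).
info.data <- data.frame(
  id.num  = 1:8,
  id.name = c("one", "two", "three", "four", "five", "six", "seven", "eight"),
  row.names = paste0("x", 1:8),
  stringsAsFactors = FALSE
)

# Small symmetric 0/1 matrix whose dimnames are a subset of info.data's row names.
mat <- matrix(0, nrow = 4, ncol = 4,
              dimnames = list(paste0("x", 3:6), paste0("x", 3:6)))
mat["x3", "x5"] <- 1
mat["x5", "x3"] <- 1
mat["x4", "x6"] <- 1
mat["x6", "x4"] <- 1

req.l <- list(L1  = info.data$id.name[2:8],
              LL1 = info.data$id.name[1:5])

# Hypothetical helper: subset `m` by the names stored in one element of `lst`.
subset_by_list <- function(list_name, lst = req.l, info = info.data, m = mat) {
  # Row names of `info` whose id.name appears in the chosen list element
  ids <- rownames(info)[info$id.name %in% lst[[list_name]]]
  # Keep only the ids that exist in both dimensions of the matrix
  keep <- intersect(ids, intersect(rownames(m), colnames(m)))
  # drop = FALSE keeps a matrix even when only one id survives
  out <- m[keep, keep, drop = FALSE]
  # Relabel both dimensions with the readable names
  labels <- as.character(info[keep, "id.name"])
  dimnames(out) <- list(labels, labels)
  out
}

res <- subset_by_list("LL1")
print(res)
```

Calling `subset_by_list("L1")` or passing a variable that holds the list name works the same way, which avoids editing the list name in several places.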

Upvotes: 1

alexis_laz

Reputation: 13122

There have to be better ways, but this seems valid too:

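lapply(req.l, 
       function(X) {
          # row names of info.data for each name in this list element
          tmp = rownames(info.data)[match(X, info.data$id.name)]
          # display names for the ids that occur in mat, used for both dimensions
          dmnms = replicate(2, as.character(X[tmp %in% unique(unlist(dimnames(mat)))]), simplify = FALSE)
          # mat[rows, cols]: positions looked up per dimension, non-matches dropped
          ret = do.call("[", c(list(mat), 
                               lapply(dimnames(mat), 
                                         function(x) 
                                            na.omit(match(tmp, x)))))
          dimnames(ret) = dmnms
          ret
       })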
#$L1
#      three four five six seven eight
#three     0    0    0   0     0     1
#four      0    0    1   0     0     0
#five      0    1    0   1     1     0
#six       0    0    1   0     1     1
#seven     0    0    1   1     0     0
#eight     1    0    0   1     0     0
#
#$LL1
#      three four five
#three     0    0    0
#four      0    0    1
#five      0    1    0
#
#$LLL1
#       eight nine ten eleven twelve
#eight      0    0   0      1      0
#nine       0    0   1      0      1
#ten        0    1   0      1      1
#eleven     1    0   1      0      1
#twelve     0    1   1      1      0

Upvotes: 2
