mjtat

Reputation: 13

Selecting out specific columns in data frames embedded within a list

Here's my current problem. I have a list of data frames that hold different values. I want to iterate through the list and, for each data frame, select specific columns based on column names I specify. I then want to store those selected columns in a separate list of data frames.

I've stored the names of the columns I want to extract in another list object.

I've taken a stab at a few approaches, but I'm still at the head scratching stage. Help would be appreciated!

Here's some sample code I've cooked up below:

# Create sample data set of five data frames, 10 x 10
# (rnorm(100) supplies one value per cell, so nothing is recycled)

M1 <- data.frame(matrix(rnorm(100), nrow = 10, ncol = 10))
M2 <- data.frame(matrix(rnorm(100), nrow = 10, ncol = 10))
M3 <- data.frame(matrix(rnorm(100), nrow = 10, ncol = 10))
M4 <- data.frame(matrix(rnorm(100), nrow = 10, ncol = 10))
M5 <- data.frame(matrix(rnorm(100), nrow = 10, ncol = 10))

# Assign data frames to a list object

mlist<-list(M1, M2, M3, M4, M5)

# Creates a data frame object consisting of the different column names I want to extract later

df.names <- data.frame(One = c("X1", "X3", "X5"), Two = c("X2", "X4", "X6"))

# Converts df.names into a set of characters (not sure if this is needed but it has worked for me in the past)

df.char <- lapply(df.names, function(x) as.character(x[1:length(x)]))

# Creates variable m that will be used to iterate in the for loops below

m<-1:length(mlist)



# Creates list object to set aside selected columns from df.names   

mlist.selected<-list()

# A for loop to iterate for each of the df.names elements, and for each dataframe in mlist. *Hopefully* select out the columns of interest labeled in df.names, place into another list object for safe keeping
for (i in 1:length(df.names)) 
        {
        for(j in m)
                {
                # This is the line of code I'm struggling with and I know it doesn't work. :-(
                mlist.selected[j]<-lapply(mlist, function(x) x[df.char[[i]]])

        }
}

Upvotes: 0

Views: 195

Answers (1)

lmo

Reputation: 38500

Using

mlist.selected[[j]] <- lapply(mlist, function(x) x[df.char[[i]]])

in your for loop will get you a bit closer. I'd suggest using a named list with

mlist.selected[[paste("m",j, names(df.names)[i], sep=".")]] <- 
                                                   lapply(mlist, function(x) x[df.char[[i]]])

to get an even nicer output.
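
To see why the double brackets matter, here is a minimal sketch on toy data. The names pieces, out_single and out_double are made up for illustration and are not part of the question's code.

```r
# Toy stand-in for the list that lapply() returns (hypothetical data)
pieces <- list(a = 1:3, b = letters[1:2])

out_single <- list()
# `[<-` replaces element-wise: only the first piece lands in slot 1.
# R would normally warn about the length mismatch; it is silenced here.
suppressWarnings(out_single[1] <- pieces)

out_double <- list()
# `[[<-` stores the entire right-hand side as a single element
out_double[[1]] <- pieces

str(out_single)  # slot 1 holds just the integer vector
str(out_double)  # slot 1 holds the whole two-element list
```

So with single brackets most of the lapply result is silently dropped, while double brackets keep all of it in one slot.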

On inspection, this returns repeated lists, which I don't think you want. If I understand what you are trying to do, you can actually get rid of the inner (j) loop:

# create named list of the data.frames
mlist<-list("M1"=M1, "M2"=M2, "M3"=M3, "M4"=M4, "M5"=M5)

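# start from an empty list so results from earlier attempts don't linger
mlist.selected <- list()

# run the loop: one entry per column set, each holding one data frame per element of mlist
for (i in seq_along(df.names)) {
    mlist.selected[[names(df.names)[i]]] <- lapply(mlist, function(x) x[df.char[[i]]])
}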

This returns a nicely named list to work with. For example, the columns listed in df.names$Two, taken from M2, are available as mlist.selected$Two$M2.
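
If you'd rather avoid the for loop entirely, the same structure can probably be built with two nested lapply calls. This is only a sketch: it recreates the sample data so it runs on its own, and col.sets is a hypothetical plain list standing in for df.names / df.char.

```r
set.seed(1)  # only so the sketch is reproducible

# Five 10 x 10 data frames; data.frame(matrix(...)) names the columns X1..X10
mlist <- lapply(1:5, function(k) data.frame(matrix(rnorm(100), nrow = 10, ncol = 10)))
names(mlist) <- paste0("M", 1:5)

# Column sets to extract, kept as a named list of character vectors
col.sets <- list(One = c("X1", "X3", "X5"), Two = c("X2", "X4", "X6"))

# Outer lapply walks the column sets, inner lapply walks the data frames;
# lapply keeps the names of its input, so the result is named at both levels
mlist.selected <- lapply(col.sets, function(cols) {
    lapply(mlist, function(x) x[cols])
})

names(mlist.selected$Two$M2)
```

Storing the column names in a plain list also sidesteps the factor-to-character conversion step, since no data frame of names is involved.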

Upvotes: 1
