xenophanes

Reputation: 75

Get string names of data frames/objects in list

Suppose I have a list of data frames in R.

testList <- list(df1, df2, df3)

What I'd like to do is apply a function to each data frame in the list, to modify the data frame stored in the environment. Here is pseudocode of what I'm trying to do:

modify <- function(list.of.dfs){
    for (df in list.of.dfs){
        # intended effect: change the original data frame, not a copy
        df$some.new.variable <- some.new.value
    }
}

df1
# (desired: prints df1 with the new variable created)

That's a simple example; for each data frame in the list, the data frame would now have a new variable with some value.

I've almost found a solution. It iterates through the list, builds a command string for each data frame, and evaluates it. The only problem is that the generated command refers to each element by its index in the list (the string version of the loop variable), not by the name of the data frame:

modify <- function(data.list, functionName){
  for(i in seq_along(data.list)){
    command <- paste0(varText(data.list),
                      "[[",
                      i,
                      "]] <- ",
                      varText(functionName),
                      "(",
                      varText(data.list),
                      "[[",
                      i,
                      "]])"
                      )
    evaluate <- parse(text = command)
    print(evaluate)
    eval(evaluate)
  }
  data.list
}

where:

varText <- function(object){
  deparse(substitute(object))
}

So I need to find a way to access data frames, pull their names, and iterate through a list of commands featuring the names of those data frames.

I then want to be able to access those modified data frames in the global environment.

...unless someone knows a better solution to doing this.

Edit: A reproducible example

Suppose I create two data frames and add them to the same list:

df1 <- data.frame(rnorm(100), rnorm(100))
df2 <- data.frame(rnorm(100), rnorm(100))
test.list <- list(df1, df2)

And I create a function that trivially edits a data frame that's passed in:

testFunction <- function(data.frame){
   data.frame$new.variable <- 0
   # return the modified data frame; without this line the function returns 0
   data.frame
}

Then I can use lapply as suggested in the answer:

lapply(test.list, testFunction)

Which returns a list.

However if you call df1 or df2, they remain unchanged. What has been created are modified versions of df1 and df2 but they are stored within the list lapply creates.

I want to just be able to type

df1
df2

And have them be modified.

Is there a way to do this so that you don't have to assign the elements of the list lapply creates to the names of the variables you want to access?

Much, much appreciated!

Upvotes: 0

Views: 237

Answers (1)

AJD

Reputation: 301

If I understand you correctly, you're looking to apply a function to a list of data frames?

In this instance lapply() is your friend: it is more concise than a for() loop and returns the results as a list.

Based on your revised example, try something like:

# data
df1 <- data.frame(rnorm(100), rnorm(100))
df2 <- data.frame(rnorm(100), rnorm(100))
test.list <- list(df1, df2)

# function: add the new column, then return the modified data frame
out.list <- lapply(test.list, function(x) {x$new.variable <- 0; x})

# name the df's in the list and check the output
names(out.list) <- c("df1", "df2")
str(out.list)

This will apply the function to each object within the list, and return the results as a list [with @thelatemail's suggested edit].
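If you'd rather not type the names by hand, one option is to build the list with base R's mget(), which looks variables up by name and returns a named list. The following is only a sketch, assuming the data frames already exist as variables where the code runs; the column names x and y and the seed are made up for illustration.

```r
# Illustrative data; the column names and the seed are assumptions for this sketch
set.seed(1)
df1 <- data.frame(x = rnorm(5), y = rnorm(5))
df2 <- data.frame(x = rnorm(5), y = rnorm(5))

# mget() takes the variable names as strings and returns a *named* list
df.names <- c("df1", "df2")
test.list <- mget(df.names)

# lapply() carries the names of its input over to the result
out.list <- lapply(test.list, function(x) { x$new.variable <- 0; x })

# the string names of the data frames in the list
names(out.list)
```

Because the names travel with the list, the separate names(out.list) <- ... step is not needed in this variant.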

If you then want to access the modified dataframes you can just call them by name:

out.list$df1

Or, if you want to return the df's to the global environment, you can use the following provided you have named the df's using the step above:

list2env(out.list, envir = .GlobalEnv)

That should do what you need.
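For completeness, here is a rough sketch of a helper that works from the variable names directly, using base R's get() and assign(). The helper modify_in_place(), the function add_zero() and the scratch environment are hypothetical names introduced for this example, not part of the answer above. It is shown against a scratch environment so nothing in your workspace is overwritten.

```r
# Hypothetical helper (an assumption for this sketch): apply `fn` to each
# variable named in `df.names` and write the result back under the same name.
modify_in_place <- function(df.names, fn, envir = parent.frame()) {
  for (nm in df.names) {
    assign(nm, fn(get(nm, envir = envir)), envir = envir)
  }
  invisible(df.names)
}

# Demonstrated in a scratch environment rather than .GlobalEnv
scratch <- new.env()
assign("df1", data.frame(a = 1:3), envir = scratch)
assign("df2", data.frame(a = 4:6), envir = scratch)

# Illustrative function that adds a column and returns the data frame
add_zero <- function(x) { x$new.variable <- 0; x }
modify_in_place(c("df1", "df2"), add_zero, envir = scratch)

names(scratch$df1)
```

Calling it at top level with the default envir, or with envir = .GlobalEnv, would overwrite df1 and df2 themselves, which is the behaviour asked for in the question. Keeping the data frames in a named list is usually easier to reason about, though.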

Upvotes: 1
