mvikred

Reputation: 71

Varying variables in R in a for loop

Hi, I'm trying to write code that does the following:

  1. list all the files in a folder
  2. extract the data from each Excel file (each file has multiple sheets, and I am using the XLConnect package to pull the data from the different sheets)
  3. reshape the data into the format that works best for my data analysis

Here is my code:

for (i in 1:n) { 
  wb <- loadWorkbook(filelist[i]) 


  assign(paste("field",i,"_reportcontents",sep = ""),
         readWorksheet(wb, sheet="Report Contents"))

  assign(paste("field",i,"_company",sep=""),
         paste("field",i,"_reportcontents[31,3]",sep = ""))
}

The way the above code is set up, the second variable, field[i]_company, is set to the string "field[i]_reportcontents[31,3]" (with i replaced by the loop index, e.g. "field1_reportcontents[31,3]") rather than to the value stored at that position in the data frame field[i]_reportcontents.

How can I fix the code so that it assigns the value from the data frame rather than a string?

Upvotes: 0

Views: 131

Answers (2)

Paulo MiraMor

Reputation: 1610

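Just assign the sheet being read to a temporary variable and index that data frame directly. Also, paste0(...) is equivalent to paste(..., sep = ""), so you can use it instead.

for (i in seq_len(n)) {
  wb <- loadWorkbook(filelist[i])

  # Read the sheet once into a temporary data frame
  temp <- readWorksheet(wb, sheet = "Report Contents")

  # Keep the full sheet under the same name as in the question
  assign(paste0("field", i, "_reportcontents"), temp)

  # Index the data frame itself, not a string that names it
  assign(paste0("field", i, "_company"), temp[31, 3])
}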

As for why your last assign statement isn't doing what you want, I'll point you to an answer I wrote yesterday. Basically, the value you are trying to assign, paste("field",i,"_reportcontents[31,3]",sep = ""), is just a string. You can use get to look up a variable by its name given as a string, but here the string is not just a variable name: it also contains a call to the function [, so you would need to parse and evaluate the string you construct. In the end, the list approach is a better way to go.
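To make the difference concrete, here is a minimal sketch that uses a small made-up data frame in place of a real worksheet (the column names, the values, and the [2,2] position are assumptions for illustration only). It contrasts the plain string, a lookup with get followed by indexing, and the parse-and-evaluate route:

```r
# Hypothetical stand-in for a sheet read with readWorksheet()
field1_reportcontents <- data.frame(
  a = 1:3,
  b = c("x", "y", "z"),
  stringsAsFactors = FALSE
)
i <- 1

# 1. paste0() only builds a string; nothing is looked up
as_string <- paste0("field", i, "_reportcontents[2,2]")

# 2. get() resolves the variable name, then [ indexes the result
via_get <- get(paste0("field", i, "_reportcontents"))[2, 2]

# 3. parse() + eval() runs the whole string as R code
via_parse <- eval(parse(text = as_string))
```

Both via_get and via_parse hold the cell value, while as_string is still just text. The second form is usually easier to reason about than evaluating constructed code.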

Upvotes: 0

Gregor Thomas

Reputation: 145755

Rather than using assign and pasting together variable names, I would use lists.

wb = lapply(filelist, loadWorkbook)
sheets = lapply(wb, readWorksheet, sheet = "Report Contents")
companies = lapply(sheets, "[", 31, 3) 

You could easily set the names of the lists, e.g.,

names(sheets) = sprintf("field_%s_reportcontents", seq_along(sheets))

But it isn't clear that this is necessary if you just use good object names for your lists.
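As a rough, self-contained illustration of the list approach without needing any Excel files, the sketch below uses two invented data frames in place of the sheets that readWorksheet would return (the contents and the [2, 2] position are assumptions; the real data would use [31, 3]):

```r
# Hypothetical stand-ins for the data frames returned by readWorksheet()
sheets <- list(
  data.frame(key = c("id", "company"), value = c("001", "Acme"),
             stringsAsFactors = FALSE),
  data.frame(key = c("id", "company"), value = c("002", "Globex"),
             stringsAsFactors = FALSE)
)

# Optional: name the list elements, as suggested above
names(sheets) <- sprintf("field_%s_reportcontents", seq_along(sheets))

# Pull the same cell out of every sheet; "[" is called as a function
companies <- lapply(sheets, "[", 2, 2)

# If one value per sheet is expected, vapply() gives a plain character vector
company_vec <- vapply(sheets, function(df) df[2, 2], character(1))
```

Individual results can then be reached by position or by name, e.g. companies[[1]] or companies[["field_2_reportcontents"]], with no need for assign or get.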

See also "How to make a list of data frames" for more soapboxing about the benefits of lists.

Upvotes: 1
