JeffWithpetersen
JeffWithpetersen

Reputation: 147

Loop through all Excel files in directory, amend data and save dataframe back to existing Excel file in R

What I am looking to do is loop through all Excel files in a directory and, where value in column 'Account.ID' is a match to my created 'IDList', changed column 'Type' value to 'REDACTED'. Dataframe then needs to be exported back to the original Excel file, saved, closed and on to the next file.

The below code works for obtaining the list of files within the directory, opening the first file and making the necessary amendment.

Can anyone help please with writing the data back to the original Excel file and ensuring the loop works through to completion so all files are amended?

#ID list contains thousands, simplified for ease of reading
IDList <- c("7c850aaa", "07311bbb",)

MYFILEPATH <- "\\\\dcf.network\\data\\\\R\\Test Folder"

# get a vector of all filenames
files <- list.files(
  path=MYFILEPATH,
  full.names = TRUE,
  recursive = TRUE
)

for (i in 1:length(files)) {
  data <- read.xlsx(files[i])
  
  cols <- c("Type")
  data[data$Account.ID %in% IDList, cols] <- "REDACTED"
    
}

Upvotes: 1

Views: 1857

Answers (1)

TrainingPizza
TrainingPizza

Reputation: 1150

Give the openxlsx package a try. I'm assuming there's only one sheet per excel file, but you could theoretically accommodate multiple sheets with the openxlsx package as well if the contents are data tables. For the one sheet solution, it should just be one additional line in your loop to overwrite the original file with the updated data. You can use the openxlsx package to read the data in as well.

install.packages("openxlsx", dependencies = TRUE)

#ID list contains thousands, simplified for ease of reading
IDList <- c("7c850aaa", "07311bbb")

MYFILEPATH <- "\\\\dcf.network\\data\\\\R\\Test Folder"

# get a vector of all filenames
files <- list.files(
  path=MYFILEPATH,
  full.names = TRUE,
  recursive = TRUE
)

for (i in 1:length(files)) {
  data <- openxlsx::read.xlsx(files[i])
  
  cols <- c("Type")
  data[data$Account.ID %in% IDList, cols] <- "REDACTED"
  
  openxlsx::write.xlsx(data, file = files[i], overwrite = TRUE)  
}

As an aside, it might be useful to add something like pattern = ".xlsx" to list.files depending on what is in the folders and how precise you need to be so that only excel files are read in. e.g.

files <- list.files(
  path=MYFILEPATH,
  full.names = TRUE,
  recursive = TRUE,
  pattern = ".xlsx"
)

Upvotes: 3

Related Questions