MidnightDataGeek

Reputation: 938

Looping through files using dynamic name variable in R

I have a large number of files to import which are all saved as zip files.

From reading other posts it seems I need to pass the zip file name and then the name of the file I want to open. Since I have a lot of them I thought I could loop through all the files and import them one by one.

Is there a way to pass the name dynamically or is there an easier way to do this?

Here is what I have so far:

Temp_Data <- NULL
Master_Data <- NULL


file.names <- c("f1.zip", "f2.zip", "f3.zip", "f4.zip", "f5.zip")

for (i in 1:length(file.names)) {
    zipFile <- file.names[i]
    dataFile <- sub(".zip", ".csv", zipFile)

    Temp_Data <- read.table(unz(zipFile, 
                            dataFile), sep = ",")

    Master_Data <- rbind(Master_Data, Temp_Data)

}

I get the following error:

In open.connection(file, "rt") :

I can import them manually using:

dt <- read.table(unz("D:/f1.zip", "f1.csv"), sep = ",")

I can create the string dynamically, but it feels long-winded, and it doesn't work when I wrap it with read.table(unz(...)). It seems it can't find the file, so it throws an error:

cat(paste(toString(shQuote(paste("D:/",zipFile, sep = ""))),",",
      toString(shQuote(dataFile)), sep = ""), "\n")

But if I then print this to the console I get:

"D:/f1.zip","f1.csv"

I can then paste this into read.table(unz(...)) and it works, so I feel like I am close.

I've tagged in data.table since this is what I almost always use so if it can be done with 'fread' that would be great.

Any help is appreciated

Upvotes: 0

Views: 2253

Answers (1)

Hassan.JFRY

Reputation: 1072

You can use list.files here.

First, set your working directory to the folder where all your files are stored:

setwd("C:/Users/...")

Then list the zip files in that folder:

file.names = list.files(pattern = "\\.zip$", recursive = FALSE)
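A note on the pattern argument: list.files treats it as a regular expression rather than a shell glob, so an anchored pattern with an escaped dot is the stricter choice. The sketch below shows this on a scratch directory; the directory name zip_demo and the dummy file names are made up for illustration, and it assumes each archive f1.zip holds a member called f1.csv, as in the question. Asking for full.names = TRUE returns complete paths, which should make the setwd call optional.

```r
# Hypothetical scratch directory with dummy files, so the sketch runs anywhere
zip_dir <- file.path(tempdir(), "zip_demo")
dir.create(zip_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  zip_dir, c("f1.zip", "f2.zip", "notes.txt", "unzip_log.csv")
)))

# "\\.zip$" means: a literal dot, then "zip", then the end of the name
file.names <- list.files(zip_dir, pattern = "\\.zip$", full.names = TRUE)

# Name of the CSV expected inside each archive (assumption: f1.zip -> f1.csv)
data.files <- sub("\\.zip$", ".csv", basename(file.names))

print(basename(file.names))
print(data.files)
```

With full paths in file.names, unz(file.names[i], data.files[i]) gets the archive location and the member name as two separate arguments.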

Then your for loop will be:

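library(readr)  # provides write_delim

Master_Data <- NULL

for (i in seq_along(file.names)) {
    # open the file: f1.zip is assumed to contain f1.csv
    zipFile <- file.names[i]
    dataFile <- sub(".zip", ".csv", zipFile, fixed = TRUE)

    Temp_Data <- read.table(unz(zipFile, dataFile), sep = ",")

    # your function for the opened file
    Master_Data <- rbind(Master_Data, Temp_Data)
}

# finally, write the combined data once; "Master_Data.tsv" is a placeholder name
# (writing to file.names[[i]] would overwrite the zip archives themselves)
write_delim(Master_Data, "Master_Data.tsv", delim = "\t",
            col_names = TRUE)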
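
Since the question asks about fread, here is one possible data.table variant. It is only a sketch: the helper name read_zipped_csvs is made up, and it assumes each archive name.zip contains a member called name.csv. Each member is extracted to a temporary directory with utils::unzip, read with fread, and the pieces are combined with a single rbindlist call instead of growing the table with rbind inside the loop. Unlike read.table(..., sep = ","), fread guesses whether the first row is a header, so pass header = FALSE through ... if the files have none.

```r
library(data.table)

# Hypothetical helper: read the CSV stored inside each zip archive and stack
# the results. Assumes "name.zip" contains a member called "name.csv".
read_zipped_csvs <- function(zip_paths, ...) {
  tables <- lapply(zip_paths, function(zip_path) {
    csv_name <- sub("\\.zip$", ".csv", basename(zip_path))

    # Extract only that member into a fresh temporary directory,
    # and remove the directory again once this file has been read
    out_dir <- tempfile("unz_")
    dir.create(out_dir)
    on.exit(unlink(out_dir, recursive = TRUE), add = TRUE)

    csv_path <- utils::unzip(zip_path, files = csv_name, exdir = out_dir)
    fread(csv_path, ...)  # extra arguments (e.g. header = FALSE) go to fread
  })

  # Combine all tables in one step
  rbindlist(tables)
}
```

For the files in the question, the call would look something like Master_Data <- read_zipped_csvs(file.path("D:", file.names)), where file.path builds "D:/f1.zip" and so on.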

Upvotes: 2
