Yan Song

Reputation: 110

Use R to unzip and rename file

I have used R to download about 200 zip files. The zipped files are named in mmyy.dat format. The next step is to use R to unzip all of them and rename each one to yymm.txt. I know that the function unzip can unpack the files, but I am not sure which of its arguments, if any, can also change the name and extension of the extracted files.

And when I unzip the files using

for (i in seq_along(destfile)) {
  unzip(destfile[i], exdir = 'C:/data/cps1')
}

The extracted files are named jan94pub.cps, but they are supposed to be jan94pub.dat. The code I use to download the files is below.

month_vec <- c('jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec')
year_vec  <- c('94','95','96','97','98','99','00','01','02','03','04','05','06','07','08','09','10','11','12','13','14')
url   <- "http://www.nber.org/cps-basic/"
month_year_vec <- apply(expand.grid(month_vec, year_vec), 1, paste, collapse="")
bab <- 'pub.zip'
url1 <- paste(url, month_year_vec, bab, sep='')
# Build all destination paths once, outside the loop
destfile <- paste('C:/data/cps1/', month_year_vec, bab, sep='')
for (i in seq_along(url1)) {
  # mode='wb' downloads in binary mode so the zip archives are not corrupted on Windows
  download.file(url1[i], destfile[i], mode='wb')
}
for (i in seq_along(destfile)) {
  unzip(destfile[i], exdir = 'C:/data/cps1')
}

When I use str(destfile), the filenames are correct, jan94pub.dat. I don't see where my code goes wrong.

Upvotes: 0

Views: 4070

Answers (1)

Paul Hiemstra

Reputation: 60994

I'd do something like:

# The first argument of list.files is the directory, so the pattern has to be named
file_list = list.files(pattern = '\\.zip$')
lapply(file_list, unzip)

Next you want to use the same kind of lapply trick in combination with strptime to convert the name of the file to a date:

# Note: I prepended 01 (the day) to the name; you can use paste for this
t = strptime('010101.txt', format = '%d%m%y.txt')
[1] "2001-01-01"

You will need to tweak the filename a bit to get a reliable date, as the month and the year alone are not enough. Next you can use strftime to transform it back to your desired yymm.txt format:

strftime(t, format = '%y%m.txt')   # %m is the month; %d would insert the day instead
[1] "0101.txt"
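
If the extracted files carry a month abbreviation, as in the jan94pub.cps name from the question, the same conversion can be done without date parsing at all. The following is only a sketch: the helper name to_yymm is made up for illustration, and it assumes every file name starts with a three-letter English month followed by a two-digit year. It looks the month up in a fixed vector instead of using strptime, which avoids depending on the locale.

```r
# Hypothetical helper (not from the original answer): turn names such as
# "jan94pub.cps" into "9401.txt". Assumes the name starts with a
# three-letter English month abbreviation followed by a two-digit year.
month_abbr <- c('jan','feb','mar','apr','may','jun',
                'jul','aug','sep','oct','nov','dec')

to_yymm <- function(fname) {
  base <- basename(fname)
  # Position of the abbreviation in month_abbr is the month number
  mon <- match(tolower(substr(base, 1, 3)), month_abbr)
  yy  <- substr(base, 4, 5)
  sprintf('%s%02d.txt', yy, mon)
}

to_yymm(c('jan94pub.cps', 'dec00pub.dat'))
# [1] "9401.txt" "0012.txt"
```

Names that do not start with a recognised month come back containing NA, so it is worth checking the result before renaming anything.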

Then you can use file.rename to perform the actual moving. To get this functionality into one function call, create a function which performs all the steps:

unzip_and_move = function(path) {
    # - Get file list
    # - Unzip files
    # - create output file list
    # - Move files
}
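
One way the skeleton above could be filled in is sketched below. It is not a definitive implementation: it assumes every archive in path contains a single file whose name starts with a three-letter English month and a two-digit year (like jan94pub.cps in the question), and it builds the new name by looking the month up in a vector rather than by going through strptime. It relies on unzip returning the paths of the files it extracted.

```r
# A sketch of unzip_and_move under the assumptions stated above.
month_abbr <- c('jan','feb','mar','apr','may','jun',
                'jul','aug','sep','oct','nov','dec')

unzip_and_move <- function(path) {
  # - Get file list
  zips <- list.files(path, pattern = '\\.zip$', full.names = TRUE)
  renamed <- character(0)
  for (z in zips) {
    # - Unzip files; unzip() returns the paths it extracted to
    extracted <- unzip(z, exdir = path)
    # - Create output file list, e.g. jan94pub.cps -> 9401.txt
    base   <- basename(extracted)
    mon    <- match(tolower(substr(base, 1, 3)), month_abbr)
    target <- file.path(path, sprintf('%s%02d.txt', substr(base, 4, 5), mon))
    # - Move files
    file.rename(extracted, target)
    renamed <- c(renamed, target)
  }
  invisible(renamed)
}
```

With the directory from the question, the call would be unzip_and_move('C:/data/cps1'). The function returns the new paths invisibly, so the result can be assigned and inspected afterwards.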

Upvotes: 1
