csmontt

Reputation: 624

Can't unzip file in R

I downloaded some data from a URL, but I'm not able to unzip any of the downloaded files and I can't understand why. The code for downloading the data follows.

library(downloader)
path <- getwd()
for (i in 1:15) {
  fileName <- sprintf("%02d", i)
  # the files are saved as R01.zip, R02.zip, ..., so check for that name
  if (!file.exists(paste0("R", fileName, ".zip"))) {
    urlFile <- paste0("http://www.censo2017.cl/wp-content/uploads/2016/12/R",
                      fileName, ".zip")
    download(urlFile, dest = paste0("./R", fileName, ".zip"), mode = "wb")
  }
}

This gives me 15 zip files named R01.zip, R02.zip, and so on, but when I run

unzip("R01.zip")

or try to unzip any other file, I get the following warning:

Warning message:
In unzip("R01.zip") : error 1 in extracting from zip file

I've read related StackOverflow posts such as this one or this one, but no solution works in my case.

I can unzip the files manually, but I would like to do it directly within RStudio. Any ideas?

P.S. The .zip files contain geographical data, that is, .dbf, .prj, .shp files, etc.

Thanks!

Upvotes: 1

Views: 4680

Answers (2)

csmontt

Reputation: 624

OK, so based on this post I was able to work out a solution.

Since the files were not actually .zip files, and since 7-Zip could extract them manually, I looked for a way of calling 7-Zip from within R. The link I just posted shows how to do that.

I modified my code; now the files are downloaded and extracted automatically.

# load necessary packages
library(downloader)
library(installr)
install.7zip(page_with_download_url = "http://www.7-zip.org/download.html")

# download and extract the data
# the files correspond to the fifteen administrative regions of Chile,
# numbered in order
path <- getwd()
for (i in 1:15) {
  fileName <- sprintf("%02d", i)  # zero-pad single-digit indices
  # download only if the file is not already present
  if (!file.exists(paste0("R", fileName, ".zip"))) {
    urlFile <- paste0("http://www.censo2017.cl/wp-content/uploads/2016/12/R",
                      fileName, ".zip")
    download(urlFile, dest = paste0("./R", fileName, ".zip"), mode = "wb")
  }
  # extract only if the output folder does not already exist
  if (!file.exists(paste0("R", fileName))) {
    z7path <- shQuote('C:\\Program Files (x86)\\7-Zip\\7z')
    file <- paste0(getwd(), "/", "R", fileName, ".zip")
    cmd <- paste0(z7path, ' e ', file, ' -y -o',
                  paste0(getwd(), "/R", fileName), '/')
    shell(cmd)  # shell() is only available on Windows
  }
}

It would be awesome if someone can tell me if this solution works for you too!

Upvotes: 2

Spacedman

Reputation: 94182

They're not zip files, they are RAR archives:

$ unrar v 01.zip

UNRAR 5.00 beta 8 freeware      Copyright (c) 1993-2013 Alexander Roshal

Archive: 01.zip
Details: RAR 4

 Attributes      Size    Packed Ratio   Date   Time   Checksum  Name
----------- ---------  -------- ----- -------- -----  --------  ----
    ..A....      1213       240  19%  23-11-16 16:12  C6C40C6D  R01/Comuna.dbf
    ..A....       151       138  91%  23-11-16 16:12  A3C83CE4  R01/Comuna.prj
    ..A....       212       165  77%  23-11-16 16:12  01752C2A  R01/Comuna.sbn
    ..A....       132       101  76%  23-11-16 16:12  C4CA93A2  R01/Comuna.sbx

I don't know if there's an R function for extracting RAR archives.
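One possible way around that, as a rough sketch only: call the same unrar command-line tool from R with base R's system2(). This assumes the unrar binary is installed and on the PATH; extract_rar is a hypothetical helper name, not an existing R function.

```r
# Hypothetical helper: extract a RAR archive by shelling out to `unrar`.
# Assumption: the `unrar` binary is installed and on the PATH.
extract_rar <- function(archive, exdir = ".") {
  if (!nzchar(Sys.which("unrar"))) {
    stop("unrar was not found on the PATH")
  }
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  # `x` extracts with full paths, `-y` answers yes to every prompt;
  # the trailing slash marks the last argument as the destination directory
  args <- c("x", "-y", shQuote(archive), shQuote(paste0(exdir, "/")))
  status <- system2("unrar", args)
  invisible(status == 0)
}
```

Something like extract_rar("R01.zip") would then extract into the working directory, where the archive's own R01/ folder should appear.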

They probably shouldn't have .zip file extensions, but .rar instead. I've extracted the above using unrar on the command line.
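If you want to check this from R rather than from the command line, one option is to look at the first few bytes of the file instead of trusting the extension. The sketch below is only an illustration: archive_type is a hypothetical helper, and it only knows the usual ZIP local-file signature ("PK" followed by bytes 03 04) and the "Rar!" signature, so anything else (including an empty zip archive) comes back as "unknown".

```r
# Hypothetical helper: guess the archive type from the file's leading bytes,
# ignoring the file extension.
archive_type <- function(path) {
  sig <- readBin(path, what = "raw", n = 6)
  zip_sig <- as.raw(c(0x50, 0x4b, 0x03, 0x04))              # "PK" 03 04
  rar_sig <- as.raw(c(0x52, 0x61, 0x72, 0x21, 0x1a, 0x07))  # "Rar!" 1A 07
  if (length(sig) >= 4 && identical(sig[1:4], zip_sig)) {
    "zip"
  } else if (length(sig) >= 6 && identical(sig[1:6], rar_sig)) {
    "rar"
  } else {
    "unknown"
  }
}
```

Running archive_type("R01.zip") on one of the downloaded files should then tell you which extraction tool to reach for before unzip() gets a chance to fail.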

Upvotes: 4
