Reputation: 1223
I've reviewed multiple StackOverflow questions and answers and still can't get a .zip file successfully downloaded, unzipped, and loaded using R exclusively.
When I download the .zip folder manually, I see that it contains multiple files, one named loan.csv, that I need to analyze in R.
#set wd
wd <- "/Users/myname/Documents/zip_folder"
setwd(wd)
zip_url <- "https://www.kaggle.com/wendykan/lending-club-loan-data/downloads/lending-club-loan-data.zip"
I'm getting an error with the first answer I found here:
library(utils)
temp <- tempfile()
download.file(zip_url, temp)
data <- read.table(unz(temp, "loan.csv"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") :
cannot open zip file '/var/folders/b1/d481ykzd3j14kr8nkx8kn83m0000gn/T//RtmpcjmrIa/file932f730721c5'
unlink(temp)
Error in fread(unz(temp, "loan.csv")) :
'input' must be a single character string containing a file name, a command, full path to a file, a URL starting 'http[s]://', 'ftp[s]://' or 'file://', or the input data itself
I'm also getting an error using the 5th answer (Mac specific) to the SO question hyperlinked above:
loans <- fread("curl https://www.kaggle.com/wendykan/lending-club-loan-data/downloads/lending-club-loan-data.zip | tar -xf- --to-stdout *loan.csv")
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0100 149 100 149 0 0 334 0 --:--:-- --:--:-- --:--:-- 334
tar: Unrecognized archive format
tar: *loans.csv: Not found in archive
tar: Error exit delayed from previous errors.
Error in fread("curl https://www.kaggle.com/wendykan/lending-club-loan-data/downloads/lending-club-loan-data.zip | tar -xf- --to-stdout *loans.csv") :
File is empty: /var/folders/b1/d481ykzd3j14kr8nkx8kn83m0000gn/T//RtmpcjmrIa/file932f299c7cc4
Upvotes: 0
Views: 1004
Reputation: 546183
The failures have different causes:
fread doesn’t work with unz. unz does work with read.table, though.
fread does work with more extensive shell commands, but you cannot untar a ZIP file because it’s not a TAR archive. You can use funzip, as suggested in the same answer (but only if your ZIP archive contains just a single file).
Alternatively, you could simply use the unzip R function.
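As a rough illustration of the unzip route, here is a minimal sketch. The helper name read_csv_from_zip and its arguments are hypothetical and not part of any package. It assumes the archive is already on disk (for example after the manual download mentioned in the question), because the curl progress output in the question shows only 149 bytes received, which suggests the download itself may not have returned a real ZIP archive.

```r
# Hypothetical helper: extract one member of a ZIP archive and read it as CSV.
# Uses only base R / utils (unzip, read.csv).
read_csv_from_zip <- function(zip_path, member, exdir = tempfile("unzipped_")) {
  # List the archive contents first; this should fail early if the file
  # is not a readable ZIP archive.
  contents <- unzip(zip_path, list = TRUE)
  if (!member %in% contents$Name) {
    stop("'", member, "' not found in archive; it contains: ",
         paste(contents$Name, collapse = ", "))
  }
  # Remove the scratch directory once the data has been read.
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)
  # Extract only the requested member; unzip() returns the extracted path.
  extracted <- unzip(zip_path, files = member, exdir = exdir)
  read.csv(extracted, stringsAsFactors = FALSE)
}

# Usage, assuming the archive was saved manually into the working directory:
# loans <- read_csv_from_zip("lending-club-loan-data.zip", "loan.csv")
```

Listing the contents before extracting means a bad download or a wrong member name produces a readable error rather than a "cannot open the connection" message later on.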
Upvotes: 1