x1carbon

Reputation: 297

R - Web scraping and downloading multiple zip files and save the files without overwriting

I am trying to download multiple zip files from a web link. With the approach below, the downloaded files keep getting overwritten because the file names are the same across years:

library(rvest)

url <- "https://download.open.fda.gov/"
page <- read_html(url)

# pull the <key> entries that point at drug-event zip files
zips <- grep("\\/drug-event", html_nodes(page, "key"), value = TRUE)
zips_i <- gsub(".*\\/drug\\/", "drug/", zips)
zips_ii <- gsub("</key>", "", zips_i)
zips_iii <- paste0(url, zips_ii)

# basename() repeats across years, so later downloads overwrite earlier ones
lapply(zips_iii, function(x) download.file(x, basename(x)))

Is there a way to avoid overwriting the downloaded files?

Upvotes: 0

Views: 864

Answers (1)

x1carbon

Reputation: 297

Here is what I got so far:

# load the library
library(rvest)

# link to get the data from
url <- "https://download.open.fda.gov/"
page <- read_html(url)

# clean the URLs: keep only the drug-event zip keys
zips <- grep("\\/drug-event", html_nodes(page, "key"), value = TRUE)
zips_i <- gsub(".*\\/drug\\/", "drug/", zips)
zips_ii <- gsub("</key>", "", zips_i)
zips_iii <- paste0(url, zips_ii)

# destination vector: one unique file name per URL
id <- seq_along(zips_iii)
destination <- paste0("~/Projects/Projects/fad_ade/", id, ".zip")

# file extraction: write each URL to its own destination
mapply(function(x, y) download.file(x, y, mode = "wb"), x = zips_iii, y = destination)
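Sequential IDs prevent the collisions but discard the informative names. An alternative sketch (not part of the original answer) is to flatten each cleaned key path into a file name by replacing the slashes; since the full key paths are unique, the resulting names are too. The key values below are hypothetical illustrations of the open.fda.gov layout:

```r
# hypothetical key paths as produced by the cleanup above (illustrative values)
keys <- c("drug/event/all_other/drug-event-0001-of-0035.json.zip",
          "drug/event/2004q1/drug-event-0001-of-0001.json.zip",
          "drug/event/2004q2/drug-event-0001-of-0001.json.zip")

# flatten the path into a single file name; uniqueness carries over from the keys
destination <- gsub("/", "_", keys, fixed = TRUE)
# e.g. "drug_event_2004q1_drug-event-0001-of-0001.json.zip"
```

These names could then be passed to the same `mapply(download.file, ...)` call in place of the numeric IDs.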

Upvotes: 1
