Reputation: 287
I am using RSelenium to download a number of .xls files. I got a passable solution using the following code to set up the server; it tells Chrome not to show a pop-up when I click the download link and specifies where to save the file. However, without fail, once I download the 101st file (saved as "report (100).xls"), the download pop-up begins appearing in the browser Selenium is driving.
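library(RSelenium)

# Chrome preferences: suppress the download prompt and set the target directory
eCaps <- list(
  chromeOptions = list(
    prefs = list(
      "profile.default_content_settings.popups" = 0L,
      "download.prompt_for_download" = FALSE,
      "download.default_directory" = "mydownloadpath"
    )
  )
)
rd <- rsDriver(browser = "chrome", port = 4566L, extraCapabilities = eCaps)
browser <- rd$client  # client object referred to as `browser` below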
The function to download then looks like:
vote.downloading <- function(url) {
  # NB: this function assumes browser already up and running, options set correctly
  Sys.sleep(1.5)
  browser$navigate(url)
  down_button <- browser$findElement(
    using = "css selector",
    "table:nth-child(4) tr:nth-child(3) a"
  )
  down_button$clickElement()
}
For reference, the sites I'm getting the download from look like this: http://www.moscow_city.vybory.izbirkom.ru/region/moscow_city?action=show&root=774001001&tvd=4774001137463&vrn=4774001137457&prver=0&pronetvd=null&region=77&sub_region=77&type=427&vibid=4774001137463
The link being used for the download reads "Версия для печати" ("Printable version"), for those who don't know Russian.
I can't simply stop the function when the dialog starts popping up and pick up where I left off, because it's part of a larger function that scrapes links from drop-down menus leading to the pages that contain the download link. Doing so would also be extremely annoying, as there are 400+ files to download.
Is there some way I can alter the Chrome profile or my scraping function to prevent the system dialog from popping up every 101 files? Or is there a better way altogether to get these files downloaded?
Upvotes: 1
Views: 632
Reputation: 78842
No need for Selenium:
library(httr)

httr::GET(
  url = "http://www.moscow_city.vybory.izbirkom.ru/servlet/ExcelReportVersion",
  query = list(
    region = "77",
    sub_region = "77",
    root = "774001001",
    global = "null",
    vrn = "4774001137457",
    tvd = "4774001137463",
    type = "427",
    vibid = "4774001137463",
    condition = "",
    action = "show",
    version = "null",
    prver = "0",
    sortorder = "0"
  ),
  write_disk("/tmp/report.xls"), ## CHANGE ME
  verbose()
) -> res
I save the response to an object so you can run warn_for_status() or other such checks on it.
It should be straightforward to wrap that in a function with parameters to make it more generic.
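As a rough sketch of what such a wrapper might look like (not tested against the site): the helper names build_report_query() and download_report() are made up for illustration, the defaults simply copy the values from the request above, and the idea that vibid always equals tvd on other pages is an assumption you would need to verify.

```r
# Requires the httr package; calls are namespaced, so no library() call is needed.

# Hypothetical helper: assemble the query parameters for one report.
# Defaults mirror the request above; vibid = tvd is an assumption.
build_report_query <- function(vrn, tvd, vibid = tvd, root = "774001001",
                               region = "77", sub_region = "77", type = "427") {
  list(
    region = region, sub_region = sub_region, root = root, global = "null",
    vrn = vrn, tvd = tvd, type = type, vibid = vibid, condition = "",
    action = "show", version = "null", prver = "0", sortorder = "0"
  )
}

# Hypothetical wrapper: download one report to `dest`, stopping on HTTP errors.
download_report <- function(vrn, tvd, dest, ...) {
  res <- httr::GET(
    url = "http://www.moscow_city.vybory.izbirkom.ru/servlet/ExcelReportVersion",
    query = build_report_query(vrn, tvd, ...),
    httr::write_disk(dest, overwrite = TRUE)
  )
  httr::stop_for_status(res)
  invisible(res)
}
```

You could then call something like download_report("4774001137457", "4774001137463", "/tmp/report_4774001137463.xls") for each page. Naming each file after its tvd value means no two downloads share a file name, so nothing like "report (100).xls" ever gets created.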
Upvotes: 1