How to get URL to downloaded source?

I want to download, using R code, the file that is produced by clicking the "download" button on this site: https://ivo.gascade.biz/ivo/capacities?9

Clicking "download" runs GET method https://ivo.gascade.biz/ivo/capacities?reportparameterselect_hf_0=&9-2.IFormSubmitListener-form=&netpoint=6800&flowDirection=EXIT&from=08%2F05%2F2019&to=06%2F05%2F2021&fileType=1&download=Download

but when I use:

url <- "https://ivo.gascade.biz/ivo/capacities?reportparameterselect_hf_0=&9-2.IFormSubmitListener-form=&netpoint=6800&flowDirection=EXIT&from=08%2F05%2F2019&to=06%2F05%2F2021&fileType=1&download=Download"
download.file(url, destfile = "myfile.csv")

then all I download is HTML junk. Any suggestions on how to get the file with R code?

What is also strange is that this returns "":

RCurl::getURL("https://ivo.gascade.biz/ivo/capacities?9")

Upvotes: 0

Views: 61

Answers (1)

user10191355

They expect a cookie associated with a live session. The request URLs also appear to differ between requests even when the requested data are the same, but the cookies stay constant. If you have a live session in your browser, you can find the JSESSIONID cookies and the current request URL under the request headers in the network tab. Pass them to the headers argument as a named vector:

cookie <- "JSESSIONID=5BD17…; JSESSIONID=57D9…"
download.file(url, "myfile.csv", headers = c("Cookie" = cookie))

However, this only seems to work while the page of interest is open in a browser and you've already filled out the form and clicked download, which obviously isn't very practical. I think your best bet in this case is to use a webdriver like RSelenium, which allows you to simulate browser activity programmatically.
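A minimal RSelenium sketch along those lines might look like the following. Note this assumes a local Selenium driver is available, and the element selectors (`netpoint`, `download`) are guesses based on the request parameters in the question; you'd need to inspect the page and adjust them:

```r
# Sketch: drive a real browser session so the JSESSIONID cookie and
# Wicket form state are handled for us. Selectors are hypothetical.
library(RSelenium)

rD <- rsDriver(browser = "firefox", verbose = FALSE)
remDr <- rD$client
remDr$navigate("https://ivo.gascade.biz/ivo/capacities?9")

# Fill out the form fields (names guessed from the GET parameters):
# remDr$findElement(using = "name", value = "netpoint")$sendKeysToElement(list("6800"))

# Click the download button (selector also guessed):
btn <- remDr$findElement(using = "name", value = "download")
btn$clickElement()

# The browser's default download directory should now contain the file.
remDr$close()
rD$server$stop()
```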

There might also be a way to create a more persistent connection using httr and adding some more header parameters (e.g. Connection: keep-alive). But I suspect RSelenium might be the better choice here.
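If you want to try the httr route, the idea would be to reuse one handle so the session cookie set by the first request is sent with the download request. This is only a sketch under that assumption; the site's Wicket form state may still reject it:

```r
# Sketch: httr keeps cookies per handle, so hitting the landing page
# first should store JSESSIONID, which the second request then reuses.
library(httr)

h <- handle("https://ivo.gascade.biz")

# First request establishes the session cookie in the handle's cookie jar
GET(handle = h, path = "/ivo/capacities")

# Second request reuses the cookie and writes the response body to disk
resp <- GET(
  handle = h,
  path = "/ivo/capacities",
  query = list(
    netpoint      = "6800",
    flowDirection = "EXIT",
    from          = "08/05/2019",
    to            = "06/05/2021",
    fileType      = "1",
    download      = "Download"
  ),
  write_disk("myfile.csv", overwrite = TRUE)
)
stop_for_status(resp)
```

If the response is still HTML rather than CSV, the server is likely rejecting the request for missing form state, which is the case where a webdriver wins.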

Upvotes: 1
