89_Simple

Reputation: 3805

Download data from URL

This does not really align with Stack Overflow policy, since I am not showing what I have tried, but I have no idea how to even start on this given my lack of technical expertise. I hope someone can post a solution or at least point me in the right direction.

I want to download all the data from this website:

http://aps.dac.gov.in/APY/Public_Report1.aspx

I need to download all the data, i.e. all seasons × all years × all states × all crops. The long (and frustrating!) way is to tick every box by hand and press download.

However, I was wondering whether anyone has a programmatic way to download this data. I would prefer to do this in R, since that's the language I understand, but feel free to suggest other programming languages.

Upvotes: 0

Views: 607

Answers (1)

johnrroby

Reputation: 88

Here's a solution using RSelenium to launch a browser and drive it from R.

library(RSelenium)
driver <- rsDriver() #starts a Selenium server and opens a browser
remDr <- driver[["client"]] #the client object used to control that browser
remDr$navigate("http://aps.dac.gov.in/APY/Public_Report1.aspx") #navigate to your page

You need to tell the browser which boxes to tick. Use SelectorGadget to find the unique ID of each box, look each one up with findElement, and store the result in webElem. Then use the webElem methods to make the page do things.

webElem <- remDr$findElement(using = 'id', value = "TreeViewSeasonn0CheckBox")
webElem$highlightElement() #quick flash as a check we're in the right box
webElem$clickElement() #performs the click
#now do the same for each other box

webElem <- remDr$findElement(using = 'id', value = "TreeView1n0CheckBox") 
webElem$highlightElement()
webElem$clickElement()

webElem <- remDr$findElement(using = 'id', value = "TreeView2n0CheckBox") 
webElem$highlightElement()
webElem$clickElement()

webElem <- remDr$findElement(using = 'id', value = "TreeViewYearn0CheckBox")
webElem$highlightElement()
webElem$clickElement()
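
The four blocks above repeat the same find-and-click steps, so they could be folded into a loop. Here is a minimal sketch: click_by_id is a hypothetical helper name of my own (it is not part of RSelenium), and it assumes remDr is the already-connected client from earlier and that the four IDs are the ones used in this answer.

```r
# Hypothetical helper (not an RSelenium function): find each element by id
# and click it, using an already-connected client object `remDr`.
click_by_id <- function(remDr, ids, highlight = FALSE) {
  for (id in ids) {
    webElem <- remDr$findElement(using = "id", value = id)
    if (highlight) webElem$highlightElement()  # optional visual check
    webElem$clickElement()                     # performs the click
  }
  invisible(ids)
}

# The four top-level checkbox ids used in this answer
box_ids <- c("TreeViewSeasonn0CheckBox", "TreeView1n0CheckBox",
             "TreeView2n0CheckBox", "TreeViewYearn0CheckBox")

# With a live session you would then call:
# click_by_id(remDr, box_ids)
```

Keeping the IDs in one vector also makes it easier to swap in a different set of boxes if you decide to download the data in smaller pieces.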

Now choose the report format you want and click the download button. This example assumes Excel.

webElem <- remDr$findElement(using = 'id', value = "DdlFormat") #the format dropdown
webElem$sendKeysToElement(list("Excel", key = "enter")) #type the option name, then press Enter
webElem <- remDr$findElement(using = 'id', value = "Button1") #the download button
webElem$clickElement() #does the click

For what it's worth, the site timed out when I tried to download all the data. Your results may vary.

Upvotes: 3
