Reputation: 12178
I am trying to extract the latitudes and longitudes for the places listed on the right side of this page. I want to create a table like the following:
Place            Latitude   Longitude
Agarda           23.12604   87.19869
Ahanda           23.13099   87.18501
.....
.....
West-Sanabandh   23.24876   86.99941
Is it possible to do this in R without opening the individual hyperlinks for "Agarda", "Ahanda", etc. one at a time?
Upvotes: 0
Views: 137
Reputation: 11
It's possible to use RCurl to fetch each page in a loop or with sapply. Combine it with a regex and/or readHTMLTable (from the XML package) to identify the hyperlinks, and it becomes a relatively straightforward function.
Within RCurl, it's also possible to use a multi handle (see getURIAsynchronous) to issue the requests in parallel, although given the number of queries involved, it might be just as easy to run them serially with a short Sys.sleep() between queries.
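As a rough sketch of the serial approach described above, something like the following might work. It assumes the RCurl and XML packages are installed. The function names (extract_links, extract_coords, scrape_places) and the arguments fetch, delay and link_pattern are made up for illustration. Since I don't know the page layout, it also assumes that the links are absolute URLs and that the first two decimal numbers with four or more fractional digits on each place page are the latitude and longitude. Adjust the regex to the real markup.

```r
# Requires the RCurl and XML packages (called via :: so nothing is attached).

# Return the link text and href of every <a href="..."> in an HTML string.
extract_links <- function(html) {
  doc <- XML::htmlParse(html, asText = TRUE)
  nodes <- XML::getNodeSet(doc, "//a[@href]")
  data.frame(
    Place = vapply(nodes, XML::xmlValue, character(1)),
    href  = vapply(nodes, XML::xmlGetAttr, character(1), name = "href"),
    stringsAsFactors = FALSE
  )
}

# ASSUMPTION: the first two numbers such as 23.12604 in the page source are
# latitude and longitude, in that order. Missing values come back as NA.
extract_coords <- function(html) {
  hits <- regmatches(html, gregexpr("-?[0-9]{1,3}\\.[0-9]{4,}", html))[[1]]
  as.numeric(hits[1:2])
}

# Fetch the index page, then each linked page in turn, pausing between requests.
# `fetch` is any function that maps a URL to an HTML string.
scrape_places <- function(index_url, fetch = RCurl::getURL, delay = 1,
                          link_pattern = ".") {
  links <- extract_links(fetch(index_url))
  # Keep only hrefs matching link_pattern (hypothetical filter for place links).
  links <- links[grepl(link_pattern, links$href), , drop = FALSE]
  if (nrow(links) == 0) stop("no matching links found")
  coords <- lapply(links$href, function(u) {
    Sys.sleep(delay)  # be polite to the server
    extract_coords(fetch(u))
  })
  coords <- do.call(rbind, coords)
  data.frame(Place = links$Place,
             Latitude = coords[, 1],
             Longitude = coords[, 2],
             stringsAsFactors = FALSE)
}
```

You would call it as scrape_places(index_url) with the real index URL. Passing the fetch function in as an argument also makes it easy to try the parsing against saved HTML before hitting the site.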
Upvotes: 1
Reputation: 943500
The data appears on different pages. You can't get that data without requesting each page.
If R supports threads, then you can request the pages in parallel rather than one at a time.
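For what it's worth, as far as I know R code doesn't run in threads in the usual sense, but the bundled parallel package can spread the requests over several worker processes, which gives a similar effect here. A minimal sketch, assuming RCurl is installed for the default fetcher. The name fetch_parallel and its arguments are hypothetical:

```r
# Fetch several URLs concurrently using worker processes (not threads).
# `fetch` is any function that maps a URL to the page content.
fetch_parallel <- function(urls, fetch = RCurl::getURL, workers = 2) {
  cl <- parallel::makeCluster(workers)
  on.exit(parallel::stopCluster(cl))  # always shut the workers down
  pages <- parallel::parLapply(cl, urls, fetch)
  names(pages) <- urls
  pages
}
```

The result is a list of page contents named by URL, in the same order as the input. For a modest number of pages, the start-up cost of the workers may outweigh the gain, and it's worth checking that the site tolerates concurrent requests.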
Upvotes: 3