Reputation: 331
I am trying to scrape this webpage using the following code.
library(XML)
url <- "http://www.gallop.co.za/"
doc <- htmlParse(url)
# try to pull the link targets out of the element with id "Racecards"
lat <- xpathSApply(doc, path = "//p[@id='Racecards']", fun = xmlGetAttr, name = 'href')
I looked at the webpage, and the table I want to scrape is the racecard table, primarily to get the links to where the racecard data is.
I used SelectorGadget, which returns the XPath as:
//*[(@id = "Racecards")]
However, when I use the R code, it returns an empty list. It feels like I'm getting the XPath wrong somehow. What is the correct way to return the table, and also the links within it?
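For a page where the table really is in the static HTML, I would have expected something along these lines to give me both the table and the links (the //a part, and the assumption that id "Racecards" sits on the table's container, are guesses on my part):
library(XML)
doc <- htmlParse("http://www.gallop.co.za/")
# all tables on the page as data frames
tables <- readHTMLTable(doc)
# href attributes of any anchors inside the Racecards element
links <- xpathSApply(doc, "//*[@id='Racecards']//a", xmlGetAttr, 'href')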
Upvotes: 0
Views: 598
Reputation: 416
It seems that the data are delivered as JSON and inserted into the HTML with JavaScript, so you can't get them from the static HTML. You can get them directly from the JSON:
library(RCurl)
library(jsonlite)
# fetch the JSON feed that the page loads and parse it into R objects
p <- getURL("http://www.gallop.co.za/cache/horses.json")
fromJSON(p)
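I don't know the exact layout of horses.json, so before picking out the link fields it is worth inspecting the parsed result first, for example:
library(RCurl)
library(jsonlite)
p <- getURL("http://www.gallop.co.za/cache/horses.json")
horses <- fromJSON(p)
# show the top levels of the structure to find the fields holding the racecard links
str(horses, max.level = 2)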
Upvotes: 1