Reputation: 63
I've used the following R script:
url="http://stats.espncricinfo.com/ci/engine/player/253802.html?class=3;orderby=default;template=results;type=batting"
check=readHTMLTable(url,header = T)
check$"Career summary"
check<-check$"Career summary"
I'm only able to scrape the first 11 observations.
Can anyone suggest why I'm unable to scrape the entire table?
Upvotes: 0
Views: 603
Reputation: 13680
As @Wietze314 said, there is more than one table on that page. You can get a list of all the tables I suppose you are interested in with:
library(XML)
url="http://stats.espncricinfo.com/ci/engine/player/253802.html?class=3;orderby=default;template=results;type=batting"
check=htmlParse(url)
tableNodes <- getNodeSet(check, '//tbody')
tbList <- lapply(tableNodes, readHTMLTable)
tbList contains 22 data.frames for you to work with.
Upvotes: 0
Reputation: 6020
To get the content of all tables on the page:
library(XML)
url="http://stats.espncricinfo.com/ci/engine/player/253802.html?class=3;orderby=default;template=results;type=batting"
content <- htmlParse(url)
tbody <- xpathSApply(content, "//tbody")
lapply(tbody, function(x) readHTMLTable(x, header = TRUE))
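If what you want in the end is a single data frame rather than a list of pieces, here is a minimal sketch of one way to stitch the rows back together. It assumes the truncation comes from the table being split into several tbody sections, which I have not verified for the ESPNcricinfo page. The inline HTML, the column names and the values are made up for illustration, and it swaps readHTMLTable for plain XPath extraction of the cells so that it runs without a network connection:

```r
library(XML)

# Hypothetical stand-in for the real page: one table whose rows are
# split across two <tbody> sections (structure and values are invented).
html <- '
<table>
  <caption>Career summary</caption>
  <thead><tr><th>Grouping</th><th>Runs</th></tr></thead>
  <tbody>
    <tr><td>A</td><td>10</td></tr>
    <tr><td>B</td><td>20</td></tr>
  </tbody>
  <tbody>
    <tr><td>C</td><td>30</td></tr>
  </tbody>
</table>'

doc <- htmlParse(html, asText = TRUE)

# Column names taken from the header cells.
col_names <- xpathSApply(doc, "//table/thead/tr/th", xmlValue)

# Every body row, regardless of which <tbody> it sits in.
rows <- getNodeSet(doc, "//table/tbody/tr")

# One character vector of cell text per row.
cells <- lapply(rows, function(r) xpathSApply(r, "./td", xmlValue))

# Bind the rows into one data frame; all columns are character at this point.
career <- as.data.frame(do.call(rbind, cells), stringsAsFactors = FALSE)
names(career) <- col_names
career
```

On the real page you would replace the inline string with htmlParse(url) and narrow the XPath to the table you care about. The columns come back as character, so convert the numeric ones with as.numeric() afterwards.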
Upvotes: 1