Sushanta Deb

Reputation: 539

readHTMLTable function not able to extract the html table

I would like to extract a table (table 4) from the URL "http://www.moneycontrol.com/financials/oilnaturalgascorporation/profit-loss/IP02". The catch is that I have to use RSelenium.

Now here is the code I am using:

remDr$navigate(URL)
doc<-htmlParse(remDr$getPageSource()[[1]])
x<-readHTMLTable(doc)

The above code does not extract table 4. However, when I do not use RSelenium, as below, I can extract the table easily:

download.file(URL,'quote.html')
doc<-htmlParse('quote.html')
x<-readHTMLTable(doc,which=5)

Please let me know the solution, as I have been stuck on this part for a month now. I appreciate your suggestions.

Upvotes: 1

Views: 867

Answers (3)

David

Reputation: 1

I'm struggling with more or less the same issue. I'm trying to come up with a solution that doesn't use htmlParse, for example (after navigating to the page): table <- remDr$findElements(using = "tag name", value = "table")

You might have to use a CSS or XPath selector on yours. I'm still working on the next step.
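One possible next step, given only as a rough sketch: take each element returned by findElements, read its outerHTML attribute, and pass that string to the XML package. The helper name `tables_from_elements` is hypothetical, and `remDr` is assumed to be an open RSelenium remoteDriver that has already navigated to the page.

```r
library(XML)

# Hypothetical helper, not part of RSelenium or XML.
# Assumes `remDr` is an open remoteDriver already on the target page.
tables_from_elements <- function(remDr) {
  # Collect every <table> element the browser currently sees
  elems <- remDr$findElements(using = "tag name", value = "table")
  lapply(elems, function(el) {
    # getElementAttribute returns a list; the first entry is the HTML string
    html <- el$getElementAttribute("outerHTML")[[1]]
    # Parse the fragment and read its first table into a data frame
    readHTMLTable(htmlParse(html, asText = TRUE), which = 1)
  })
}
```

Calling `tables_from_elements(remDr)` should give a list of data frames, one per table element, which can then be inspected to find the one you need.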

I finally got a table downloaded into a nice little data frame. It seems easy once you have it figured out. Using the help page from the XML package:

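library(RSelenium)
library(XML)
u <- 'http://www.w3schools.com/html/html_tables.asp'
# Parse the page directly from the URL
doc <- htmlParse(u)
# Select every <table> node in the document
tableNodes <- getNodeSet(doc, "//table")
# Convert the first table node into a data frame
tb <- readHTMLTable(tableNodes[[1]])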

Upvotes: 0

A Gore

Reputation: 1910

I think it works fine. The table you were able to get using download.file can also be retrieved through RSelenium with the following code:

readHTMLTable(htmlParse(remDr$getPageSource()[[1]], asText=TRUE), header=TRUE, which=6)

Hope that helps!
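Because the index passed to which can differ between the downloaded file and the browser-rendered source (5 in the question, 6 here), it may help to list all tables first and check their sizes before picking one. The snippet below is a minimal, self-contained sketch: the inline HTML is made up for illustration and stands in for the string returned by remDr$getPageSource()[[1]].

```r
library(XML)

# Made-up HTML standing in for the page source returned by the browser
html <- "<html><body>
<table><tr><th>menu</th></tr><tr><td>Home</td></tr></table>
<table><tr><th>item</th><th>value</th></tr>
<tr><td>Sales</td><td>100</td></tr>
<tr><td>Expenses</td><td>60</td></tr></table>
</body></html>"

# asText = TRUE tells htmlParse the argument is HTML content, not a file name
doc <- htmlParse(html, asText = TRUE)

# Read every table, then look at the row counts to see which index is wanted
tables <- readHTMLTable(doc, header = TRUE)
table_rows <- vapply(tables, function(tb) if (is.null(tb)) 0L else nrow(tb), integer(1))
print(table_rows)

# Once the position is known, read just that table
wanted <- readHTMLTable(doc, header = TRUE, which = 2)
```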

Upvotes: 1

Sushanta Deb

Reputation: 539

I found the solution. In my case, I had to first navigate to the inner frame (boxBg1) before I could extract the outer HTML and then use the readHTMLTable function. It works fine now. I will post again if I run into a similar issue in the future.
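For anyone hitting the same problem, the approach might look roughly like the sketch below. It assumes that `remDr` is an open RSelenium remoteDriver already on the page and that boxBg1 can be passed to switchToFrame as the frame's name or id. If that does not work with your driver, passing the frame's web element (found with findElement) is another option. The helper name `read_frame_tables` is hypothetical.

```r
library(XML)

# Hypothetical helper. Assumes `remDr` is an open remoteDriver and that
# `frame_id` is a frame name or id that switchToFrame accepts.
read_frame_tables <- function(remDr, frame_id = "boxBg1") {
  # Move the driver's context into the inner frame
  remDr$switchToFrame(frame_id)
  # Switch back to the page's default content when the function exits
  on.exit(remDr$switchToFrame(NULL), add = TRUE)
  # Page source as seen from the current (frame) context
  page <- remDr$getPageSource()[[1]]
  # Parse the HTML string and return all tables as a list of data frames
  readHTMLTable(htmlParse(page, asText = TRUE))
}
```

The result is a list, so the table of interest still has to be picked by position, for example `read_frame_tables(remDr)[[1]]`.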

Upvotes: 0
