Reputation: 651
I know there are many resources and questions on this subject, but I have been trying for days and can't figure it out. I have scraped websites before, but this one is causing me problems.
The website: njaqinow.net
What I want scraped: I would like to scrape the table under the "Current Status"->"Pollutants" tab. I would like to have this scraped every time the table is updated so I can use this information inside a shiny app I am creating.
What I have tried: I have tried numerous different approaches but for simplicity I will show my most recent approach:
library("rvest")
url <- "http://www.njaqinow.net"
webpage <- read_html(url)
test <- webpage %>%
  html_node("table") %>%
  html_table()
My guess is that this is far more complicated than I originally thought, because the table appears to be inside a frame. I am not a JavaScript/HTML pro, so I am not entirely sure. Any help or guidance would be greatly appreciated!
Reputation: 17699
I can contribute a solution with RSelenium. I will show you how to navigate to that table and get its content. Formatting the table content is outside the scope of this answer, but I link to another question that covers it.
I think you have two challenges: switching into a frame and switching between frames.
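To see why the rvest attempt finds no table, it can help to list the frames in the outer document. This is only a minimal sketch: the inline HTML is a hypothetical stand-in for a frameset page, and the `src` values are invented, not taken from njaqinow.net.

```r
library(rvest)

# Hypothetical stand-in for a frameset page; the real site's markup may differ.
html <- '<html><frameset cols="20%,80%">
  <frame name="contents" src="menu.html">
  <frame name="contentsi" src="data.html">
</frameset></html>'

page <- read_html(html)

# The outer document holds only frames, so a search for <table> comes back empty.
tables <- html_nodes(page, "table")
length(tables)

# Each frame's src attribute points at the document that holds the real content.
frames <- html_nodes(page, "frame, iframe")
frame_src <- html_attr(frames, "src")
frame_src
```

If the table sat in a static frame document, reading that `src` URL directly with `read_html()` might be enough. Here the content only appears after clicking through the menu, which is why a browser driver is used.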
Switching into a frame is done with remDr$switchToFrame().
Switching between frames is discussed here: https://github.com/ropensci/RSelenium/issues/155. In your case:
remDr$switchToFrame("contents")
...
remDr$switchToFrame(NA)
remDr$switchToFrame("contentsi")
Full code would read:
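library(RSelenium)
# Assumes a running Selenium session, e.g. rD <- rsDriver(browser = "firefox"); remDr <- rD$client
remDr$navigate("http://www.njaqinow.net")
# Enter the navigation frame and open Current Status -> POLLUTANTS
frame1 <- remDr$findElement("xpath", "//frame[@id = 'contents']")
remDr$switchToFrame(frame1)
remDr$findElement("xpath", "//*[text() = 'Current Status']")$clickElement()
remDr$findElement("xpath", "//*[text() = 'POLLUTANTS']")$clickElement()
# Go back to the top-level document, then enter the frame that holds the table
remDr$switchToFrame(NA)
remDr$switchToFrame("contentsi")
table_elem <- remDr$findElement("xpath", "//table[@id = 'C1WebGrid1']")
table_elem$getElementText()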
For formatting the table, you could look at this related question: scraping table with R using RSelenium
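One common route for the formatting step is to hand the page source to rvest and let html_table() build a data frame. The following is a minimal sketch, not the linked question's method: the inline HTML stands in for `remDr$getPageSource()[[1]]` (which I assume returns the source of the frame you switched into), and the column names and values are invented.

```r
library(rvest)

# Stand-in for: page_source <- remDr$getPageSource()[[1]]
# The rows below are made-up sample data, not real readings.
page_source <- '<html><body>
<table id="C1WebGrid1">
  <tr><th>Station</th><th>Pollutant</th><th>Value</th></tr>
  <tr><td>Site A</td><td>O3</td><td>31</td></tr>
  <tr><td>Site B</td><td>PM2.5</td><td>12</td></tr>
</table></body></html>'

# Parse the source, select the table by its id, and convert it to a data frame.
pollutants <- read_html(page_source) %>%
  html_node("#C1WebGrid1") %>%
  html_table()

pollutants
```

The result is a regular data frame, so it can be passed straight to a Shiny output or cleaned further with the usual tools.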