Vis Bandla

Reputation: 39

Navigate to a link using html_session in R

I am trying to follow a link on a website using rvest. Every link works except one. Here is the output:

> mcsession<-html_session("http://www.moneycontrol.com/financials/tataconsultancyservices/balance-sheetVI/TCS#TCS")

> mcsession<-mcsession %>% follow_link("Previous Years »")
Error: No links have text 'Previous Years »'
In addition: Warning message:
In grepl(i, text, fixed = TRUE) : input string 316 is invalid UTF-8

> mcsession<-mcsession %>% follow_link("Balance Sheet")
Navigating to /financials/tataconsultancyservices/balance-sheetVI/TCS#TCS
Warning message:
In grepl(i, text, fixed = TRUE) : input string 316 is invalid UTF-8

Any idea why this happens?

Upvotes: 0

Views: 560

Answers (1)

Andrew Gustar

Reputation: 18425

It is not a normal link; it is JavaScript. I don't know of a way to do this with rvest, but you could use RSelenium, which automates a real browser window. It is slower than scraping directly, but you can automate just about anything that you can do by hand. This works for me (using Chrome on Windows 10):

library(RSelenium)
rD <- rsDriver(port=4444L,browser="chrome")
remDr <- rD$client

remDr$navigate("http://www.moneycontrol.com/financials/tataconsultancyservices/balance-sheetVI/TCS#TCS")

firstpage <- remDr$getPageSource() #you can use this to get the first table

#(1)
webElem1 <- remDr$findElement(using = 'partial link text', value = "Previous Years")
webElem1$clickElement()

nextpage <- remDr$getPageSource() #you can use this to get the next page for previous years

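# Repeat from #(1) to go back another page, and so on.

remDr$closeall()  # close the browser session(s) when you have finished
rD$server$stop()  # also stop the Selenium server that rsDriver() started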
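
Once you have the page source, you can hand it back to rvest for parsing. This is only a minimal sketch, assuming the table you want is the first <table> on the page. The HTML string below is made up and merely stands in for what remDr$getPageSource() returns (a list whose first element is the HTML as a string), so the real page will need its own selector.

```r
library(rvest)

# Stand-in for remDr$getPageSource(): a list whose first element is the HTML.
# This markup is hypothetical; the real page will look different.
page_source <- list(
  "<html><body><table>
     <tr><th>Item</th><th>Value</th></tr>
     <tr><td>Row A</td><td>1</td></tr>
     <tr><td>Row B</td><td>2</td></tr>
   </table></body></html>"
)

# Parse the HTML string, pick the first <table>, and convert it to a data frame.
doc <- read_html(page_source[[1]])
first_table <- html_table(html_node(doc, "table"))
print(first_table)
```

With the real session you would pass firstpage[[1]] or nextpage[[1]] to read_html() instead of the stand-in string.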

Upvotes: 2
