Reputation: 513
I am trying to scrape this website (the URL is in the code below) using RSelenium. I have successfully scraped most of the content on the page, but I am stuck on the "facility visits" and "facility complaints" sections. Both of those buttons show a JavaScript href when I inspect them with developer tools, so I have been using PhantomJS with RSelenium.
I can successfully navigate to the page via PhantomJS, but whenever I try to extract the text from the fields using $getElementText, I get the following error:
Selenium message:{"errorMessage":"Element does not exist in cache","request":{"headers":{"Accept":"application/json, text/xml, application/xml, */*","Accept-Encoding":"gzip, deflate","Host":"localhost:4444","User-Agent":"libcurl/7.53.1 r-curl/2.6 httr/1.2.1"},"httpVersion":"1.1","method":"GET","url":"/attribute/id","urlParsed":{"anchor":"","query":"","file":"id","directory":"/attribute/","path":"/attribute/id","relative":"/attribute/id","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/attribute/id","queryKey":{},"chunks":["attribute","id"]},"urlOriginal":"/session/c0f30500-55d0-11e7-96dd-3b147ee40d88/element/:wdc:1497974074536/attribute/id"}}
Error: Summary: StaleElementReference Detail: An element command failed because the referenced element is no longer attached to the DOM. class: org.openqa.selenium.StaleElementReferenceException Further Details: run errorDetails method
When I use $getCurrentUrl() and $screenshot(display = TRUE), they show the correct URL and the correctly rendered page.
I know it has something to do with how elements are attached to the DOM, but I am not sure how to resolve the issue in R.
Code below:
library(RSelenium)

url <- "https://dhs.arkansas.gov/dccece/cclas/FacilityInformation.aspx?FacilityNumber=23516"
rd <- remoteDriver(browserName = "phantomjs")
rd$open()
rd$navigate(url)
# Locate and click the "facility visits" link
webElem <- rd$findElement(using = "xpath", value = '//*[@id="ctl00_ContentPlaceHolder1_lbtnVisits"]')
webElem$clickElement()
# webElem is reused here after the click
webElem$findElements("css", "#aspnetForm > div.page > div.main")
webElem$getElementAttribute("id")  # fails with StaleElementReference
Upvotes: 1
Views: 1330
Reputation: 6551
You are probably getting a StaleElementReference as a result of clicking webElem.
The webElem element is likely modified in the DOM after the click, so if you try to "use" webElem again, it is no longer attached to the DOM and is considered "stale".
An easy fix is to simply re-locate webElem after it is clicked:
webElem <- rd$findElement(...)
webElem$clickElement()
webElem <- rd$findElement(...)  # re-locate webElem
webElem$findElements('css', "#aspnetForm > div.page > div.main")
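A slightly fuller sketch of the same idea, in case it helps. It looks up the target element from the driver (rd) after the click instead of reusing the clicked element, and retries if the lookup still comes back stale. The helper names with_stale_retry and get_main_text are made up for illustration and are not part of RSelenium. The retry is only there on the assumption that the page may still be updating right after the click. The element id and CSS selector are the ones from the question.

```r
# Hypothetical helper (not part of RSelenium): run action() and, if it fails
# with a stale-element error, wait briefly and try again, up to `tries` times.
with_stale_retry <- function(action, tries = 3, wait = 0.5) {
  for (i in seq_len(tries)) {
    result <- tryCatch(action(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    # RSelenium reports this case as "Summary: StaleElementReference ..."
    stale <- grepl("StaleElementReference", conditionMessage(result), fixed = TRUE)
    # Re-raise anything that is not a stale reference, or the last failure
    if (!stale || i == tries) stop(result)
    Sys.sleep(wait)
  }
}

# Hypothetical wrapper. `rd` is assumed to be an open RSelenium remoteDriver
# that has already navigated to the facility page.
get_main_text <- function(rd) {
  link <- rd$findElement(using = "id", value = "ctl00_ContentPlaceHolder1_lbtnVisits")
  link$clickElement()
  # Do not reuse `link` after the click; search from the driver instead.
  with_stale_retry(function() {
    main <- rd$findElement(using = "css selector",
                           value = "#aspnetForm > div.page > div.main")
    main$getElementText()[[1]]
  })
}

# Usage, assuming a Selenium/PhantomJS server is running as in the question:
# rd <- RSelenium::remoteDriver(browserName = "phantomjs")
# rd$open()
# rd$navigate(url)
# get_main_text(rd)
```

Errors other than a stale reference are re-raised immediately, so a wrong selector still fails right away rather than being retried.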
Upvotes: 2