Apoorv

Reputation: 179

Blank value captured while scraping with RSelenium

I am trying to scrape a textbox value from the URL in the code below. I picked the CSS selector using SelectorGadget, but it does not capture the content of the text box. I tested several other CSS selectors too, but the textbox value is still not captured. The text box is "construction year". Please help. Below is the code for reference.

library(RSelenium)
library(rvest)

# remDr is assumed to be an already-open RSelenium remoteDriver session
url = "https://www.ncspo.com/FIS/dbBldgAsset_public.aspx?BldgAssetID=8848"
values = list()
remDr$navigate(url)
page_source <- remDr$getPageSource()
a = read_html(page_source[[1]])
html_main_node = html_nodes(a, "#ctl00_mainContentPlaceholder_txtConstructionYear_iu")

values = html_text(html_main_node)
values

Thanks in advance

Upvotes: 0

Views: 70

Answers (2)

Bharath

Reputation: 1618

The rvest answer also works, but if you want to use only RSelenium, here is the code:

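library(RSelenium)
# checkForServer() and startServer() come from older RSelenium releases;
# newer releases may need the Selenium server started another way
checkForServer()
startServer()
Sys.sleep(5)  # give the server time to start
re <- remoteDriver()
re$open()
re$navigate("https://www.ncspo.com/FIS/dbBldgAsset_public.aspx?BldgAssetID=8848")
re$findElement(using = "css selector", "#ctl00_mainContentPlaceholder_txtConstructionYear_iu")$clickElement()
# read the input's "value" attribute rather than its text
text <- unlist(re$findElement(using = "css selector", "#ctl00_mainContentPlaceholder_txtConstructionYear_iu")$getElementAttribute("value"))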

This works

Upvotes: 0

hrbrmstr

Reputation: 78792

Why RSelenium? It scrapes fine with rvest (though it is a horrible SharePoint site, which may cause problems down the line with maintaining the proper view state cookies).

library(rvest)

pg <- html_session("https://www.ncspo.com/FIS/dbBldgAsset_public.aspx?BldgAssetID=8848")

html_attr(html_nodes(pg, "input#ctl00_mainContentPlaceholder_txtConstructionYear_iu"), "value")

## [1] 1987

You should be grabbing the value attribute rather than the node text. This should work in your Selenium code, too.
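To see the difference between node text and the value attribute without hitting the live site, here is a minimal offline sketch. The HTML string and the `#year` id are made up for illustration and only mimic the shape of the real input element; the rvest calls are the same ones used above.

```r
library(rvest)

# Hypothetical stand-in for the real page: a text input whose
# content lives in its "value" attribute, not in a text node.
page <- read_html('<html><body><input id="year" type="text" value="1987"></body></html>')

node <- html_nodes(page, "input#year")

# An <input> element has no text content, so this comes back empty.
node_text <- html_text(node)

# The textbox content is stored in the "value" attribute.
node_value <- html_attr(node, "value")

print(node_text)
print(node_value)
```

The same `html_attr(..., "value")` call should apply to a page parsed from `remDr$getPageSource()`, provided the attribute is present in the source that Selenium returns.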

Upvotes: 1
