How to capture a specific value located between the h2 nodes of an HTML page?

Question

I am using the rvest package in R to capture a specific piece of text on a webpage. The text I am interested in capturing is "Hotel ABC - An All-Inclusive Resort".

Its location within the HTML code of the webpage is shown below:


<h2 id="hp_hotel_name">
<span>Hotel</span>
Hotel ABC - An All-Inclusive Resort
</h2>

How can I use rvest to capture that specific text?

QHarr · Accepted Answer

You need to select the text node that immediately follows the span, anchoring the XPath on the id of the parent h2.

library(rvest)

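# Minimal markup matching the XPath below (any other attributes omitted)
html <- '
<h2 id="hp_hotel_name">
<span>Hotel</span>
Hotel ABC - An All-Inclusive Resort
</h2>
'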

# Find the element by id, step into its span, then take the first text node after it
read_html(html) %>%
  html_node(xpath = "//*[@id='hp_hotel_name']/span/following-sibling::text()[1]") %>%
  html_text(trim = TRUE)
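
As a variation on the same idea, the sketch below selects the h2's own non-blank text node directly instead of navigating from the span. This is not from the original answer; the markup is a hypothetical minimal version that assumes only the id and the child span implied by the XPath above, and `hotel_name` is just an illustrative variable name.

```r
library(rvest)

# Hypothetical minimal markup: an h2 with the id used above and one child span
html <- '
<h2 id="hp_hotel_name">
<span>Hotel</span>
Hotel ABC - An All-Inclusive Resort
</h2>
'

hotel_name <- read_html(html) %>%
  # text() matches only direct text children of the h2, so the span's "Hotel" is skipped;
  # [normalize-space()] drops text nodes that contain nothing but whitespace
  html_node(xpath = "//*[@id='hp_hotel_name']/text()[normalize-space()]") %>%
  html_text(trim = TRUE)

hotel_name
```

This version may be a little more tolerant if the badge span is absent on some pages, since it does not depend on the span being there. If the h2 could contain several non-blank text nodes, only the first one is returned.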
