Reputation: 4453
I am using the rvest package in R to capture a specific piece of text located on a webpage.
The text I am interested in capturing is "Hotel ABC - An All-Inclusive Resort".
Its location within the HTML code of the webpage is shown below:
<h2 class="hp__hotel-name" id="hp_hotel_name">
<span class="hp__hotel-type-badge">Hotel</span>
Hotel ABC - An All-Inclusive Resort
</h2>
How can I use rvest to capture that specific text?
Upvotes: 1
Views: 247
Reputation: 84465
You need the text node that comes right after the span, i.e. its first following sibling, with the XPath anchored on the parent h2's id.
library(rvest)
html <- '<h2 class="hp__hotel-name" id="hp_hotel_name">
<span class="hp__hotel-type-badge">Hotel</span>
Hotel ABC - An All-Inclusive Resort
</h2>'
read_html(html) %>%
html_node(xpath = "//*[@id='hp_hotel_name']/span/following-sibling::text()[1]") %>%
html_text(trim = TRUE)
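As a possible alternative, here is a minimal sketch that selects the h2's own text nodes directly instead of navigating from the span. It assumes the real page has the same structure as the snippet in the question; the variable names `page` and `hotel_name` are only illustrative. The `normalize-space()` predicate is there to skip any whitespace-only text nodes around the span.

```r
library(rvest)

# Same sample markup as in the question
html <- '<h2 class="hp__hotel-name" id="hp_hotel_name">
<span class="hp__hotel-type-badge">Hotel</span>
Hotel ABC - An All-Inclusive Resort
</h2>'

page <- read_html(html)

# Take only the direct text children of the h2 (the span's text is not a
# direct child), keeping those that are not blank after whitespace normalization
hotel_name <- page %>%
  html_nodes(xpath = "//h2[@id='hp_hotel_name']/text()[normalize-space()]") %>%
  html_text(trim = TRUE)

print(hotel_name)
```

This variant does not depend on the badge span being present, so it may hold up better if some pages omit it. If the h2 ever contains more than one non-blank text node, `hotel_name` will be a character vector with several elements rather than a single string.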
Upvotes: 2