How to find an element's text in Capybara while ignoring inner element text

Question

In the HTML example below I am trying to grab the $16.95 text in the outer span.price element and exclude the text from the inner span.sale one.

<span class="price">
  <span class="sale">
    <span class="sale-text">"Low price!"</span>
    "$16.95"
  </span>
</span>

If I were using Nokogiri, this wouldn't be too difficult:

price = doc.css('.sale')
price.search('.sale-text').remove
price.text

However, Capybara navigates the DOM rather than removing nodes. I knew that something like price.text would grab the text from all sub-elements, so I tried XPath to be more specific:

p.find(:xpath, "//span[@class='sale']", :match => :first).text

However, this grabs the text from the inner element as well.

Finally, I tried looping through all the spans to see if I could separate the results, but I get an Ambiguous error:

p.find(:css, 'span').each { |result| puts result.text }
Capybara::Ambiguous: Ambiguous match, found 2 elements matching css "span"

I am using Capybara/Selenium as this is for a web scraping project with authentication complications.

Thomas Walpole · Accepted Answer

There is no single-statement way to do this with Capybara, since the DOM's concept of innerText doesn't really support what you want to do. Assuming p is the '.price' element, here are two ways you could get what you want:

Since you know the node you want to ignore, just subtract its text from the whole text:
```
p.find('span.sale').text.sub(p.find('span.sale-text').text, '')
```
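If there are several inner elements to ignore, the same subtraction can be wrapped in a small helper. This is only a sketch: `text_without` is a made-up name, and the plain strings stand in for whatever Capybara's `.text` returns on your page.

```ruby
# Hypothetical helper (not part of Capybara): remove each ignored child's
# text from the parent's text, then trim the leftover whitespace.
def text_without(full_text, *ignored_texts)
  remaining = ignored_texts.reduce(full_text) do |text, ignored|
    # With a String pattern, String#sub matches literally and removes
    # only the first occurrence.
    text.sub(ignored, '')
  end
  remaining.strip
end

# With Capybara this might be called as:
#   text_without(p.find('span.sale').text, p.find('span.sale-text').text)
# Plain strings are used here so the sketch runs without a browser.
puts text_without('Low price! $16.95', 'Low price!')
```

Note that this removes the first occurrence of the ignored text, so it can pick the wrong spot if the same words also appear earlier in the outer text.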
Grab the innerHTML string and parse it with Nokogiri or Capybara.string (which just wraps Nokogiri elements in the Capybara DSL):
```
doc = Capybara.string(p['innerHTML'])
nokogiri_fragment = doc.native
# do whatever you want with the Nokogiri fragment
```
