Reputation: 35074
For example:
<p>
<b>Member Since:</b> Aug. 07, 2010<br><b>Time Played:</b> <span class="text_tooltip" title="Actual Time: 15.09:37:06">16 days</span><br><b>Last Game:</b>
<span class="text_tooltip" title="07/16/2011 23:41">1 minute ago</span>
<br><b>Wins:</b> 1,017<br><b>Losses / Quits:</b> 883 / 247<br><b>Frags / Deaths:</b> 26,955 / 42,553<br><b>Hits / Shots:</b> 690,695 / 4,229,566<br><b>Accuracy:</b> 16%<br>
</p>
I want to get 1,017. It is the text that follows the tag containing the text "Wins:".
If I used a regex, it would be [/<b>Wins:<\/b> ([^<]+)/,1], but how can I do it with Nokogiri and XPath?
Or would it be better to parse this part of the page with a regex?
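For reference, the regex attempt described above can be run as a minimal sketch. The variable names html and wins are assumptions for illustration, and the markup is a shortened copy of the snippet.

```ruby
# Shortened copy of the markup from the question (assumed to be held in a string).
html = '<b>Wins:</b> 1,017<br><b>Losses / Quits:</b> 883 / 247<br>'

# String#[] with a regex and a capture-group index returns that group,
# or nil when nothing matches.
wins = html[/<b>Wins:<\/b> ([^<]+)/, 1]

puts wins
# Strip the thousands separator before converting to an integer.
puts wins.delete(',').to_i
```

This only works while the markup keeps exactly this shape; an attribute on the tag or different whitespace after it would make the match fail.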
Upvotes: 1
Views: 222
Reputation: 27793
Here is one way:
require 'nokogiri'
doc = Nokogiri::HTML(html)
puts doc.at('b[text()="Wins:"]').next.text
Upvotes: 3
Reputation: 243449
Use:
//*[. = 'Wins:']/following-sibling::node()[1]
In case this is ambiguous (selects more than one node), stricter expressions can be used:
//*[. = 'Wins:']/following-sibling::node()[self::text()][1]
Or:
(//*[. = 'Wins:'])[1]/following-sibling::node()[1]
Or:
(//*[. = 'Wins:'])[1]/following-sibling::node()[self::text()][1]
Upvotes: 0
Reputation: 24826
I would use pure XPath like:
"//b[.='Wins:']/following::node()[1]"
I've heard thousands of times (and from gurus) "never use regex to parse XML". Can you provide some "shocking" reference demonstrating that this statement is no longer valid?
Upvotes: 1
Reputation: 56162
You can use this XPath: //*[*/text() = 'Wins:']/text()
Its result includes 1,017, along with the other text nodes of the same parent element.
About regex: RegEx match open tags except XHTML self-contained tags
Upvotes: 1