Reputation: 2142
I have an XML document like the following:
<doc>
<header>
<group>
<note>group note</note>
</group>
<note>header note</note>
</header>
</doc>
I want to retrieve the note elements that fall under header and not any note elements that fall under group.
I thought this should work but it also picks up the note under group:
doc.css('header note')
What is the syntax to only grab the note element that is the direct child of the header?
Upvotes: 0
Views: 247
Reputation: 160631
The simplest thing is to let Nokogiri find all header note tags, then only use the last one:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<doc>
<header>
<group>
<note>group note</note>
<group>
<note>header note</note>
</header>
</doc>
EOT
doc.css('header note').last.text # => "header note"
Remember, css, like its XPath counterpart xpath and the more generic search, returns a NodeSet. A NodeSet is like an Array in that you can slice it or call first or last on it.
Note though, you could just as easily use:
doc.css('note').last.text # => "header note"
Notice though, your XML is malformed. The <group> tag isn't closed. Nokogiri is doing fixups to the XML, which can give you odd results. Check for that situation by looking at doc.errors:
# => [#<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: group line 5 and header>,
# #<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: group line 3 and doc>,
# #<Nokogiri::XML::SyntaxError: Premature end of data in tag header line 2>,
# #<Nokogiri::XML::SyntaxError: Premature end of data in tag doc line 1>]
Upvotes: 0
Reputation: 46846
You can use the > combinator in CSS selectors to find direct child elements. This is in contrast to using a space, which finds descendant elements at any depth.
In your case:
puts doc.css('header > note')
#=> "<note>header note</note>"
Upvotes: 1