Reputation: 686
I am trying to parse some XML content, in this case with some products:
<PRODUCTS>
<PRODUCT>
<NAME><![CDATA[Some name]]></NAME>
<CATEGORIES>
<CATEGORY>
<NAME><![CDATA[Category 1]]></NAME>
</CATEGORY>
<CATEGORY>
<NAME><![CDATA[Category 2]]></NAME>
</CATEGORY>
</CATEGORIES>
</PRODUCT>
<PRODUCT>
<NAME><![CDATA[Some other name]]></NAME>
<CATEGORIES>
<CATEGORY>
<NAME><![CDATA[Category 1]]></NAME>
</CATEGORY>
<CATEGORY>
<NAME><![CDATA[Category 2]]></NAME>
</CATEGORY>
</CATEGORIES>
</PRODUCT>
</PRODUCTS>
If I load the above into a doc variable and select the NAME in each product:
doc.css("PRODUCT").each do |product|
  puts product.css("NAME").size # => 3
end
I also get the nested NAME elements of each product.
How do I get only the NAME that is not nested? I know that product.at_css("NAME") returns only the first element, but my question is not how to get the first element; it is how to get only the elements that are not nested.
Upvotes: 2
Views: 77
Reputation: 66867
You can use > to select only NAME elements that are direct children of PRODUCT:
doc.css("PRODUCT").each do |product|
  puts product.css("> NAME")
end
This will output the following:
<NAME><![CDATA[Some name]]></NAME>
<NAME><![CDATA[Some other name]]></NAME>
Upvotes: 2
Reputation: 1407
Using XPath:
doc.xpath("PRODUCTS/PRODUCT").each do |product|
  puts product.xpath("NAME").first
end
Here .xpath("NAME") is evaluated relative to each product, so it returns only direct children, not deeper descendants. The same effect can be achieved with the CSS child selector:
doc.css("PRODUCT").each do |product|
  puts product.css("> NAME").first
end
Upvotes: 0
Reputation: 111
You can use the following:
doc.css("PRODUCT").each do |product|
  puts product.css("NAME").first
end
Upvotes: 0