Reputation: 5566
I have the following XML tree and need to extract the first name and surname only for the contrib tags that have a child xref node with ref-type "corresp".
<pmc-articleset>
<article>
<front>
<article-meta>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Wereszczynski</surname>
<given-names>Jeff</given-names>
</name>
<xref rid="aff1" ref-type="aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Andricioaei</surname>
<given-names>Ioan</given-names>
</name>
<xref rid="aff1" ref-type="aff"/>
<xref ref-type="corresp" rid="cor1">*</xref>
</contrib>
</contrib-group>
</article-meta>
</front>
</article>
</pmc-articleset>
I saw "Getting the siblings of a node with Nokogiri" which points out the CSS sibling selectors that can be used in Nokogiri, but, following the example given, my code gives siblings indiscriminately.
require "net/http"
require "nokogiri"
url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?id=PMC1637560&db=pmc"
xml_data = Net::HTTP.get_response(URI.parse(url)).body
parsedoc = Nokogiri::XML.parse(xml_data)
corrdetails = parsedoc.at('contrib:has(xref[text()="*"])')
puts surname = corrdetails.xpath( "//surname" ).text
puts givennames = corrdetails.xpath("//given-names").text
=> WereszczynskiAndricioaei
=> JeffIoan
I only want the sibling nodes when the contrib contains <xref ref-type="corresp">*</xref>, that is, an output of:
=> Andricioaei
=> Ioan
I've currently implemented this without referring to ref-type, selecting the asterisk within the xref tag instead (either is appropriate).
Upvotes: 0
Views: 1422
Reputation: 46836
The problem is actually with your XPath for getting the surname and given name, i.e., the XPath is incorrect for the lines:
puts surname = corrdetails.xpath( "//surname" ).text
puts givennames = corrdetails.xpath("//given-names").text
Starting the XPath with // means to look for the node anywhere in the document, no matter which node you call xpath on. You only want to look within the corrdetails node, which means the XPath needs to start with a dot, e.g., .// (the dot stands for the current node).
Change the two lines to:
puts surname = corrdetails.xpath( ".//surname" ).text
puts givennames = corrdetails.xpath(".//given-names").text
Upvotes: 2