Patrick Stueck

Reputation: 3

Nokogiri XML Searching

I've tried reading the Nokogiri docs and other resources, but I've hit a roadblock.

I get XML output similar to this:

<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns1:getPoliciesResponse xmlns:ns1="http://policy.api.control.r1soft.com/">
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>bcb68765-a719-4291-912d-2e6af485ea24</diskSafeID>
        <enabled>true</enabled>
        <id>cdb65427-d6f4-4a89-9f77-8763e22dc74b</id>
        <lastReplicationRunTime>2013-06-12T13:29:40.105-05:00</lastReplicationRunTime>
        <name>pstueck-passenger ondemand</name>
        <replicationScheduleFrequencyType>ON_DEMAND</replicationScheduleFrequencyType>
        <state>OK</state>
      </return>
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>e8e13555-f577-40d2-99c8-fa8a019d3b55</diskSafeID>
        <enabled>true</enabled>
        <id>7f55f8d6-92a9-4b14-bff4-631559d92259</id>
        <lastReplicationRunTime>2013-06-16T22:00:04.918-05:00</lastReplicationRunTime>
        <name>pstueck-mysql daily</name>
        <nextReplicationRunTime>2013-06-17T22:00:00-05:00</nextReplicationRunTime>
        <replicationScheduleFrequencyType>DAILY</replicationScheduleFrequencyType>
        <state>ALERT</state>
        <warnings>Policy last completed with alerts</warnings>
      </return>
    </ns1:getPoliciesResponse>
  </soap:Body>
</soap:Envelope>

The real response contains a large number of 'return' sections. I'm trying to use .search on the parsed document, but I only want it to return the entire 'return' section for a given 'name'. Does anyone have any tips?

Current Code:

client = Savon::Client.new do
  http.auth.basic "#{opts['api_username']}", "#{opts['api_password']}"
  wsdl.document = "#{opts['api_url']}/Policy?wsdl"
end

# Request the policies, then parse the raw SOAP XML with Nokogiri
response = client.request :getPolicies
doc = Nokogiri::XML(response.to_xml)
print doc

I want to get back everything in the <return> section when I search for a specific <name>. For example, I only want the information for <name>pstueck-passenger ondemand</name>, meaning the entire <return> section that contains it.

Upvotes: 0

Views: 1370

Answers (2)

the Tin Man

Reputation: 160551

Using CSS to find the node:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns1:getPoliciesResponse xmlns:ns1="http://policy.api.control.r1soft.com/">
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>e8e13555-f577-40d2-99c8-fa8a019d3b55</diskSafeID>
        <enabled>true</enabled>
        <id>7f55f8d6-92a9-4b14-bff4-631559d92259</id>
        <lastReplicationRunTime>2013-06-16T22:00:04.918-05:00</lastReplicationRunTime>
        <name>pstueck-mysql daily</name>
        <nextReplicationRunTime>2013-06-17T22:00:00-05:00</nextReplicationRunTime>
        <replicationScheduleFrequencyType>DAILY</replicationScheduleFrequencyType>
        <state>ALERT</state>
        <warnings>Policy last completed with alerts</warnings>
      </return>
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>bcb68765-a719-4291-912d-2e6af485ea24</diskSafeID>
        <enabled>true</enabled>
        <id>cdb65427-d6f4-4a89-9f77-8763e22dc74b</id>
        <lastReplicationRunTime>2013-06-12T13:29:40.105-05:00</lastReplicationRunTime>
        <name>pstueck-passenger ondemand</name>
        <replicationScheduleFrequencyType>ON_DEMAND</replicationScheduleFrequencyType>
        <state>OK</state>
      </return>
    </ns1:getPoliciesResponse>
  </soap:Body>
</soap:Envelope>
EOT

return_tag = doc.at('return name[text()="pstueck-passenger ondemand"]').parent

puts return_tag.to_xml

Which outputs:

<return>
  <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
  <description/>
  <diskSafeID>bcb68765-a719-4291-912d-2e6af485ea24</diskSafeID>
  <enabled>true</enabled>
  <id>cdb65427-d6f4-4a89-9f77-8763e22dc74b</id>
  <lastReplicationRunTime>2013-06-12T13:29:40.105-05:00</lastReplicationRunTime>
  <name>pstueck-passenger ondemand</name>
  <replicationScheduleFrequencyType>ON_DEMAND</replicationScheduleFrequencyType>
  <state>OK</state>
</return>

Nokogiri supports both XPath and CSS. I find CSS easier to read.

I used the at method to find the first matching node. To show that it really returns the first match, I swapped the order of the two <return> blocks. at is the same as search(...).first, so when you're looking for the first instance of something in a document, at is the way to go.

Nokogiri is usually smart enough to know the difference between XPath and CSS selectors, so we can use the generic at and search. If you need to force CSS or XPath parsing because the selector is ambiguous, you can use the specific css or xpath methods, or at_css or at_xpath for the first match only. They're all documented in the Nokogiri::XML::Node docs.

parent is necessary because we want the parent of the selected node, which was <name>. I just slammed it into reverse and backed up a block. That is easier to do in XPath, where we can use .. to point to the parent node.

Upvotes: 1

ezkl

Reputation: 3851

You can use XPath to identify a node with a particular value and then specify that an ancestor element is of interest by doing something like the following:

require 'nokogiri'

document = <<-XML
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns1:getPoliciesResponse xmlns:ns1="http://policy.api.control.r1soft.com/">
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>bcb68765-a719-4291-912d-2e6af485ea24</diskSafeID>
        <enabled>true</enabled>
        <id>cdb65427-d6f4-4a89-9f77-8763e22dc74b</id>
        <lastReplicationRunTime>2013-06-12T13:29:40.105-05:00</lastReplicationRunTime>
        <name>pstueck-passenger ondemand</name>
        <replicationScheduleFrequencyType>ON_DEMAND</replicationScheduleFrequencyType>
        <state>OK</state>
      </return>
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>e8e13555-f577-40d2-99c8-fa8a019d3b55</diskSafeID>
        <enabled>true</enabled>
        <id>7f55f8d6-92a9-4b14-bff4-631559d92259</id>
        <lastReplicationRunTime>2013-06-16T22:00:04.918-05:00</lastReplicationRunTime>
        <name>pstueck-mysql daily</name>
        <nextReplicationRunTime>2013-06-17T22:00:00-05:00</nextReplicationRunTime>
        <replicationScheduleFrequencyType>DAILY</replicationScheduleFrequencyType>
        <state>ALERT</state>
        <warnings>Policy last completed with alerts</warnings>
      </return>
    </ns1:getPoliciesResponse>
  </soap:Body>
</soap:Envelope>
XML

doc = Nokogiri::XML(document)
ns = { 'soap' => 'http://schemas.xmlsoap.org/soap/envelope/', 'ns1' => "http://policy.api.control.r1soft.com/" }
ret = doc.xpath('/soap:Envelope/soap:Body/ns1:getPoliciesResponse/return/name[text()="pstueck-passenger ondemand"]/ancestor::return', ns)

puts ret.count
puts ret.at('replicationScheduleFrequencyType').text

EDIT

Updated to reflect updated XML body in question. Now handles namespaces.

Upvotes: 1
