Dave Sag

Reputation: 13486

Why isn't Nokogiri's xpath working as expected?

I am parsing a SOAP response with Nokogiri, but for some reason the xpath and css methods cannot find any tags below the <soap:Body> tag.

The XML I am trying to parse is

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <soap:Body>
        <AuthenticationResponse xmlns="http://tempuri.org/">
            <AuthenticationResult>
                <SessionID>clinTQYART6qxeQ%k^Am&amp;Sd5Co3</SessionID>
                <RequestStatus>1</RequestStatus>
                <RequestMessage>Success</RequestMessage>
            </AuthenticationResult>
        </AuthenticationResponse>
    </soap:Body>
</soap:Envelope>

If I inspect the parsed XML with a debugger I see

=> #(Document:0x3fce3c4dd95c {
  name = "document",
  children = [
    #(Element:0x3fce385b04dc {
      name = "Envelope",
      namespace = #(Namespace:0x3fce385b04b4 { prefix = "soap", href = "http://schemas.xmlsoap.org/soap/envelope/" }),
      children = [
        #(Element:0x3fce385e509c {
          name = "Body",
          namespace = #(Namespace:0x3fce385b04b4 { prefix = "soap", href = "http://schemas.xmlsoap.org/soap/envelope/" }),
          children = [
            #(Element:0x3fce385e4c64 {
              name = "AuthenticationResponse",
              namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }),
              children = [
                #(Element:0x3fce385e48a4 {
                  name = "AuthenticationResult",
                  namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }),
                  children = [
                    #(Element:0x3fce385e44f8 { name = "SessionID", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "clinTQYART6qxeQ%k^Am&Sd5Co3")] }),
                    #(Element:0x3fce39dcff7c { name = "RequestStatus", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "1")] }),
                    #(Element:0x3fce39dcfa2c { name = "RequestMessage", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "Success")] })]
                  })]
              })]
          })]
      })]
  })

which is fine.

But xml.xpath("//SessionID") gives []

However xml.xpath("//soap:Body")[0] gives

=> #(Element:0x3fce385e509c {
  name = "Body",
  namespace = #(Namespace:0x3fce385b04b4 { prefix = "soap", href = "http://schemas.xmlsoap.org/soap/envelope/" }),
  children = [
    #(Element:0x3fce385e4c64 {
      name = "AuthenticationResponse",
      namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }),
      children = [
        #(Element:0x3fce385e48a4 {
          name = "AuthenticationResult",
          namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }),
          children = [
            #(Element:0x3fce385e44f8 { name = "SessionID", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "clinTQYART6qxeQ%k^Am&Sd5Co3")] }),
            #(Element:0x3fce39dcff7c { name = "RequestStatus", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "1")] }),
            #(Element:0x3fce39dcfa2c { name = "RequestMessage", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "Success")] })]
          })]
      })]
  })

and xml.xpath("//soap:Body")[0].children[0].children[0].children[0] gives

=> #(Element:0x3fce385e44f8 { name = "SessionID", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "clinTQYART6qxeQ%k^Am&Sd5Co3")] })

and consequently xml.xpath("//soap:Body")[0].children[0].children[0].children[0].content gives me the correct id string.

So why doesn't xml.xpath("//SessionID") work?

Upvotes: 4

Views: 278

Answers (2)

suranyami

Reputation: 908

Not a direct answer to your question, but if you want to parse SOAP, you'd be better off using the Savon gem rather than Nokogiri. It's specifically designed to handle the intricacies of SOAP.

Upvotes: 0

Daniel Haley

Reputation: 52858

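It's because SessionID is in the namespace http://tempuri.org/. The default namespace declared on AuthenticationResponse (xmlns="http://tempuri.org/") is inherited by all of its descendants, and in XPath an unprefixed name such as SessionID only matches elements that are in no namespace, so //SessionID finds nothing.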

Try something like (untested):

xml.xpath("//x:SessionID", {"x" => "http://tempuri.org/"})

Upvotes: 3
