Jared Eitnier

Reputation: 7152

PHP Parse XML response with many namespaces

Is there a way to parse an XML response in PHP, taking all namespaced nodes into account, and convert it to an object or array without knowing all the node names in advance?

For example, converting this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"
    xmlns:com="http://www.webex.com/schemas/2002/06/common"
    xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee">
    <serv:header>
        <serv:response>
            <serv:result>SUCCESS</serv:result>
            <serv:gsbStatus>PRIMARY</serv:gsbStatus>
        </serv:response>
    </serv:header>
    <serv:body>
        <serv:bodyContent xsi:type="att:lstMeetingAttendeeResponse"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <att:attendee>
                <att:person>
                    <com:name>James Kirk</com:name>
                    <com:firstName>James</com:firstName>
                    <com:lastName>Kirk</com:lastName>
                    <com:address>
                        <com:addressType>PERSONAL</com:addressType>
                    </com:address>
                    <com:phones />
                    <com:email>[email protected]</com:email>
                    <com:type>VISITOR</com:type>
                </att:person>
                <att:contactID>28410622</att:contactID>
                <att:joinStatus>INVITE</att:joinStatus>
                <att:meetingKey>803754412</att:meetingKey>
            </att:attendee>
        </serv:bodyContent>
    </serv:body>
</serv:message>

to something like:

['message' => [
    'header' => [
        'response' => [
            'result' => 'SUCCESS',
            'gsbStatus' => 'PRIMARY'
        ]
    ],
    'body' => [
        'bodyContent' => [
            'attendee' => [
                'person' => [
                    'name' => 'James Kirk',
                    'firstName' => 'James',
                    ...
                ],
                'contactID' => 28410622,
                ...
            ]
        ]
    ]
]

I know it's easy with non-namespaced nodes, but I don't know where to begin on something like this.

Upvotes: 1

Views: 987

Answers (2)

hakre

Reputation: 197757

(Read @ThW's answer about why an array is actually not that important to aim for)

I know it's easy with non-namespaced nodes, but I don't know where to begin on something like this.

It's as easy as with non-namespaced nodes, because technically they are the same. Here is a quick example: the following script (where $xml is a SimpleXMLElement holding the response) loops over all elements in the document regardless of namespace:

$result = $xml->xpath('//*');
foreach ($result as $element) {
    $depth = count($element->xpath('./ancestor::*'));
    $indent = str_repeat('  ', $depth);
    printf("%s %s\n", $indent, $element->getName());
}

The output in your case is:

 message
   header
     response
       result
       gsbStatus
   body
     bodyContent
       attendee
         person
           name
           firstName
           lastName
           address
             addressType
           phones
           email
           type
         contactID
         joinStatus
         meetingKey

As you can see, you can iterate over all elements as if they had no namespace at all.
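If a nested array like the one in the question is still wanted, the same idea can be turned into a small recursive function. This is only a minimal sketch, not a general converter: `elementToArray` is a hypothetical helper name, the sample is a trimmed copy of the response, attributes are ignored, and repeated siblings with the same local name overwrite each other.

```php
<?php
// Trimmed copy of the response from the question.
$response = <<<'XML'
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"
    xmlns:com="http://www.webex.com/schemas/2002/06/common"
    xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee">
    <serv:body>
        <serv:bodyContent>
            <att:attendee>
                <att:person>
                    <com:name>James Kirk</com:name>
                    <com:phones />
                </att:person>
                <att:contactID>28410622</att:contactID>
            </att:attendee>
        </serv:bodyContent>
    </serv:body>
</serv:message>
XML;

// Hypothetical helper: builds a nested array keyed by local element names.
function elementToArray(SimpleXMLElement $element)
{
    $children = [];
    // './*' selects all child elements regardless of their namespace.
    foreach ($element->xpath('./*') ?: [] as $child) {
        // Assumption: names are unique among siblings; duplicates are overwritten.
        $children[$child->getName()] = elementToArray($child);
    }
    // Elements without child elements become their trimmed text (always a string).
    return $children ?: trim((string) $element);
}

$xml   = simplexml_load_string($response);
$array = [$xml->getName() => elementToArray($xml)];

print_r($array);
```

All values come back as strings, so `contactID` is `'28410622'` rather than an integer, and an empty element such as `phones` becomes an empty string.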

But as has been outlined, when you ignore the namespaces you also lose important information. For example, in the document you have, you are actually interested in the attendee and common elements; the service elements deal with the transport:

$uriAtt = 'http://www.webex.com/schemas/2002/06/service/attendee';
$xml->registerXPathNamespace('att', $uriAtt);

$uriCom = 'http://www.webex.com/schemas/2002/06/common';
$xml->registerXPathNamespace('com', $uriCom);

$result = $xml->xpath('//att:*|//com:*');
foreach ($result as $element) {
    $depth  = count($element->xpath("./ancestor::*[namespace-uri(.) = '$uriAtt' or namespace-uri(.) = '$uriCom']"));
    $indent = str_repeat('  ', $depth);
    printf("%s %s\n", $indent, $element->getName());
}

The output this time:

 attendee
   person
     name
     firstName
     lastName
     address
       addressType
     phones
     email
     type
   contactID
   joinStatus
   meetingKey

So why drop all the namespaces? They help you to obtain the elements you're interested in. You can also do it dynamically.
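One way "dynamically" could look is to read the namespaces from the document itself instead of hard-coding them. The following is a sketch under assumptions: the sample is a trimmed copy of the response, and the `$skip` list (treating the service namespace as transport only) is an illustrative choice, not something the document dictates.

```php
<?php
// Trimmed copy of the response from the question.
$response = <<<'XML'
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"
    xmlns:com="http://www.webex.com/schemas/2002/06/common"
    xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee">
    <serv:body>
        <serv:bodyContent>
            <att:attendee>
                <att:person>
                    <com:name>James Kirk</com:name>
                </att:person>
                <att:contactID>28410622</att:contactID>
            </att:attendee>
        </serv:bodyContent>
    </serv:body>
</serv:message>
XML;

$xml = simplexml_load_string($response);

// Assumption: namespaces listed here only deal with the transport.
$skip = ['http://www.webex.com/schemas/2002/06/service'];

// Register every other namespace declared anywhere in the document
// and build one XPath union over all of their elements.
$parts = [];
foreach ($xml->getDocNamespaces(true) as $prefix => $uri) {
    if ($prefix === '' || in_array($uri, $skip, true)) {
        continue; // default namespace has no usable prefix; skipped URIs are ignored
    }
    $xml->registerXPathNamespace($prefix, $uri);
    $parts[] = "//$prefix:*";
}

$names = [];
foreach ($xml->xpath(implode('|', $parts)) as $element) {
    $names[] = $element->getName();
}

print_r($names);
```

The prefixes are taken as they appear in the document here; if a response used a default namespace for its payload, a prefix would have to be invented for it before it could be used in XPath.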

Upvotes: 3

ThW

Reputation: 19492

Don't use a generic conversion to an array. Just load the document and read it. That is not difficult if you use DOM+XPath.

A generic conversion means that you lose information (the namespaces) and functionality (XPath).

First create a DOM and load the XML:

// $xml holds the XML response as a string
$dom = new DOMDocument();
$dom->loadXML($xml);

Now create a DOMXPath instance for the DOM and register prefixes for the namespaces. These can be the prefixes from the XML document or different ones.

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('serv', 'http://www.webex.com/schemas/2002/06/service');
$xpath->registerNamespace('com', 'http://www.webex.com/schemas/2002/06/common');
$xpath->registerNamespace('att', 'http://www.webex.com/schemas/2002/06/service/attendee');

Use the registered prefixes in XPath expressions to fetch values and nodes:

var_dump(
  $xpath->evaluate('string(/serv:message/serv:header/serv:response/serv:result)')
);

Output:

string(7) "SUCCESS"
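To illustrate that the prefix used in XPath does not have to match the one in the document, here is a small self-contained sketch. The prefix `s` and the trimmed sample are choices made for this example only; what matters is the namespace URI.

```php
<?php
// Trimmed copy of the response from the question.
$response = <<<'XML'
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service">
    <serv:header>
        <serv:response>
            <serv:result>SUCCESS</serv:result>
        </serv:response>
    </serv:header>
</serv:message>
XML;

$dom = new DOMDocument();
$dom->loadXML($response);

$xpath = new DOMXPath($dom);
// "s" is an arbitrary prefix; it is bound to the same URI as "serv" in the document.
$xpath->registerNamespace('s', 'http://www.webex.com/schemas/2002/06/service');

$result = $xpath->evaluate('string(/s:message/s:header/s:response/s:result)');
var_dump($result); // string(7) "SUCCESS"
```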

Fetch all attendee elements and output the names:

foreach ($xpath->evaluate('/serv:message/serv:body/serv:bodyContent/att:attendee') as $attendee) {
  var_dump(
   $xpath->evaluate('string(att:person/com:name)', $attendee)
  );
}

Output:

string(10) "James Kirk"
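If an array is needed in the end, it can be built explicitly from exactly the values you care about. The following sketch assumes a trimmed copy of the response; the array keys and the choice of fields are illustrative, not prescribed by the API.

```php
<?php
// Trimmed copy of the response from the question.
$response = <<<'XML'
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"
    xmlns:com="http://www.webex.com/schemas/2002/06/common"
    xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee">
    <serv:body>
        <serv:bodyContent>
            <att:attendee>
                <att:person>
                    <com:name>James Kirk</com:name>
                </att:person>
                <att:contactID>28410622</att:contactID>
                <att:joinStatus>INVITE</att:joinStatus>
            </att:attendee>
        </serv:bodyContent>
    </serv:body>
</serv:message>
XML;

$dom = new DOMDocument();
$dom->loadXML($response);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('serv', 'http://www.webex.com/schemas/2002/06/service');
$xpath->registerNamespace('com', 'http://www.webex.com/schemas/2002/06/common');
$xpath->registerNamespace('att', 'http://www.webex.com/schemas/2002/06/service/attendee');

$attendees = [];
foreach ($xpath->evaluate('/serv:message/serv:body/serv:bodyContent/att:attendee') as $attendee) {
    // Each expression is evaluated relative to the current attendee node.
    $attendees[] = [
        'name'       => $xpath->evaluate('string(att:person/com:name)', $attendee),
        // number() yields a float in PHP, so cast it to an integer.
        'contactID'  => (int) $xpath->evaluate('number(att:contactID)', $attendee),
        'joinStatus' => $xpath->evaluate('string(att:joinStatus)', $attendee),
    ];
}

var_dump($attendees);
```

Because the XPath expressions do the type casting, each value arrives as the string or number you asked for, and repeated attendee elements simply become more entries in the list.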

Upvotes: 2
