hirani89

Reputation: 316

Convert XML with namespaces to JSON in PHP/Laravel

I have spent two days trying to convert XML to JSON with no luck.

I have tried the usual `simplexml_load_string`, `XMLReader` and `SimpleXMLElement`.

I am able to read the data using `XMLReader` when `SearchResults` has only one item, but when multiple items are returned, I am not sure what to do.

```

<?xml version="1.0" encoding="utf-8"?>
<DataSet xmlns="http://exampleurl.com/">
  <xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
    <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:UseCurrentLocale="true">
      <xs:complexType>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="SearchResults">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="VehicleID" type="xs:long" minOccurs="0" />
                <xs:element name="make" type="xs:string" minOccurs="0" />
                <xs:element name="model" type="xs:string" minOccurs="0" />
                <xs:element name="series" type="xs:string" minOccurs="0" />
                <xs:element name="engine" type="xs:string" minOccurs="0" />
                <xs:element name="yearrange" type="xs:string" minOccurs="0" />
                <xs:element name="details" type="xs:string" minOccurs="0" />
                <xs:element name="chassis" type="xs:string" minOccurs="0" />
                <xs:element name="Countryoforigin" type="xs:string" minOccurs="0" />
                <xs:element name="VIN" type="xs:string" minOccurs="0" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
          <xs:element name="ChargeDetails">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="ChargeApplies" type="xs:long" minOccurs="0" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
          <xs:element name="vehicleRawDetails">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="licenceplate" type="xs:string" minOccurs="0" />
                <xs:element name="VIN" type="xs:string" minOccurs="0" />
                <xs:element name="make" type="xs:string" minOccurs="0" />
                <xs:element name="model" type="xs:string" minOccurs="0" />
                <xs:element name="submodel" type="xs:string" minOccurs="0" />
                <xs:element name="year" type="xs:string" minOccurs="0" />
                <xs:element name="bodystyle" type="xs:string" minOccurs="0" />
                <xs:element name="vehicletype" type="xs:string" minOccurs="0" />
                <xs:element name="chassisnumber" type="xs:string" minOccurs="0" />
                <xs:element name="enginenumber" type="xs:string" minOccurs="0" />
                <xs:element name="cc" type="xs:string" minOccurs="0" />
                <xs:element name="countryoforigin" type="xs:string" minOccurs="0" />
                <xs:element name="fueltype" type="xs:string" minOccurs="0" />
                <xs:element name="transmission" type="xs:string" minOccurs="0" />
                <xs:element name="speeds" type="xs:string" minOccurs="0" />
                <xs:element name="modelcode" type="xs:string" minOccurs="0" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:choice>
      </xs:complexType>
    </xs:element>
  </xs:schema>
  <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <NewDataSet xmlns="">
      <SearchResults diffgr:id="SearchResults1" msdata:rowOrder="0">
        <VehicleID>13971</VehicleID>
        <make>SUBARU</make>
        <model>LEGACY</model>
        <series>BE</series>
        <engine>1994cc, EJ204 F4 16v DOHC MPFI {114KW}</engine>
        <yearrange>12/98~04/03</yearrange>
        <details>  B4 RS, 4D Sedan, AWD, AT/MT</details>
        <chassis>BE5</chassis>
        <Countryoforigin>JAPAN</Countryoforigin>
        <VIN>BE5</VIN>
      </SearchResults>
      <SearchResults diffgr:id="SearchResults2" msdata:rowOrder="1">
        <VehicleID>14379</VehicleID>
        <make>SUBARU</make>
        <model>LEGACY</model>
        <series>BE</series>
        <engine>1994cc, EJ208 F4 16v DOHC Twin Turbo MPFI {206KW}</engine>
        <yearrange>12/98~04/03</yearrange>
        <details>  B4 RSK, 4D Sedan, AWD, MT</details>
        <chassis>BE5</chassis>
        <Countryoforigin>JAPAN</Countryoforigin>
        <VIN>BE5</VIN>
      </SearchResults>
      <ChargeDetails diffgr:id="ChargeDetails1" msdata:rowOrder="0">
        <ChargeApplies>1</ChargeApplies>
      </ChargeDetails>
    </NewDataSet>
  </diffgr:diffgram>
</DataSet>

```

I can verify that the XML is valid because an online converter produced usable results, whereas my own attempts return empty arrays or empty objects.

Can someone please help out?

Upvotes: 0

Views: 801

Answers (1)

Pranav Mandlik

Reputation: 644

The XML-to-JSON function I found does work well, but it didn't fit my needs: it incorporates the namespace into the key name, and I need the namespace removed from the keys.

The problem is that when XML elements are namespaced, the namespaced child nodes are dropped during conversion to an array or JSON. I realized that if I remove the namespaces from the XML string before converting it to an XML element, `json_encode` works as expected and no data is lost.
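To make the behaviour concrete, here is a minimal sketch of the data loss. The element names and the `http://example.test/ns` URI are made up for illustration and are not taken from the XML in the question.

```php
<?php
// Hypothetical sample: one plain child and one namespaced child.
$xml = <<<XML
<root xmlns:ns="http://example.test/ns">
  <plain>kept</plain>
  <ns:item>dropped</ns:item>
</root>
XML;

$root = simplexml_load_string($xml);

// The usual round trip through JSON only sees the non-prefixed child.
$asArray = json_decode(json_encode($root), true);
var_dump(isset($asArray['plain'])); // the plain child survives
var_dump(isset($asArray['item']));  // the namespaced child is missing

// The data is still in the document; it just has to be requested by namespace.
$item = (string) $root->children('ns', true)->item;
var_dump($item);
```

So the parser has not lost anything. The namespaced nodes are simply skipped by the default property view that `json_encode` uses, which is why stripping the prefixes first makes them show up.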

So, for anyone else having this issue, here is how I solved the problem for my needs.

I know that the XML being sent to me has no naming collisions. The only reason namespaces are used is to identify the source of that portion of the XML data. I also know all the namespaces being used. With that information, here is what I did:

```php
function removeNamespaceFromXML($xml)
{
    // All namespace prefixes that can appear in this XML are known in advance,
    // so they are hard-coded here and stripped one by one.
    $toRemove = ['rap', 'turss', 'crim', 'cred', 'j', 'rap-code', 'evic'];

    // Cycle through each namespace prefix and remove it from the XML string
    foreach ($toRemove as $remove) {
        // First remove the prefix from the opening of the tag
        $xml = str_replace('<' . $remove . ':', '<', $xml);
        // Now remove the prefix from the closing of the tag
        $xml = str_replace('</' . $remove . ':', '</', $xml);
        // This XML uses the prefix with commentText, so remove that too
        $xml = str_replace($remove . ':commentText', 'commentText', $xml);
        // Match this prefix's declaration, e.g. xmlns:rap="..." (either quote style).
        // The original pattern required extra characters between the prefix and "=",
        // so it never matched a plain xmlns:prefix="..." declaration.
        $pattern = '/\s+xmlns:' . preg_quote($remove, '/') . '\s*=\s*("[^"]*"|\'[^\']*\')/';
        // Remove the first declaration of this prefix
        $xml = preg_replace($pattern, '', $xml, 1);
    }

    // Return the XML string with the known prefixes stripped
    return $xml;
}

function namespacedXMLToArray($xml)
{
    // Clean the XML string, then round-trip through JSON to get a plain array
    return json_decode(json_encode(simplexml_load_string(removeNamespaceFromXML($xml))), true);
}
```

By calling `namespacedXMLToArray()` I get an array that is ready to use in my case.

Hopefully this approach helps others. I am sure if you don't know what possible namespaces exist you can use a RegEx to find the various defined namespaces and then remove them once you know their names.
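If the prefixes are not known up front, one option is to read them from the document itself. This is only a sketch under some assumptions: it uses SimpleXML's `getDocNamespaces(true)` for discovery instead of a regex, `stripAllNamespacePrefixes()` is a hypothetical helper name, and the sample is a cut-down diffgram shaped like the XML in the question. It also assumes, as above, that removing prefixes causes no name collisions.

```php
<?php
// Hypothetical helper: strip every declared prefix from tags, attributes and declarations.
function stripAllNamespacePrefixes(string $xml): string
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        throw new InvalidArgumentException('Could not parse XML');
    }

    // prefix => URI for every namespace declared anywhere in the document;
    // the default namespace has an empty prefix and is skipped.
    $prefixes = array_filter(
        array_keys($doc->getDocNamespaces(true)),
        fn ($prefix) => $prefix !== ''
    );

    foreach ($prefixes as $prefix) {
        $p = preg_quote($prefix, '/');
        // <p:tag and </p:tag
        $xml = preg_replace("/<(\/?){$p}:/", '<$1', $xml);
        // p:attr="..." on any element
        $xml = preg_replace("/\s{$p}:([\w.-]+\s*=)/", ' $1', $xml);
        // xmlns:p="..." declarations
        $xml = preg_replace("/\sxmlns:{$p}\s*=\s*(\"[^\"]*\"|'[^']*')/", '', $xml);
    }

    return $xml;
}

// Cut-down sample for illustration only.
$sample = <<<XML
<DataSet xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <diffgr:diffgram>
    <NewDataSet>
      <SearchResults diffgr:id="SearchResults1" msdata:rowOrder="0"><make>SUBARU</make></SearchResults>
      <SearchResults diffgr:id="SearchResults2" msdata:rowOrder="1"><make>SUBARU</make></SearchResults>
    </NewDataSet>
  </diffgr:diffgram>
</DataSet>
XML;

$data = json_decode(json_encode(simplexml_load_string(stripAllNamespacePrefixes($sample))), true);
$rows = $data['diffgram']['NewDataSet']['SearchResults'];
echo count($rows), PHP_EOL;
```

Because this works on the raw string, text content that happens to look like a prefixed tag or attribute would be rewritten too, so treat it as a starting point. Also note that a repeated element such as `SearchResults` comes back as a list only when it occurs more than once; with a single occurrence it is one associative array, so the caller may need to handle both shapes.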

Upvotes: 3
