Reputation: 4541
I have a method that converts an XML string into a PHP array, using separate keys for each element's value and attributes. However, when an element has multiple children of the same kind, the resulting array isn't structured the way I need, and I'm not sure how to change the method to fix that.
This is what the method looks like:
/**
 * Converts an XML string to an array
 *
 * @param $xmlString
 * @return array
 */
private function parseXml($xmlString)
{
    $doc = new DOMDocument;
    $doc->loadXML($xmlString);
    $root = $doc->documentElement;
    $output[$root->tagName] = $this->domNodeToArray($root, $doc);
    return $output;
}
/**
 * @param $node
 * @param $xmlDocument
 * @return array|string
 */
private function domNodeToArray($node, $xmlDocument)
{
    $output = [];
    switch ($node->nodeType)
    {
        case XML_CDATA_SECTION_NODE:
        case XML_TEXT_NODE:
            $output = trim($node->textContent);
            break;
        case XML_ELEMENT_NODE:
            for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++)
            {
                $child = $node->childNodes->item($i);
                $v = $this->domNodeToArray($child, $xmlDocument);
                if (isset($child->tagName))
                {
                    $t = $child->tagName;
                    if (!isset($output['value'][$t]))
                    {
                        $output['value'][$t] = [];
                    }
                    $output['value'][$t][] = $v;
                }
                else if ($v || $v === '0')
                {
                    $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
                }
            }
            if (isset($output['value']) && $node->attributes->length && !is_array($output['value']))
            {
                $output = ['value' => $output['value']];
            }
            if (!$node->attributes->length && isset($output['value']) && !is_array($output['value']))
            {
                $output = ['attributes' => [], 'value' => $output['value']];
            }
            if ($node->attributes->length)
            {
                $a = [];
                foreach ($node->attributes as $attrName => $attrNode)
                {
                    $a[$attrName] = (string)$attrNode->value;
                }
                $output['attributes'] = $a;
            }
            else
            {
                $output['attributes'] = [];
            }
            if (isset($output['value']) && is_array($output['value']))
            {
                foreach ($output['value'] as $t => $v)
                {
                    if (is_array($v) && count($v) == 1 && $t != 'attributes')
                    {
                        $output['value'][$t] = $v[0];
                    }
                }
            }
            break;
    }
    return $output;
}
Here is some example XML:
<?xml version="1.0" encoding="UTF-8"?>
<characters>
<character>
<name2>Sno</name2>
<friend-of>Pep</friend-of>
<since>1950-10-04</since>
<qualification>extroverted beagle</qualification>
</character>
<character>
<name2>Pep</name2>
<friend-of>Sno</friend-of>
<since>1966-08-22</since>
<qualification>bold, brash and tomboyish</qualification>
</character>
</characters>
Running the method with that XML as its parameter results in this array:
array:1 [▼
  "characters" => array:2 [▼
    "value" => array:1 [▼
      "character" => array:2 [▼
        0 => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1950-10-04"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "extroverted beagle"
            ]
          ]
          "attributes" => []
        ]
        1 => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1966-08-22"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "bold, brash and tomboyish"
            ]
          ]
          "attributes" => []
        ]
      ]
    ]
    "attributes" => []
  ]
]
What I want it to produce instead is:
array:1 [▼
  "characters" => array:2 [▼
    "value" => array:2 [▼
      0 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1950-10-04"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "extroverted beagle"
            ]
          ]
          "attributes" => []
        ]
      ]
      1 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1966-08-22"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "bold, brash and tomboyish"
            ]
          ]
          "attributes" => []
        ]
      ]
    ]
    "attributes" => []
  ]
]
So basically, I want the value key of the characters key to be an array of two items, each wrapping one of the two character keys. This should only happen when the same element occurs more than once on the same branch. The current structure, where the character key itself is an array with two elements, doesn't work in my situation.
I haven't managed to alter the method above to do this, and I'm not sure what approach to take. Building an array like this from a DOMDocument instance seems quite complicated.
Upvotes: 1
Views: 148
Reputation: 14921
I've made some changes to your function, but I'm not sure if this is what you need.
private function domNodeToArray($node, $xmlDocument)
{
    $output = ['value' => [], 'attributes' => []];
    switch ($node->nodeType) {
        case XML_CDATA_SECTION_NODE:
        case XML_TEXT_NODE:
            $output = trim($node->textContent);
            break;
        case XML_ELEMENT_NODE:
            for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++) {
                $child = $node->childNodes->item($i);
                $v = $this->domNodeToArray($child, $xmlDocument);
                if (isset($child->tagName)) {
                    $t = $child->tagName;
                    if (isset($output['value'][$t])) {
                        $output['value'][] = [$t => $output['value'][$t]];
                        $output['value'][] = [$t => $v];
                        unset($output['value'][$t]);
                    } else {
                        $output['value'][$t] = $v;
                    }
                } elseif (($v && is_string($v)) || $v === '0') {
                    $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
                }
            }
            if ($node->attributes->length) {
                foreach ($node->attributes as $attrName => $attrNode) {
                    $output['attributes'][$attrName] = (string) $attrNode->value;
                }
            }
            break;
    }
    return $output;
}
With the example XML from the question, the result is:
array:1 [▼
  "characters" => array:2 [▼
    "value" => array:2 [▼
      0 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "value" => "Sno"
              "attributes" => []
            ]
            "friend-of" => array:2 [▼
              "value" => "Pep"
              "attributes" => []
            ]
            "since" => array:2 [▼
              "value" => "1950-10-04"
              "attributes" => []
            ]
            "qualification" => array:2 [▼
              "value" => "extroverted beagle"
              "attributes" => []
            ]
          ]
          "attributes" => []
        ]
      ]
      1 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "value" => "Pep"
              "attributes" => []
            ]
            "friend-of" => array:2 [▼
              "value" => "Sno"
              "attributes" => []
            ]
            "since" => array:2 [▼
              "value" => "1966-08-22"
              "attributes" => []
            ]
            "qualification" => array:2 [▼
              "value" => "bold, brash and tomboyish"
              "attributes" => []
            ]
          ]
          "attributes" => []
        ]
      ]
    ]
    "attributes" => []
  ]
]
Upvotes: 1
Reputation: 57121
The problem is deciding when to add a new level and when to just carry on adding the data. I've changed this logic and added comments to the code to help explain what happens and when:
private function domNodeToArray($node, $xmlDocument)
{
    $output = [];
    switch ($node->nodeType)
    {
        case XML_CDATA_SECTION_NODE:
        case XML_TEXT_NODE:
            $output = trim($node->textContent);
            break;
        case XML_ELEMENT_NODE:
            for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++)
            {
                $child = $node->childNodes->item($i);
                $v = $this->domNodeToArray($child, $xmlDocument);
                if (isset($child->tagName))
                {
                    $t = $child->tagName;
                    // if (!isset($output['value'][$t]))
                    // {
                    //     $output['value'][$t] = [];
                    // }
                    // If the element already exists
                    if (isset($output['value'][$t]))
                    {
                        // Copy the existing value to new level
                        $output['value'][] = [$t => $output['value'][$t]];
                        // Add in new value
                        $output['value'][] = [$t => $v];
                        // Remove old element
                        unset($output['value'][$t]);
                    }
                    // If this has already been added at a new level
                    elseif (isset($output['value'][0][$t]))
                    {
                        // Add it to existing extra level
                        $output['value'][] = [$t => $v];
                    }
                    else {
                        $output['value'][$t] = $v;
                    }
                }
                else if ($v || $v === '0')
                {
                    $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
                }
            }
            if (isset($output['value']) && $node->attributes->length && !is_array($output['value']))
            {
                $output = ['value' => $output['value']];
            }
            if (!$node->attributes->length && isset($output['value']) && !is_array($output['value']))
            {
                $output = ['attributes' => [], 'value' => $output['value']];
            }
            if ($node->attributes->length)
            {
                $a = [];
                foreach ($node->attributes as $attrName => $attrNode)
                {
                    $a[$attrName] = (string)$attrNode->value;
                }
                $output['attributes'] = $a;
            }
            else
            {
                $output['attributes'] = [];
            }
            break;
    }
    return $output;
}
I've tried it with...
<?xml version="1.0" encoding="UTF-8"?>
<characters>
<character>
<name2>Sno</name2>
<friend-of>Pep</friend-of>
<since>1950-10-04</since>
<qualification>extroverted beagle</qualification>
</character>
<character>
<name2>Pep</name2>
<friend-of>Sno</friend-of>
<since>1966-08-22</since>
<qualification>bold, brash and tomboyish</qualification>
</character>
<character>
<name2>Pep2</name2>
<friend-of>Sno</friend-of>
<since>1966-08-23</since>
<qualification>boldish, brashish and tomboyish</qualification>
</character>
</characters>
to check that the <character> elements are all added to the right level.
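One caveat, as far as I can tell: the `[0][$t]` check only looks at the first entry that was moved to a new level, so if two different tags repeat under the same parent, a third occurrence of the second tag may fall back to being keyed by name. Below is a minimal sketch of an alternative, assuming it is acceptable to walk the children twice: count the tag names first, then wrap every repeated tag as a list entry. `nodeToArray` is a hypothetical standalone function, not part of the original class, and it simplifies the text handling (text is only kept for elements without child elements, and every element gets both a `value` and an `attributes` key).

```php
<?php
// Hypothetical standalone variant; names and simplifications are assumptions.
function nodeToArray(DOMNode $node)
{
    if ($node->nodeType === XML_TEXT_NODE || $node->nodeType === XML_CDATA_SECTION_NODE) {
        return trim($node->textContent);
    }

    $output = ['value' => [], 'attributes' => []];
    if ($node->nodeType !== XML_ELEMENT_NODE) {
        return $output; // comments etc. contribute nothing
    }

    // First pass: count how often each child tag name occurs.
    $counts = [];
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $counts[$child->tagName] = ($counts[$child->tagName] ?? 0) + 1;
        }
    }

    // Second pass: repeated tags become list entries, unique tags stay keyed by name.
    $text = '';
    foreach ($node->childNodes as $child) {
        $v = nodeToArray($child);
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $t = $child->tagName;
            if ($counts[$t] > 1) {
                $output['value'][] = [$t => $v];
            } else {
                $output['value'][$t] = $v;
            }
        } elseif (is_string($v) && $v !== '') {
            $text .= $v;
        }
    }

    // An element with text and no child elements gets the text as its value.
    if (!$counts && $text !== '') {
        $output['value'] = htmlspecialchars($text, ENT_XML1 | ENT_COMPAT, 'UTF-8');
    }

    foreach ($node->attributes as $name => $attr) {
        $output['attributes'][$name] = (string) $attr->value;
    }
    return $output;
}

// Two different repeated tags plus one unique tag under the same parent.
$doc = new DOMDocument();
$doc->loadXML('<r><a>1</a><a>2</a><b>x</b><b>y</b><b>z</b><c>only</c></r>');
$result = nodeToArray($doc->documentElement);
```

Here every `<a>` and `<b>` ends up as its own numbered entry under `value`, while `<c>` stays keyed by name. Because the decision is made from the counts rather than from what has already been moved, the order of the children doesn't matter.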
Upvotes: 1