Reputation: 305
I am looking to try and change a XML output so that the element structure changes and some CDATA becomes an attribute
rather than an <element>
Given the XML stack.xml
:
<root>
<item>
<name>name</name>
<type>Type</type>
<dateMade>Datemade</dateMade>
<desc>Desc</desc>
</item>
....(more Items)...
</root>
I would like to change the XML output to stacksaved.xml
:
<root>
<item>
<name>name</name>
<Itemtype type="Type">
<Itemdate dateMade="Datemade">
<desc>Desc</desc>
</Itemdate>
<Itemtype>
</item>
....(next item)....
</root>
So far my PHP DOM looks like this:
<?php
//create and load
$doc = new DOMDocument();
$doc->load('stack.xml');
$types=$doc->getElementsByTagName("type");
foreach ($types as $type)
{
$attribute=$doc->getElementsByTagName("type");
$doc->getElementsByTagName("type").setAttribute("$attribute");
}
$doc->save('stacksaved.xml'); //save the final results into xml file
?>
I keep getting the error: Fatal error: Call to undefined function setAttribute() and the document is not saved or edited in anyway. I am really new to DOM/PHP and would greatly appreciate any advice!
How would I go about changing the child structure and the element to the desired output?
Thanks as always for the read!
EDIT: Parfait gave a great explanation and showed the great power of XSLT but I am trying to get this to run using pure php only as a learning exercise for php/DOM. Can anyone help with converting this using PHP only?
Upvotes: 0
Views: 168
Reputation: 107622
For a pure PHP DOM solution, consider creating a new DOMDocument
iterating over values of old document using createElement
, appendChild
, and setAttribute
methods. The multiple nested if
logic is needed to check existence of a node before creating elements with items' node values, otherwise Undefined Warnings are raised.
$doc = new DOMDocument();
$doc->load('stack.xml');
// INITIALIZE NEW DOM DOCUMENT
$newdoc = new DOMDocument('1.0', 'UTF-8');
$newdoc->preserveWhiteSpace = false;
$newdoc->formatOutput = true;
// APPEND ROOT
$root= $newdoc->appendChild($newdoc->createElement("root"));
$items=$doc->getElementsByTagName("item");
// ITERATIVELY APPEND ITEM AND CHILDREN
foreach($items as $item){
$ItemNode = $newdoc->createElement("item");
$root->appendChild($ItemNode);
if (count($item->getElementsByTagName("name")->item(0)) > 0) {
$ItemNode->appendChild($newdoc->createElement('name', $item->getElementsByTagName("name")->item(0)->nodeValue));
}
if (count($item->getElementsByTagName("type")->item(0)) > 0) {
$ItemtypeNode = $ItemNode->appendChild($newdoc->createElement('Itemtype'));
$ItemtypeNode->setAttribute("type", $item->getElementsByTagName("type")->item(0)->nodeValue);
if (count($item->getElementsByTagName("dateMade")->item(0)) > 0) {
$ItemdateNode = $ItemtypeNode->appendChild($newdoc->createElement('Itemdate'));
$ItemdateNode->setAttribute("dateMade", $item->getElementsByTagName("dateMade")->item(0)->nodeValue);
if (count($item->getElementsByTagName("desc")->item(0)) > 0) {
$ItemdateNode->appendChild($newdoc->createElement('desc', $item->getElementsByTagName("desc")->item(0)->nodeValue));
}
}
}
}
// ECHO AND SAVE NEW DOC TREE
echo $newdoc->saveXML();
$newdoc->save($cd.'/ItemTypeDateMade_dom.xml');
Output
<?xml version="1.0" encoding="UTF-8"?>
<root>
<item>
<name>name</name>
<Itemtype type="Type">
<Itemdate dateMade="Datemade">
<desc>Desc</desc>
</Itemdate>
</Itemtype>
</item>
</root>
As mentioned in previous answer, here requires for
and nested if
that would not be required with XSLT. In fact, using microtime
, we can compare script runtimes. Below enlargen stack.xml
:
$time_start = microtime(true);
...
echo "Total execution time in seconds: " . (microtime(true) - $time_start) ."\n";
At 1,000 node lines, XSLT proves faster than DOM:
# XSLT VERSION
Total execution time in seconds: 0.0062189102172852
# DOM VERSION
Total execution time in seconds: 0.013695955276489
At 2,000 node lines, XSLT still remains about 2X faster than DOM:
# XSLT VERSION
Total execution time in seconds: 0.014697074890137
# DOM VERSION
Total execution time in seconds: 0.031282186508179
At 10,000 node lines, XSLT now becomes slightly faster than DOM. Reason for DOM's catch up might be due to the memory inefficiency XSLT 1.0 maintains for larger files, especially (> 100 MB). But arguably here for this use case, the XSLT approach is an easier PHP script to maintain and read:
# XSLT VERSION
Total execution time in seconds: 0.27568817138672
# DOM VERSION
Total execution time in seconds: 0.37149095535278
Upvotes: 1
Reputation: 107622
Consider XSLT, the special-purpose, declarative language designed to transform XML documents. PHP can run XSLT 1.0 scripts with the php-xsl
extension (be sure to enable it in .ini file). With this approach, you avoid any need of foreach
looping or if
logic.
XSLT (save as .xsl file)
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:template match="root">
<xsl:copy>
<xsl:apply-templates select="item"/>
</xsl:copy>
</xsl:template>
<xsl:template match="item">
<xsl:copy>
<xsl:copy-of select="name"/>
<Itemtype type="{type}">
<Itemdate dateMade="{dateMade}">
<xsl:copy-of select="desc"/>
</Itemdate>
</Itemtype>
</xsl:copy>
</xsl:template>
</xsl:transform>
PHP
$doc = new DOMDocument();
$doc->load('stack.xml');
$xsl = new DOMDocument;
$xsl->load('XSLTScript.xsl');
// CONFIGURE TRANSFORMER
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
// PROCESS TRANSFORMATION
$newXML = $proc->transformToXML($doc);
// ECHO STRING OUTPUT
echo $newXML;
// SAVE OUTPUT TO FILE
file_put_contents('Output.xml', $newXML);
Output
<?xml version="1.0" encoding="UTF-8"?>
<root>
<item>
<name>name</name>
<Itemtype type="Type">
<Itemdate dateMade="Datemade">
<desc>Desc</desc>
</Itemdate>
</Itemtype>
</item>
</root>
Upvotes: 1