Reputation: 41
I have a huge XML file that looks like this (but is much larger):
<?xml version="1.0" encoding="UTF-8"?>
<suite id="1" name="SuiteName">
  <displayNameKey>something</displayNameKey>
  <displayName>something</displayName>
  <application id="2" name="Manager">
    <displayNameKey>appName</displayNameKey>
    <displayName>appName</displayName>
    <category id="12" name="navigation">
      <displayNameKey>managerNavigation</displayNameKey>
      <displayName>managerNavigation</displayName>
      <description>mgr_navigation</description>
      <property id="13" name="httpPort" type="integer_property" width="40">
        <displayNameKey>managerHttpPort</displayNameKey>
        <displayName>managerHttpPort</displayName>
        <value>80</value>
      </property>
      <property id="14" name="httpsPort" type="integer_property" width="40">
        <displayNameKey>managerHttpsPort</displayNameKey>
        <displayName>managerHttpsPort</displayName>
        <value>443</value>
      </property>
      <property id="15" name="welcomePageURI" type="url_property" width="40" hidden="true">
        <displayNameKey>welcomePageURI</displayNameKey>
        <displayName>welcomePageURI</displayName>
        <value>jsp/index.jsp</value>
      </property>
      <property id="16" name="serverURL" type="url_property" width="40">
        <displayNameKey>serverURL</displayNameKey>
        <displayName>serverURL</displayName>
        <value>somevalue</value>
      </property>
    </category>
    <category id="17" name="datafiltering">
      <displayNameKey>managerDataFiltering</displayNameKey>
      <displayName>managerDataFiltering</displayName>
      <description>mgr_data_filtering</description>
      <property id="18" name="defaultTableName" type="string_property" width="40">
        <displayNameKey>defaultTableName</displayNameKey>
        <displayName>defaultTableName</displayName>
      </property>
      <property id="19" name="defaultAudienceName" type="string_property" width="40">
        <displayNameKey>defaultAudienceName</displayNameKey>
        <displayName>defaultAudienceName</displayName>
      </property>
    </category>
  </application>
</suite>
What I need is to generate an XPath expression for each property that uses the name attribute rather than positions or IDs. That is, for the file above, the desired output is similar to:
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="httpPort"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="httpsPort"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="welcomePageURI"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="serverURL"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="datafiltering"]/property[@name="defaultTableName"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="datafiltering"]/property[@name="defaultAudienceName"]
All XPath generators I found only generate XPaths using the id attribute or position, such as /suite[0]/application[0]/category[1]/...
Can you recommend a way to generate XPaths for all properties in my file? One more thing: the structure is variable, that is, there can be 0 to N nested categories, such as:
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="cat1"]/category[@name="cat2"]/category[@name="cat3"]/property[@name="property1"]
/suite[@name="SuiteName"]/application[@name="Manager"]/property[@name="property2"]
Upvotes: 4
Views: 1161
Reputation: 243469
This is probably the shortest and simplest XSLT transformation that implements the wanted processing (no named templates, no explicit conditional instructions at all, no xsl:for-each, and no use of //):
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- For each property: one step per ancestor (in document order), then the property step -->
  <xsl:template match="property">
    <xsl:apply-templates select="ancestor::*" mode="build"/>
    <xsl:value-of select=
      "concat('/property[@name=&quot;', @name, '&quot;]')"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

  <xsl:template match="*" mode="build">
    <xsl:value-of select=
      "concat('/', name(), '[@name=&quot;', @name, '&quot;]')"/>
  </xsl:template>

  <!-- Suppress the built-in output of text nodes -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
When this transformation is applied to the XML document provided in the question:
the wanted, correct result is produced:
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="httpPort"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="httpsPort"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="welcomePageURI"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="serverURL"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="datafiltering"]/property[@name="defaultTableName"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="datafiltering"]/property[@name="defaultAudienceName"]
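For readers without an XSLT processor at hand, the same ancestor-chain idea can be sketched in Python with the standard library. This is only an illustration, not part of the answer above: the function name `stream_property_xpaths` and its parameters are hypothetical, and it assumes attribute values contain no double quotes. Because the question mentions a huge file, the sketch reads the input incrementally with `iterparse` and keeps a stack of the currently open elements, which plays the role of `ancestor::*`.

```python
import io
import xml.etree.ElementTree as ET


def stream_property_xpaths(source, target="property", att="name"):
    """Yield one attribute-based path per `target` element.

    `source` is a file name or binary file object. Hypothetical helper:
    assumes attribute values contain no double quotes.
    """
    open_steps = []  # path steps of the currently open elements, outermost first
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            value = elem.get(att)  # attributes are already available at "start"
            if value is None:
                open_steps.append(f"/{elem.tag}")
            else:
                open_steps.append(f'/{elem.tag}[@{att}="{value}"]')
            if elem.tag == target:
                yield "".join(open_steps)
        else:
            open_steps.pop()
            elem.clear()  # drop the finished element's content to limit memory use


# Small made-up sample mirroring the nested-category case from the question.
SAMPLE = (
    b'<suite name="SuiteName"><application name="Manager">'
    b'<category name="cat1"><category name="cat2">'
    b'<property name="property1"><value>1</value></property>'
    b'</category></category>'
    b'<property name="property2"/>'
    b'</application></suite>'
)

for path in stream_property_xpaths(io.BytesIO(SAMPLE)):
    print(path)
```

Passing a file name instead of the `io.BytesIO` object should work the same way, since `iterparse` accepts either.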
Upvotes: 1
Reputation: 2610
You could do it in PHP like this:
<?php
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<suite id="1" name="SuiteName">
<displayNameKey>something</displayNameKey>
<displayName>something</displayName>
<application id="2" name="Manager">
<displayNameKey>appName</displayNameKey>
<displayName>appName</displayName>
<category id="12" name="navigation">
<displayNameKey>managerNavigation</displayNameKey>
<displayName>managerNavigation</displayName>
<description>mgr_navigation</description>
<property id="13" name="httpPort" type="integer_property" width="40">
<displayNameKey>managerHttpPort</displayNameKey>
<displayName>managerHttpPort</displayName>
<value>80</value>
</property>
<property id="14" name="httpsPort" type="integer_property" width="40">
<displayNameKey>managerHttpsPort</displayNameKey>
<displayName>managerHttpsPort</displayName>
<value>443</value>
</property>
<property id="15" name="welcomePageURI" type="url_property" width="40" hidden="true">
<displayNameKey>welcomePageURI</displayNameKey>
<displayName>welcomePageURI</displayName>
<value>jsp/index.jsp</value>
</property>
<property id="16" name="serverURL" type="url_property" width="40">
<displayNameKey>serverURL</displayNameKey>
<displayName>serverURL</displayName>
<value>somevalue</value>
</property>
</category>
<category id="17" name="datafiltering">
<displayNameKey>managerDataFiltering</displayNameKey>
<displayName>managerDataFiltering</displayName>
<description>mgr_data_filtering</description>
<property id="18" name="defaultTableName" type="string_property" width="40">
<displayNameKey>defaultTableName</displayNameKey>
<displayName>defaultTableName</displayName>
</property>
<property id="19" name="defaultAudienceName" type="string_property" width="40">
<displayNameKey>defaultAudienceName</displayNameKey>
<displayName>defaultAudienceName</displayName>
</property>
</category>
</application>
</suite>
XML;
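// Recursively follows the child elements that carry the attribute $att and
// prints one XPath expression for every element that has no such children.
function genXpath($xml, $att, $current = null)
{
    // First call: start at the document element.
    if($current == null) $current = '/*';

    // Select the elements at the current level that have the attribute.
    $new = $current.'[@'.$att.']';
    $result = $xml->xpath($new);

    // Drop the trailing '*' so that a concrete element step can be appended.
    if($current[strlen($current) - 1] == '*')
    {
        $current = substr($current, 0, strlen($current) - 1);
    }
    if(count($result))
    {
        foreach($result as $node)
        {
            $prev = $current;
            $current .= $node->getName().'[@'.$att.'="'.$node->attributes()->$att.'"]/*';
            genXpath($xml, $att, $current);
            $current = $prev;
        }
    }
    else
    {
        // No further matches: remove the trailing '/' and print the finished path.
        $current = substr($current, 0, strlen($current) - 1);
        echo $current.'<br />';
    }
}
// how to use
$xml = new SimpleXMLElement($xml);
genXpath($xml, "name");
?>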
It outputs something like this:
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="httpPort"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="httpsPort"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="welcomePageURI"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="navigation"]/property[@name="serverURL"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="datafiltering"]/property[@name="defaultTableName"]
/suite[@name="SuiteName"]/application[@name="Manager"]/category[@name="datafiltering"]/property[@name="defaultAudienceName"]
I hope it helps. You can also choose which attribute name to use.
The function itself and its usage:
<?php
function genXpath($xml, $att, $current = null)
{
    if($current == null) $current = '/*';
    $new = $current.'[@'.$att.']';
    $result = $xml->xpath($new);
    if($current[strlen($current) - 1] == '*')
    {
        $current = substr($current, 0, strlen($current) - 1);
    }
    if(count($result))
    {
        foreach($result as $node)
        {
            $prev = $current;
            $current .= $node->getName().'[@'.$att.'="'.$node->attributes()->$att.'"]/*';
            genXpath($xml, $att, $current);
            $current = $prev;
        }
    }
    else
    {
        $current = substr($current, 0, strlen($current) - 1);
        echo $current.'<br />';
    }
}
// how to use
$xml = "your xml string"; // you can read it from a file
$xml = new SimpleXMLElement($xml);
genXpath($xml, "name");
The algorithm is what matters here; you can easily port it to any other programming language. All that is needed is XPath support, plus adapting how you read information from the result of the XPath query.
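As one possible illustration of such a port, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`, which supports the small XPath subset needed (`*[@name]`). It is not a line-by-line translation: instead of re-querying from the root with a growing absolute path, it queries each element's children directly. The function name `gen_xpaths` is hypothetical, and the sketch assumes the root element itself carries the attribute and that attribute values contain no double quotes. As in the PHP version, any element without attribute-bearing children counts as a leaf, so an empty category would also be listed.

```python
import xml.etree.ElementTree as ET


def gen_xpaths(node, att, prefix=""):
    """Yield a path for every element that has no children carrying `att`.

    Hypothetical port of the descent idea: follow only the children that
    have the attribute, and emit the path once none are left.
    """
    step = f'{prefix}/{node.tag}[@{att}="{node.get(att)}"]'
    children = node.findall(f"*[@{att}]")  # direct children that carry the attribute
    if not children:
        yield step
        return
    for child in children:
        yield from gen_xpaths(child, att, step)


# Small made-up sample with nested categories and a property directly
# under the application, as described in the question.
SAMPLE = (
    '<suite name="SuiteName"><application name="Manager">'
    '<category name="cat1"><category name="cat2">'
    '<property name="property1"><value>1</value></property>'
    '</category></category>'
    '<property name="property2"/>'
    '</application></suite>'
)

for path in gen_xpaths(ET.fromstring(SAMPLE), "name"):
    print(path)
```

The attribute name is a parameter here as well, so `gen_xpaths(root, "id")` would build the paths from `id` attributes instead.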
Upvotes: 1
Reputation: 34650
I would do something like this:
<xsl:transform version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <xsl:template match="/">
    <xsl:for-each select="//property">
      <xsl:call-template name="add-parent-xpath"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Recurse to the parent first, then emit the step for the current node -->
  <xsl:template name="add-parent-xpath">
    <xsl:if test="name(.) != 'suite'">
      <xsl:for-each select="..">
        <xsl:call-template name="add-parent-xpath" />
      </xsl:for-each>
    </xsl:if>
    <xsl:value-of select="concat('/', name(.), '[@name=&quot;', @name, '&quot;]')"/>
  </xsl:template>
</xsl:transform>
It starts by selecting each property node, and then recursively climbs all the way back up to the suite node. As the recursion unwinds, it emits the XPath step for each node, so you get the step for suite first, then the next level, and so on, all the way back down to the property.
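The climb-then-emit recursion described above can be sketched in Python as well. This is an illustrative approximation rather than a translation of the stylesheet: `ElementTree` elements have no parent pointer, so the sketch first builds a child-to-parent map, and it stops at the root element instead of testing for the name `suite`. The names `climb_xpath` and `property_xpaths` are hypothetical, and attribute values are assumed to contain no double quotes.

```python
import xml.etree.ElementTree as ET


def climb_xpath(node, parent_of):
    """Recurse up to the root first, then append one step per level on the way back."""
    parent = parent_of.get(node)
    prefix = climb_xpath(parent, parent_of) if parent is not None else ""
    return f'{prefix}/{node.tag}[@name="{node.get("name")}"]'


def property_xpaths(root):
    """Return the name-based path of every property element under `root`."""
    # ElementTree has no parent axis, so record each element's parent once.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    return [climb_xpath(prop, parent_of) for prop in root.iter("property")]


# Small made-up sample mirroring the structure from the question.
SAMPLE = (
    '<suite name="SuiteName"><application name="Manager">'
    '<category name="navigation">'
    '<property name="httpPort"><value>80</value></property>'
    '</category>'
    '<property name="property2"/>'
    '</application></suite>'
)

for path in property_xpaths(ET.fromstring(SAMPLE)):
    print(path)
```

Building the parent map costs one extra pass over the tree, which is usually acceptable when the document fits in memory.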
Upvotes: 0