Reputation: 928
I need to write an XML handling routine that removes certain tags from an XML document passed to it. The document's structure is not fixed, but it is known not to use any namespaces.
The routine will have two lists of XML tag names: an include list and an exclude list.
Exclude is dominant, i.e. if the same tag is in both lists, that tag should not be picked. If a parent tag is excluded, its child tags should be excluded as well.
I have seen great examples and answers across the web, but haven't found a fully working solution in a single XSLT for my issue.
This solution seems very clear and reasonable, but would it be possible to have a "BlackList" in the same XSLT as well? XSLT - How to keep only wanted elements from XML
EDIT: The exclude and include lists are independent of each other, i.e. the exclude list does not contain all tags that are missing from the include list, and vice versa.
EDIT2: Simplified process needed: XML -> remove excluded tags -> remove everything other than included tags.
EDIT3: Fixed link.
EDIT4: Venn Diagrams with all use cases (A section is always wanted):
Upvotes: 2
Views: 1679
Reputation: 117018
--- answer modified due to clarifications ---
The following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://example.com/my">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <my:whitelist>
    <item>America</item>
    <item>USA</item>
    <item>California</item>
    <item>LosAngeles</item>
    <item>SanFranciso</item>
    <item>Mexico</item>
    <item>Tijuana</item>
  </my:whitelist>

  <my:blacklist>
    <item>Mexico</item>
  </my:blacklist>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- drop any element that is not whitelisted, or that is blacklisted, together with its subtree -->
  <xsl:template match="*[not(name()=document('')/*/my:whitelist/item) or name()=document('')/*/my:blacklist/item]"/>

</xsl:stylesheet>
applied to the following input:
XML
<America>
  <USA>
    <NewYork>
      <NewYork>no</NewYork>
      <Albany>yes</Albany>
    </NewYork>
    <California>
      <LosAngeles>no</LosAngeles>
      <SanFranciso>no</SanFranciso>
    </California>
  </USA>
  <Canada>
    <Vancouver>no</Vancouver>
    <Montreal>yes</Montreal>
  </Canada>
  <Mexico>
    <Tijuana>no</Tijuana>
  </Mexico>
</America>
will return:
<?xml version="1.0" encoding="UTF-8"?>
<America>
  <USA>
    <California>
      <LosAngeles>no</LosAngeles>
      <SanFranciso>no</SanFranciso>
    </California>
  </USA>
</America>
Of course, this only makes sense if the two lists are allowed to overlap, i.e. when the blacklist overrides the whitelist.
If your processor cannot resolve the document() function referring back to the stylesheet itself, try the following alternative:
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:variable name="whitelist">
    <item>America</item>
    <item>USA</item>
    <item>California</item>
    <item>LosAngeles</item>
    <item>SanFranciso</item>
    <item>Mexico</item>
    <item>Tijuana</item>
  </xsl:variable>

  <xsl:variable name="blacklist">
    <item>Mexico</item>
  </xsl:variable>

  <!-- copy an element only if it is whitelisted and not blacklisted; otherwise drop it and its subtree -->
  <xsl:template match="*">
    <xsl:if test="name()=exsl:node-set($whitelist)/item and not(name()=exsl:node-set($blacklist)/item)">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
Demo: http://xsltransform.net/3NSSEvk
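To sanity-check the expected result outside an XSLT processor, the rule both stylesheets apply can be sketched in Python using only the standard library. An element is kept only if its name is whitelisted and not blacklisted. Otherwise it is dropped along with its whole subtree. This is a rough illustration, not a replacement for the stylesheets: the names WHITELIST, BLACKLIST, is_kept and filter_tree are made up for this sketch, and it assumes element-only content, so text that follows a child element (mixed content) is not carried over.

```python
import xml.etree.ElementTree as ET

# Hypothetical names for this sketch; the lists mirror the ones in the stylesheets.
WHITELIST = {"America", "USA", "California", "LosAngeles",
             "SanFranciso", "Mexico", "Tijuana"}
BLACKLIST = {"Mexico"}

SOURCE = """
<America>
  <USA>
    <NewYork><NewYork>no</NewYork><Albany>yes</Albany></NewYork>
    <California><LosAngeles>no</LosAngeles><SanFranciso>no</SanFranciso></California>
  </USA>
  <Canada><Vancouver>no</Vancouver><Montreal>yes</Montreal></Canada>
  <Mexico><Tijuana>no</Tijuana></Mexico>
</America>
"""


def is_kept(tag):
    """An element survives only if it is whitelisted and not blacklisted."""
    return tag in WHITELIST and tag not in BLACKLIST


def filter_tree(elem):
    """Return a filtered copy of elem, or None if elem itself is dropped.

    Dropping an element drops its entire subtree, so a whitelisted child
    of a removed parent (Tijuana under Mexico) does not survive either.
    """
    if not is_kept(elem.tag):
        return None
    copy = ET.Element(elem.tag, dict(elem.attrib))
    # Skip whitespace-only text, roughly what xsl:strip-space does.
    if elem.text and elem.text.strip():
        copy.text = elem.text
    for child in elem:
        kept = filter_tree(child)
        if kept is not None:
            copy.append(kept)
    return copy


if __name__ == "__main__":
    result = filter_tree(ET.fromstring(SOURCE.strip()))
    print(ET.tostring(result, encoding="unicode"))
```

Run against the sample input, this keeps only America, USA, California, LosAngeles and SanFranciso, which matches the result shown above.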
Upvotes: 2