Reputation: 169
I stumbled upon a problem again today. I have an XML document with 1000 elements named book. Every element has its own attribute, but some attribute values are duplicated.
So I have this XML:
... some other not duplicated attribute data ...
<book attribute="attr1"></book>
<book attribute="attr1"></book>
<book attribute="attr1"></book>
... some other not duplicated attribute data ...
<book attribute="attr2"></book>
<book attribute="attr2"></book>
<book attribute="attr2"></book>
... some other not duplicated attribute data ...
Is there a way with XSLT to rename the attribute values that occur more than once in the XML, like this:
... some other not duplicated attribute data...
<book attribute="attr1-1"></book>
<book attribute="attr1-2"></book>
<book attribute="attr1-3"></book>
... some other not duplicated attribute data ...
<book attribute="attr2-1"></book>
<book attribute="attr2-2"></book>
<book attribute="attr2-3"></book>
... some other not duplicated attribute data ...
I hope this is possible with XSLT, and that non-duplicated attributes stay the same. Thanks a lot for all the answers, eoglasi
Upvotes: 2
Views: 710
Reputation: 22617
Input XML:
<?xml version="1.0" encoding="utf-8"?>
<test>
  <book attribute="attr1"></book>
  <book attribute="attr1"></book>
  <book attribute="attr1"></book>
  <book attribute="attr2"></book>
  <book attribute="attr2"></book>
  <book attribute="attr2"></book>
  <book attribute="attr5"></book>
</test>
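The following XSLT 2.0 stylesheet should do the job. It groups the book elements by the value of their attribute (which is called "attribute") and checks whether a group consists of only 1 item (i.e. whether the attribute value is unique). Unique values are written out unchanged; in every other group, each book gets the group's value followed by its position within the group.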
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="test">
    <xsl:copy>
      <xsl:for-each-group select="book" group-by="@attribute">
        <xsl:choose>
          <xsl:when test="count(current-group()) = 1">
            <xsl:element name="book">
              <xsl:attribute name="attribute">
                <xsl:value-of select="@attribute"/>
              </xsl:attribute>
            </xsl:element>
          </xsl:when>
          <xsl:otherwise>
            <xsl:for-each select="current-group()">
              <xsl:element name="book">
                <xsl:attribute name="attribute">
                  <xsl:value-of select="concat(current-grouping-key(), '-', position())"/>
                </xsl:attribute>
              </xsl:element>
            </xsl:for-each>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
You get the following output (I have included 1 unique attribute value in the input file):
<test>
  <book attribute="attr1-1"/>
  <book attribute="attr1-2"/>
  <book attribute="attr1-3"/>
  <book attribute="attr2-1"/>
  <book attribute="attr2-2"/>
  <book attribute="attr2-3"/>
  <book attribute="attr5"/>
</test>
EDIT: Note that this will reorder non-adjacent book elements with the same attribute value.
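If the original order matters, one possible way around the reordering is to number each duplicate in place instead of grouping. The following is only a sketch under a few assumptions: it needs an XSLT 2.0 processor, and the key name bookByAttribute and the variable name book are chosen here for illustration. Every node is copied as-is, except that a duplicated attribute value gets a suffix equal to one plus the number of earlier books (in document order) carrying the same value.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index book elements by the value of their "attribute" attribute
       (illustrative key name) -->
  <xsl:key name="bookByAttribute" match="book" use="@attribute"/>

  <!-- Identity template: copy everything that has no more specific rule -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Only matches attributes whose value is shared by more than one book -->
  <xsl:template match="book/@attribute[count(key('bookByAttribute', .)) gt 1]">
    <!-- The book element that owns the current attribute -->
    <xsl:variable name="book" select=".."/>
    <!-- Suffix = number of same-valued books that precede this one, plus 1 -->
    <xsl:attribute name="attribute"
        select="concat(., '-', count(key('bookByAttribute', .)[. &lt;&lt; $book]) + 1)"/>
  </xsl:template>
</xsl:stylesheet>
```

For the input file above this should produce the same attribute values as the grouping stylesheet, but book elements stay where they were, and any other content of the document is copied through unchanged.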
Upvotes: 1
Reputation: 122394
One way to check whether an attribute is a duplicate is to define a key to look up book elements by their attribute value and then have special handling for the case where the key lookup gives you more than one result:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:key name="bookByAttribute" match="book" use="@attribute" />
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()" /></xsl:copy>
  </xsl:template>
  <xsl:template match="book/@attribute[key('bookByAttribute', .)[2]]">
    <xsl:attribute name="attribute">
      <!-- logic to create a de-duplicated value -->
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
Books whose attribute is not repeated will not be affected by this template. The simplest way to generate de-duplicated values would be to use generate-id() directly, as Vincent suggests, but if you really need sequential numbers (and you can guarantee that this won't itself cause duplication, e.g. if the original document already has both foo and foo-1) then you could use a trick like:
<xsl:template match="book/@attribute[key('bookByAttribute', .)[2]]">
  <xsl:variable name="myId" select="generate-id(..)" />
  <xsl:attribute name="attribute">
    <xsl:value-of select="." />
    <xsl:text>-</xsl:text>
    <xsl:for-each select="key('bookByAttribute', .)">
      <xsl:if test="generate-id() = $myId">
        <xsl:value-of select="position()" />
      </xsl:if>
    </xsl:for-each>
  </xsl:attribute>
</xsl:template>
The for-each is essentially finding the position in document order of the current book within the set of nodes that share the same attribute value.
Upvotes: 3
Reputation: 2998
If you're not bound to a specific pattern for your attribute values, there's a dedicated function that creates a unique ID for each node in the input file: generate-id. In your case, you may use it like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="book/@attribute">
    <xsl:attribute name="attribute">
      <xsl:value-of select="concat(., '-', generate-id())"/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
For this XML:
<test>
  ... some other not duplicated attribute data ...
  <book attribute="attr1"></book>
  <book attribute="attr1"></book>
  <book attribute="attr1"></book>
  ... some other not duplicated attribute data ...
  <book attribute="attr2"></book>
  <book attribute="attr2"></book>
  <book attribute="attr2"></book>
  ... some other not duplicated attribute data ...
</test>
You get something like:
<test>
  ... some other not duplicated attribute data ...
  <book attribute="attr1-d0e3_a0"/>
  <book attribute="attr1-d0e5_a1"/>
  <book attribute="attr1-d0e7_a2"/>
  ... some other not duplicated attribute data ...
  <book attribute="attr2-d0e9_a3"/>
  <book attribute="attr2-d0e11_a4"/>
  <book attribute="attr2-d0e13_a5"/>
  ... some other not duplicated attribute data ...
</test>
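Note that the template above appends an ID to every book attribute, including values that occur only once. If unique values have to stay untouched, as the question asks, a possible variation is to restrict the match to duplicated values with a key. This is a sketch only; the key name bookByAttribute is an illustrative choice, and the exact IDs produced by generate-id() depend on the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <!-- Index book elements by the value of their "attribute" attribute
       (illustrative key name) -->
  <xsl:key name="bookByAttribute" match="book" use="@attribute"/>

  <!-- Identity template: copy everything that has no more specific rule -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Matches only when the key lookup returns a second book,
       i.e. when the value is used by more than one book -->
  <xsl:template match="book/@attribute[key('bookByAttribute', .)[2]]">
    <xsl:attribute name="attribute">
      <!-- generate-id() here refers to the attribute node itself -->
      <xsl:value-of select="concat(., '-', generate-id())"/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
```

With this variation, a book whose value appears only once falls through to the identity template and keeps its original attribute.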
Upvotes: 1