Reputation: 13
I have a fairly simple XML document that I want to rearrange using xmlstarlet.
Example:
<myXml description="example 1">
<!-- Comment XXX -->
<randomNodeX>
<randomSubNode1>value1</randomSubNode1>
<randomSubNode2>value2</randomSubNode2>
</randomNodeX>
<!-- Comment YYY1 -->
<!-- Comment YYY2 -->
<randomNodeY attribute1="value3" attribute2="value4"/>
<!-- Comment ZZZ -->
<randomNodeZ attribute1="value5" attribute0="value6">
<randomSubNode3 attribute3="value7" attribute4="value8"/>
</randomNodeZ>
<!-- Comment for node1 first occurrence -->
<node1 attribute1="value9" attribute5="value10" attribute6="value11"/>
<!-- Comment for node2 first occurrence -->
<node2 attribute1="value12" attribute7="value13" attribute8="value14">
<subNode21 attributeX="value15"/>
<subNode22 attributeY="value16" attributeZ="value17"/>
</node2>
<!-- Comment for node3 first occurrence -->
<node3 attribute1="value18" attribute9="value19">
<subNode31 attributeW="value20"/>
</node3>
<!-- Comment for node1 second occurrence -->
<node1 attribute1="value21" attribute5="value22" attribute6="value23"/>
<!-- Comment for node3 second occurrence -->
<node3 attribute1="value24" attribute9="value25">
<subNode31 attributeW="value26"/>
</node3>
<!-- Comment for node2 second occurrence -->
<node2 attribute1="value27" attribute7="value28" attribute8="value29">
<subNode21 attributeX="value30"/>
<subNode22 attributeY="value31" attributeZ="value32"/>
</node2>
</myXml>
I want to rearrange the XML so that all node1, node2 and node3 elements are grouped together by name, each with its respective comment. I also want to keep the rest of the document, comments included, without having to know which tags are present. That is, there can be other tags in the XML apart from node1, node2 and node3, and I want to keep them (with their comments) at the beginning of the document.
Expected result:
<myXml description="example 1">
<!-- Comment XXX -->
<randomNodeX>
<randomSubNode1>value1</randomSubNode1>
<randomSubNode2>value2</randomSubNode2>
</randomNodeX>
<!-- Comment YYY1 -->
<!-- Comment YYY2 -->
<randomNodeY attribute1="value3" attribute2="value4"/>
<!-- Comment ZZZ -->
<randomNodeZ attribute1="value5" attribute0="value6">
<randomSubNode3 attribute3="value7" attribute4="value8"/>
</randomNodeZ>
<!-- Comment for node1 first occurrence -->
<node1 attribute1="value9" attribute5="value10" attribute6="value11"/>
<!-- Comment for node1 second occurrence -->
<node1 attribute1="value21" attribute5="value22" attribute6="value23"/>
<!-- Comment for node2 first occurrence -->
<node2 attribute1="value12" attribute7="value13" attribute8="value14">
<subNode21 attributeX="value15"/>
<subNode22 attributeY="value16" attributeZ="value17"/>
</node2>
<!-- Comment for node2 second occurrence -->
<node2 attribute1="value27" attribute7="value28" attribute8="value29">
<subNode21 attributeX="value30"/>
<subNode22 attributeY="value31" attributeZ="value32"/>
</node2>
<!-- Comment for node3 first occurrence -->
<node3 attribute1="value18" attribute9="value19">
<subNode31 attributeW="value20"/>
</node3>
<!-- Comment for node3 second occurrence -->
<node3 attribute1="value24" attribute9="value25">
<subNode31 attributeW="value26"/>
</node3>
</myXml>
For now I have done it using this stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()[not(self::node1|self::node2|self::node3|self::comment())]"/>
<xsl:apply-templates select="node1"/>
<xsl:apply-templates select="node2"/>
<xsl:apply-templates select="node3"/>
</xsl:copy>
</xsl:template>
<xsl:template match="randomNodeX|randomNodeY|randomNodeZ|node1|node2|node3">
<xsl:apply-templates select="preceding-sibling::comment()[generate-id(following-sibling::*[1])=generate-id(current())]"/>
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
The problem is that I have to list every random tag (randomNodeX, randomNodeY, ...) that is present in the XML.
Is there a way to do this without knowing which tags are present apart from node1, node2 and node3?
Upvotes: 1
Views: 87
Reputation: 117073
I'd do it this way:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="my-comments" match="comment()" use="generate-id(following-sibling::*[1])" />
<xsl:template match="/myXml">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="*[not(self::node1 or self::node2 or self::node3)]"/>
<xsl:apply-templates select="node1"/>
<xsl:apply-templates select="node2"/>
<xsl:apply-templates select="node3"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:copy-of select="key('my-comments', generate-id())"/>
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
In XSLT 2.0 this could be streamlined to:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="my-comments" match="comment()" use="generate-id(following-sibling::*[1])" />
<xsl:template match="/myXml">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:variable name="my-nodes" select="node1, node2, node3" />
<xsl:apply-templates select="* except $my-nodes, $my-nodes"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:copy-of select="key('my-comments', generate-id())"/>
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
Upvotes: 2
Reputation: 163458
XSLT 2.0+ solution:
I would start with a grouping operation that groups elements with their "associated" comments, so that
<!-- Comment ZZZ -->
<randomNodeZ attribute1="value5" attribute0="value6">
<randomSubNode3 attribute3="value7" attribute4="value8"/>
</randomNodeZ>
becomes
<group>
<!-- Comment ZZZ -->
<randomNodeZ attribute1="value5" attribute0="value6">
<randomSubNode3 attribute3="value7" attribute4="value8"/>
</randomNodeZ>
</group>
and then in phase 2, group the groups according to the contained element name (dropping the group wrapper at the same time).
In your example every element is preceded by one or more "associated" comments, but can we rely on that always being the case? To be a bit more tolerant of input variations, we could say that a group starts on any comment or element that isn't immediately preceded by a comment. If we assume whitespace has been stripped using xsl:strip-space, we can do the first grouping with
<xsl:for-each-group select="child::node()"
group-starting-with="comment()[not(preceding-sibling::node()[1][self::comment()])] | *[not(preceding-sibling::node()[1][self::comment()])]">
<group><xsl:copy-of select="current-group()"/></group>
</xsl:for-each-group>
and the second is
<xsl:for-each-group select="group" group-by="name(*[1])">
<xsl:copy-of select="current-group()/child::node()"/>
</xsl:for-each-group>
but you might want to re-inject some whitespace.
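For reference, here is a minimal sketch of how the two phases might be combined into one stylesheet. It assumes the phase 1 result is held in a temporary tree (the variable name groups is my own choice), that the root attributes should be copied as-is, and that you run it with an XSLT 2.0 processor such as Saxon; xmlstarlet is built on libxslt, which only implements XSLT 1.0. Note that the group-starting-with pattern tests preceding-sibling::node()[1] rather than preceding-sibling::*[1], so that a comment can actually be seen as the immediate predecessor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/*">
    <!-- Phase 1: wrap each element together with the comments directly before it.
         The variable name "groups" is an assumption made for this sketch. -->
    <xsl:variable name="groups">
      <xsl:for-each-group select="node()"
          group-starting-with="comment()[not(preceding-sibling::node()[1][self::comment()])]
                             | *[not(preceding-sibling::node()[1][self::comment()])]">
        <group><xsl:copy-of select="current-group()"/></group>
      </xsl:for-each-group>
    </xsl:variable>

    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Phase 2: regroup the wrappers by the name of the element they contain,
           emitting only their content so the wrapper itself disappears. -->
      <xsl:for-each-group select="$groups/group" group-by="name(*[1])">
        <xsl:copy-of select="current-group()/node()"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Groups come out in order of first appearance of each name, which gives the expected result for the sample input because the random elements all precede the first node1. If a random element could appear after the first node1, or if two non-adjacent random elements shared a name, the output order would differ from what the question asks for.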
Upvotes: 0
Reputation: 167696
It is kind of an except operation, in XPath 2 available as the except operator, in XPath 1 in XSLT 1 perhaps expressed as
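<xsl:template match="/*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:variable name="nodes" select="node1 | node2 | node3"/>
<xsl:variable name="trailers" select="$nodes | $nodes/preceding-sibling::comment()[1]"/>
<xsl:apply-templates select="node()[count(. | $trailers) > count($trailers)]"/>
<!-- one call per name: a single select="$trailers" would be processed in document order -->
<xsl:apply-templates select="node1 | node1/preceding-sibling::comment()[1]"/>
<xsl:apply-templates select="node2 | node2/preceding-sibling::comment()[1]"/>
<xsl:apply-templates select="node3 | node3/preceding-sibling::comment()[1]"/>
</xsl:copy>
</xsl:template>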
This assumes all your node1, node2 and node3 elements have exactly one preceding sibling comment node.
I am not quite sure, however, why you didn't just use match="/*/*" instead of match="randomNodeX|randomNodeY|randomNodeZ|node1|node2|node3".
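To make that last suggestion concrete, here is a sketch of the stylesheet from the question with only the match pattern of the second template changed to /*/*. Everything else is copied from the question unchanged. It stays within XSLT 1.0, so it should be usable with xmlstarlet's tr command.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity-style template: copies the root element, its attributes and its
       "other" children first, then node1, node2 and node3 in that order. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()[not(self::node1|self::node2|self::node3|self::comment())]"/>
      <xsl:apply-templates select="node1"/>
      <xsl:apply-templates select="node2"/>
      <xsl:apply-templates select="node3"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element child of the root element, whatever its name: emit the
       comments whose next element sibling is this element, then the element. -->
  <xsl:template match="/*/*">
    <xsl:apply-templates select="preceding-sibling::comment()[generate-id(following-sibling::*[1])=generate-id(current())]"/>
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

The pattern /*/* has a higher default priority (0.5) than node() (-0.5), so the second template wins for every child element of the root without any explicit priority attribute.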
Upvotes: 0