Hugh Guiney

Reputation: 1369

XQuery: Merge Nodes of Same Name

How would I take all element nodes of the same name, and combine them together into one that retains the child elements of each?

Example input:

<topic>
  <title />
  <language />
  <more-info>
    <itunes />
  </more-info>
  <more-info>
    <imdb />
  </more-info>
  <more-info>
    <netflix />
  </more-info>
</topic>

Example output (all of the more-infos are collapsed into a single element):

<topic>
  <title />
  <language />
  <more-info>
    <itunes />
    <imdb />
    <netflix />
  </more-info>
</topic>

Edit: I am looking for a way to do this without knowing in advance which element names repeat. So, with the example above, I could not use a script that only targets more-info, as there may be other elements that need the same process applied to them.

Upvotes: 0

Views: 1596

Answers (3)

programaths

Reputation: 891

I came up with this:

for $n in $nodes/element()
let $lname := local-name($n)
group by $lname
return element {$lname} {
  $n/node()
}

Here $nodes is bound to the root element of the input document (topic in the example).

It uses a group by clause, which rebinds the $n variable to the sequence of all nodes that share the same local name. So the expression $n/node() yields the children of every node in the group.
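
To see that rebinding in isolation, here is a minimal sketch. The $items sequence is made up for illustration and is not part of the question's data; group by needs an XQuery 3.0 processor.

```xquery
xquery version "3.0";

(: $items is a hypothetical sequence, used only to show the rebinding. :)
let $items := (<a>1</a>, <b>2</b>, <a>3</a>)
for $n in $items
let $lname := local-name($n)
group by $lname
order by $lname
(: After "group by", $n holds every element of the group, not a single one. :)
return element {$lname} { count($n) }
```

This should return <a>2</a> followed by <b>1</b>: one element per distinct name, with the count showing how many original elements ended up in $n.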

To make it recursive, we have to declare a function and call it:

declare function local:recurse($node){
  for $n in $node/text() return $n,
  for $n in $node/element()
  let $lname := local-name($n)
  group by $lname
  return element {$lname} {
    for $m in $n return local:recurse($m)
  }
};

local:recurse($nodes)

The first line of the function body ends with a comma, which is sequence concatenation. So we output the text nodes first, then the element nodes, merged with the group by trick explained above.

XML Input

<topic>
  <title>Test</title>
  <language />
  <more-info>
    <itunes>
      <playlist>
        <item>2</item>
      </playlist>
      <playlist>
        <item>3</item>
      </playlist>
    </itunes>
  </more-info>
  <more-info>
    <imdb>Imdb info</imdb>
  </more-info>
  <more-info>
    <netflix>Netflix info</netflix>
  </more-info>
</topic>

XML Output

<title>Test</title>
<language/>
<more-info>
  <itunes>
    <playlist>
      <item>2</item>
      <item>3</item>
    </playlist>
  </itunes>
  <imdb>Imdb info</imdb>
  <netflix>Netflix info</netflix>
</more-info>
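
The output has no topic wrapper because local:recurse returns only the merged content of the node it is given. If the wrapper is wanted, as in the question's expected output, one possible sketch is to rebuild the root around the call. The $input variable is a stand-in I made up for the question's document, and the function is repeated (with type annotations added) only so the snippet is self-contained.

```xquery
xquery version "3.0";

declare function local:recurse($node as element()) as node()* {
  (: text nodes first, then child elements merged by local name :)
  $node/text(),
  for $n in $node/element()
  let $lname := local-name($n)
  group by $lname
  return element {$lname} {
    for $m in $n return local:recurse($m)
  }
};

(: $input is a hypothetical stand-in for the question's document. :)
let $input :=
  <topic>
    <title/>
    <language/>
    <more-info><itunes/></more-info>
    <more-info><imdb/></more-info>
    <more-info><netflix/></more-info>
  </topic>
return
  (: Rebuild the root, copy its attributes, then add the merged children. :)
  element { node-name($input) } {
    $input/@*,
    local:recurse($input)
  }
```

This should give a single topic element containing one more-info that holds itunes, imdb and netflix. Two caveats: attributes on nested elements are still dropped by the recursion, and the XQuery 3.0 spec leaves the order of groups implementation-dependent, so add an order by if the order of the merged elements matters.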

Remarks

I have no idea why XSLT is considered easier. Maybe apply-templates hides the recursion, which makes it less intimidating.

Also, in XSLT the matching is declared outside the "loop", which makes it easier (although it then has to be paired with mode for full control), whereas XQuery requires it inside the "loop".

Anyway, for this particular example, XQuery seems very appropriate.

Upvotes: 0

Daniel Haley

Reputation: 52868

This seems like a better job for XSLT if you can use it.

XML Input

<topic>
    <title />
    <language />
    <more-info>
        <itunes />
    </more-info>
    <more-info>
        <imdb />
    </more-info>
    <more-info>
        <netflix />
    </more-info>
</topic>

XSLT 2.0

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/*">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:for-each-group select="*" group-by="name()">
                <xsl:copy>
                    <xsl:apply-templates select="current-group()/@*"/>
                    <xsl:apply-templates select="current-group()/*"/>
                </xsl:copy>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

XML Output

<topic>
   <title/>
   <language/>
   <more-info>
      <itunes/>
      <imdb/>
      <netflix/>
   </more-info>
</topic>

Upvotes: 0

Dimitre Novatchev

Reputation: 243489

Use:

declare option saxon:output "omit-xml-declaration=yes";
<topic>
  <title />
  <language />
  <more-info>
   {for $inf in /*/more-info/node()
     return $inf
   }
  </more-info>
</topic>

When this XQuery is applied to the provided XML document:

<topic>
  <title />
  <language />
  <more-info>
    <itunes />
  </more-info>
  <more-info>
    <imdb />
  </more-info>
  <more-info>
    <netflix />
  </more-info>
</topic>

the wanted, correct result is produced:

<topic>
   <title/>
   <language/>
   <more-info>
      <itunes/>
      <imdb/>
      <netflix/>
   </more-info>
</topic>

Upvotes: 1
