Bernard Vander Beken
Bernard Vander Beken

Reputation: 5056

Grouping and totaling a sub-element of repeating elements in XML

How can I group and sum the input XML so duplicate Item elements are merged?

Sample input XML

<Inventory>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>37494</ItemID>
    <Color>11</Color>
    <MinQty>1</MinQty>
  </Item>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>50254</ItemID>
    <Color>11</Color>
    <MinQty>4</MinQty>
  </Item>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>37494</ItemID>
    <Color>11</Color>
    <MinQty>2</MinQty>
  </Item>
</Inventory>

Expected output XML

<Inventory>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>37494</ItemID>
    <Color>11</Color>
    <MinQty>3</MinQty>
  </Item>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>50254</ItemID>
    <Color>11</Color>
    <MinQty>4</MinQty>
  </Item>
</Inventory>

Current idea is based on this answer, using Group-Object. However I don't see how to combine this with the summing of a sub-element. Remove Duplicate XML Node Groups in Powershell

Upvotes: 1

Views: 101

Answers (2)

mklement0
mklement0

Reputation: 439228

A Group-Object solution that modifies the XML document in place:

# Parse the XML file into a DOM.
# Replace "sample.xml" with your XML file path.
($xmlDoc = [xml]::new()).Load((Convert-Path sample.xml))

# Loop over all <Item> elements.
$xmlDoc.Inventory.Item | 
  Group-Object ItemType, ItemId, Color | 
  Where-Object Count -gt 1 | 
  ForEach-Object {
    # Sum up all MinQty values and assign the sum to the first element
    # in the group.
    $_.Group[0].MinQty = [string] ($_.Group.MinQty | Measure-Object -Sum).Sum
    # Now remove all other elements in the group.
    $null = $_.Group[1..($_.Count-1)].ForEach({ $_.ParentNode.RemoveChild($_) })
  }

Note:

  • Casting the .Sum value to [string] explicitly is only required in Windows PowerShell, where only an actual string ([string] instance) is accepted on assignment to an XML element.
    PowerShell (Core) 7+, more conveniently performs stringification on demand.

  • A simple example:

     # OK in PS 7+: integer 2 is automatically converted to string.
     # In WinPS, you get the following error:
     #   Cannot set "el" because only strings can be used as 
     #   values to set XmlNode properties.'
     ($x = [xml] '<el>1</el>').el = 2 
    
     # Required in Windows PowerShell: RHS must be an actual string.
     ($x = [xml] '<el>1</el>').el = [string] 2 # simple alternative here: '2' 
    

Printing the XML text of the updated DOM, as shown below, matches the expected result in your question:

if ($IsCoreCLR) { # PowerShell (Core) 7+
  # Pretty-print
  ([System.Xml.Linq.XDocument] $xmlDoc.OuterXml).ToString()
} else { # Windows PowerShell 
  # No pretty-printing easily available, just print .OuterXml
  $xmlDoc.OuterXml
}

Upvotes: 4

Yitzhak Khabinsky
Yitzhak Khabinsky

Reputation: 22275

XSLT based solution via grouping.

Grouping is using the Muenchian method.

Powershell

$xslt = New-Object System.Xml.Xsl.XslCompiledTransform;
$xslt.load("e:\Temp\Powershell\Process.xslt")
$xslt.Transform("e:\Temp\Powershell\Input.xml", "e:\Temp\Powershell\Output.xml")

Input XML

<?xml version="1.0"?>
<Inventory>
    <Item>
        <ItemType>P</ItemType>
        <ItemID>37494</ItemID>
        <Color>11</Color>
        <MinQty>1</MinQty>
    </Item>
    <Item>
        <ItemType>P</ItemType>
        <ItemID>50254</ItemID>
        <Color>11</Color>
        <MinQty>4</MinQty>
    </Item>
    <Item>
        <ItemType>P</ItemType>
        <ItemID>37494</ItemID>
        <Color>11</Color>
        <MinQty>2</MinQty>
    </Item>
</Inventory>

XSLT

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <xsl:key name="ItemKey" match="Item" use="concat(ItemType, '||', ItemID, '||', Color)"/>

    <xsl:template match="Inventory">
        <xsl:copy>
            <xsl:for-each select="Item[generate-id(.) = generate-id(key('ItemKey', concat(ItemType, '||', ItemID, '||', Color))[1])]">
                <xsl:sort select="ItemID" data-type="text" order="ascending"/>
                <xsl:copy>
                    <xsl:copy-of select="ItemType"/>
                    <xsl:copy-of select="ItemID"/>
                    <xsl:copy-of select="Color"/>
                    <MinQty><xsl:value-of select="sum(key('ItemKey', concat(ItemType, '||', ItemID, '||', Color))/MinQty)"/></MinQty>
                </xsl:copy>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Output XML

<?xml version='1.0' ?>
<Inventory>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>37494</ItemID>
    <Color>11</Color>
    <MinQty>3</MinQty>
  </Item>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>50254</ItemID>
    <Color>11</Color>
    <MinQty>4</MinQty>
  </Item>
</Inventory>

Upvotes: 3

Related Questions