KM6895
KM6895

Reputation: 89

Group Child Nodes based on Specific Condition

I have been having trouble transforming below xml using XSLT 2.0. Could anyone help me resolve this please? Thanks

Input XML

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <Row>
        <EmployeeID>88888</EmployeeID>
        <Remote/>
        <StartDate>2014-10-19</StartDate>
        <EndDate>2021-10-08</EndDate>
        <Amount>100</Amount>
        <BasePay>760</BasePay>
        <Bonus>340</Bonus>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>250</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate/>
        <EndDate></EndDate>
        <Amount></Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate/>
        <EndDate>2020-10-10</EndDate>
        <Amount>2400</Amount>
        <BasePay>1500</BasePay>
        <Bonus/>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate>2018-10-24</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>2400</Amount>
        <BasePay>1500</BasePay>
        <Bonus/>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>350</Amount>
        <BasePay>1600</BasePay>
        <Bonus>170</Bonus>
    </Row>
    <Row>
        <EmployeeID>65432</EmployeeID>
        <Remote>Y</Remote>
        <StartDate/>
        <EndDate></EndDate>
        <Amount></Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>65432</EmployeeID>
        <Remote>Y</Remote>
        <StartDate>2021-10-25</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>8400</Amount>
        <BasePay>1500</BasePay>
        <Bonus/>
    </Row>
</Root>

Desired Output

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <Row>
        <EmployeeID>88888</EmployeeID>
        <Remote/>
        <StartDate>2014-10-19</StartDate>
        <EndDate>2021-10-08</EndDate>
        <Amount>100</Amount>
        <BasePay>760</BasePay>
        <Bonus>340</Bonus>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>250</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate>2018-10-24</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>2400</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>  
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>350</Amount>
        <BasePay>1600</BasePay>
        <Bonus>170</Bonus>
    </Row>
    <Row>
        <EmployeeID>65432</EmployeeID>
        <Remote>Y</Remote>
        <StartDate>2021-10-25</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>8400</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>    
</Root>

Grouping needs to be based on <Remote> element with value either Y or N.

For e.g. Employee 23456 appears thrice in the input xml with <Remote>set to N – Result xml is expected to have Employee 23456 only once but expected to pick up values for StartDate(i.e 2018-10-24) EndDate (i.e 2020-10-10) Amount(i.e. 2400)

For e.g. Employee 65432 appears twice in the input xml with value <Remote> set to Y – Result xml is expected to have Employee 65432 only once but expected to pick up values for StartDate (i.e 2021-10-25) EndDate (i.e 2020-10-10) Amount (8400)

Employee 12345 appears twice in the input xml with blank value for <Remote> - Result xml is expected to have Employee 12345 twice because <BasePay> and <Bonus> amounts are different.

Below is my attempt

<?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
        <xsl:output method="xml" indent="yes"></xsl:output>
        
       <xsl:template match="@*|node()">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:template>
        
      <xsl:template match="/Root">
          <xsl:copy>
              <xsl:for-each-group select="Row" group-by="EmployeeID/[Mobile='Y' or 'N']">
                <Row>
                    <EmployeeID>
                        <xsl:value-of select="EmployeeID"/>
                    </EmployeeID>
                    <Remote>
                        <xsl:value-of select="Remote"/>
                    </Remote>
                    <StartDate>
                        <xsl:value-of select="StartDate"/>
                    </StartDate>
                    <EndDate>
                        <xsl:value-of select="EndDate"/>
                    </EndDate>
                    <Amount>
                        <xsl:value-of select="Amount"/>
                    </Amount>
                    <BasePay>
                        <xsl:value-of select="BasePay"/>
                    </BasePay>
                    <Bonus>
                        <xsl:value-of select="Bonus"/>
                    </Bonus>                    
                </Row>
            </xsl:for-each-group>
          </xsl:copy>
        </xsl:template>
    </xsl:stylesheet>

Upvotes: 0

Views: 108

Answers (1)

Martin Honnen
Martin Honnen

Reputation: 167716

Perhaps (using a composite grouping key as possible in XSLT 3)

  <xsl:template match="/Root">
      <xsl:copy>
          <xsl:for-each-group select="Row" composite="yes" group-by="EmployeeID, Remote = ('Y', 'N')">
            <xsl:choose>
              <xsl:when test="current-grouping-key()[2]">
                <xsl:apply-templates select="."/>
              </xsl:when>
              <xsl:otherwise>
                <xsl:apply-templates select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each-group>
      </xsl:copy>
    </xsl:template>

Or using a composite key and then a match pattern that removes/strips the duplicates is easier to preserve the original order:

  <xsl:key name="group" match="Root/Row" composite="yes" use="EmployeeID, Remote = ('Y', 'N')"/>

  <xsl:template match="Root/Row[. intersect key('group', (EmployeeID, true()))[position() > 1]]"/>
  
  <xsl:output method="xml" indent="yes"/>

  <xsl:mode on-no-match="shallow-copy"/>

As for your comment about empty children, perhaps

  <xsl:template match="Root/Row[. intersect key('group', (EmployeeID, true()))[1]]">
    <xsl:copy>
      <xsl:for-each-group select="key('group', (EmployeeID, true()))/*[node()]" group-by="node-name()">
        <xsl:apply-templates select="."/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

ensures the non-empty child elements in a group of Rows are output instead of the child elements of the first item in a group.

An online sample is here.

Upvotes: 1

Related Questions