jrpartridge

Reputation: 13

XSLT Rename & Add Elements

Purpose

I am trying to write a stylesheet that converts instance documents conforming to version 1 of the schema into version 2 of the schema. There are about 300 elements, so I don't want to write a separate template for each one. The vast majority of the differences between the versions are standard renames, such as removing prefixes or dropping the underscore between a word ending in a lowercase letter and the adjacent word. However, I also need to accommodate adding and deleting elements.

Renaming Rules

  1. Remove all underscore characters UNLESS they separate adjacent uppercase letters.
  2. Remove all instances of '_of_' and '_to_Value'.
  3. Replace all instances of '_or_' with '_Or_'.
  4. Remove all instances of 'Header_' EXCEPT for the 'Header_' preceding 'Header_Parties' and 'Header_Party'. They should end up as 'HeaderParties' and 'HeaderParty'.

Elements to Add

  1. <SentSequence> after <StatusCode>

Elements to Delete

  1. <Line_Order>

Source XML

<?xml version="1.0" encoding="UTF-8"?>
<Entries>
    <Entry>
        <Entry_Number>10158271304</Entry_Number>
        <CHB_File>63475017024503000</CHB_File>
        <Traffic_File>1017271467</Traffic_File>
        <Status_Code>A</Status_Code>
        <Header_Country_of_Origin>VN</Header_Country_of_Origin>
        <Importer_or_Owner>Owner</Importer_or_Owner>
        <Entry_Total_Additions_to_Value>.00</Entry_Total_Additions_to_Value>
        <Entry_IRS_Excise_Tax>.00</Entry_IRS_Excise_Tax>
        <Entry_AD_Duties>.00</Entry_AD_Duties>
        <Entry_CV_Duties>.00</Entry_CV_Duties>
        <Header_Parties>
            <Header_Party>
                <Header_Party_Type>Importer</Header_Party_Type>
            </Header_Party>
        </Header_Parties>
        <Invoices>
            <Invoice>
                <Invoice_Order>1</Invoice_Order>
                <Invoice_Lines>
                    <Invoice_Line>
                        <Line_Order>1</Line_Order>
                        <Line_Quantity>685</Line_Quantity>
                        <Entry_Lines>
                            <Entry_Line>
                                <CBP7501_Line>1</CBP7501_Line>
                            </Entry_Line>
                        </Entry_Lines>
                    </Invoice_Line>
                </Invoice_Lines>
            </Invoice>
        </Invoices>
    </Entry>
</Entries>

Target XML

<?xml version="1.0" encoding="UTF-8"?>
<Entries>
    <Entry>
        <EntryNumber>10158271304</EntryNumber>
        <CHB_File>63475017024503000</CHB_File>
        <TrafficFile>1017271467</TrafficFile>
        <StatusCode>A</StatusCode>
        <SentSequence>1</SentSequence>
        <CountryOrigin>VN</CountryOrigin>
        <ImporterOrOwner>Owner</ImporterOrOwner>
        <Entry_Total_Additions_to_Value>.00</Entry_Total_Additions_to_Value>
        <IRS_ExciseTax>.00</IRS_ExciseTax>
        <AD_Duties>.00</AD_Duties>
        <CV_Duties>.00</CV_Duties>
        <HeaderParties>
            <HeaderParty>
                <PartyType>Importer</PartyType>
            </HeaderParty>
        </HeaderParties>
        <Invoices>
            <Invoice>
                <InvoiceOrder>1</InvoiceOrder>
                <InvoiceLines>
                    <InvoiceLine>
                        <LineQuantity>685</LineQuantity>
                        <EntryLines>
                            <EntryLine>
                                <CBP7501Line>1</CBP7501Line>
                            </EntryLine>
                        </EntryLines>
                    </InvoiceLine>
                </InvoiceLines>
            </Invoice>
        </Invoices>
    </Entry>
</Entries>

Progress to Date

I've found a couple of different ways of handling the renames, from using a function (my initial thought) to multiple templates. The most promising approach I found uses a second stylesheet. I used one derived from Example 8-9 in O'Reilly's XSLT Cookbook, 2nd Edition. I liked the generic external one but couldn't get it to work. The one I used seems the most amenable to also handling the additions and renames. However, I'm fine with a mix, as specifying all the renames is a bit inefficient. In my case the documentation is in an Excel spreadsheet, so I use it to generate the contents of the mapping stylesheet.

Stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
    xmlns:kn="http://us.customsbrokerage.net/kn/cb/xsd/v1.0/functions"
    xmlns:ren="http://www.ora.com/namespaces/rename"
    xmlns:cvt="my:convert"
    exclude-result-prefixes="xs math xd"
    version="3.0">

    <!-- Mapping stylesheet -->
    <cvt:convert>
        <element rename="true" curName="Entry_Number" new="EntryNumber"/>
        <element rename="true" curName="Traffic_File" new="TrafficFile"/>
        <element rename="true" curName="Status_Code" new="StatusCode"/>
        <element add="true" curName="Status_Code" new="SentSequence"/>
        <element rename="true" curName="Header_Country_of_Origin" new="CountryOriginCode"/>
        <element rename="true" curName="Importer_or_Owner" new="ImporterOrOwner"/>
        <element rename="true" curName="Entry_Total_Additions_to_Value" new="TotalAdditions"/>
        <element rename="true" curName="Entry_IRS_Excise_Tax" new="IRS_ExciseTax"/>
        <element rename="true" curName="Entry_AD_Duties" new="AD_Duties"/>
        <element rename="true" curName="Entry_CV_Duties" new="CV_Duties"/>
        <element rename="true" curName="Header_Parties" new="HeaderParties"/>
        <element rename="true" curName="Header_Party" new="HeaderParty"/>
        <element rename="true" curName="Header_Party_Type" new="CBP_PartyType"/>
        <element rename="true" curName="Invoice_Lines" new="InvoiceLines"/>
        <element rename="true" curName="Invoice_Line" new="InvoiceLine"/>
        <element rename="true" curName="Invoice_Order" new="InvoiceOrder"/>
        <element rename="true" curName="Line_Order" new=""/>
        <element rename="true" curName="Line_Quantity" new="LineQuantity"/>
        <element rename="true" curName="Entry_Lines" new="EntryLines"/>
        <element rename="true" curName="Entry_Line" new="EntryLine"/>
        <element rename="true" curName="CBP7501_Line" new="CBP7501Line"/>
    </cvt:convert>
    <xsl:output method="xml" encoding="UTF-8" byte-order-mark="no" indent="true" version="1.0"/>

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match=
        "*[name()=document('')/*/cvt:convert/element/@curName]">
        <xsl:variable name="convertNode" select="*[name()=document('')/*/cvt:convert/element/@curName]"/>
        <xsl:if test="$convertNode/@rename='true'">
            <xsl:element name=
                "{document('')/*/cvt:convert/element
                [@curName=name(current())]
                /@new}">
            </xsl:element>
        </xsl:if>
        <xsl:if test="*[name()=document('')/*/cvt:convert/element/@curName]/@add=true()">
            <xsl:element name=
                "{document('')/*/cvt:convert/element
                [@curName=name(current())]
                /@new}">
            </xsl:element>
        </xsl:if>
        <xsl:if test="*[name()=document('')/*/cvt:convert/element/@delete]=true()"/>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:template>
</xsl:stylesheet>

The closest question to mine on SO appears to be here. However, I don't understand the approach well enough to port it to my case. In addition, it requires XSLT 3.0 or a two-step process using XSLT 2.0. I would prefer a one-step process using XSLT 2.0, but I could go to XSLT 3.0 if I must.

Solution Addendum

For everyone's reference I used the XSLT 3.0 approach with the changes below. See XSLT 3.0 solution changes.

Upvotes: 1

Views: 499

Answers (1)

Martin Honnen

Reputation: 167696

I have used your mapping table, but only for the renaming; for the elements to be added or deleted I have implemented templates directly. I have also stored the mapping table in a global parameter instead of a top-level element, as reading in the stylesheet with document('') is, in my view, a technique needed mainly in XSLT 1.

The resulting XSLT 3 stylesheet is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:array="http://www.w3.org/2005/xpath-functions/array"
    exclude-result-prefixes="xs math map array"
    version="3.0">

  <xsl:param name="rename-map">
        <element rename="true" curName="Entry_Number" new="EntryNumber"/>
        <element rename="true" curName="Traffic_File" new="TrafficFile"/>
        <element rename="true" curName="Status_Code" new="StatusCode"/>
        <element rename="true" curName="Header_Country_of_Origin" new="CountryOriginCode"/>
        <element rename="true" curName="Importer_or_Owner" new="ImporterOrOwner"/>
        <element rename="true" curName="Entry_Total_Additions_to_Value" new="TotalAdditions"/>
        <element rename="true" curName="Entry_IRS_Excise_Tax" new="IRS_ExciseTax"/>
        <element rename="true" curName="Entry_AD_Duties" new="AD_Duties"/>
        <element rename="true" curName="Entry_CV_Duties" new="CV_Duties"/>
        <element rename="true" curName="Header_Parties" new="HeaderParties"/>
        <element rename="true" curName="Header_Party" new="HeaderParty"/>
        <element rename="true" curName="Header_Party_Type" new="CBP_PartyType"/>
        <element rename="true" curName="Invoice_Lines" new="InvoiceLines"/>
        <element rename="true" curName="Invoice_Line" new="InvoiceLine"/>
        <element rename="true" curName="Invoice_Order" new="InvoiceOrder"/>
        <element rename="true" curName="Line_Quantity" new="LineQuantity"/>
        <element rename="true" curName="Entry_Lines" new="EntryLines"/>
        <element rename="true" curName="Entry_Line" new="EntryLine"/>
        <element rename="true" curName="CBP7501_Line" new="CBP7501Line"/>
  </xsl:param>

  <xsl:key name="map-ref" match="element[@rename = 'true']" use="@curName"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="Line_Order"/>

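  <!-- Explicit priority: the rename template below has default priority 0.5
       and would otherwise win over this plain name test (default priority 0). -->
  <xsl:template match="Status_Code" priority="5">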
      <xsl:next-match/>
      <SentSequence>1</SentSequence>
  </xsl:template>

  <xsl:template match="*[key('map-ref', local-name(), $rename-map)]">
      <xsl:element name="{key('map-ref', local-name(), $rename-map)/@new}">
          <xsl:apply-templates/>
      </xsl:element>
  </xsl:template>

</xsl:stylesheet>

Online at https://xsltfiddle.liberty-development.net/bFukv8t. If you target an XSLT 2 processor, replace the xsl:mode instruction used above with the identity transformation template:

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>
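
Put together for an XSLT 2 processor, the result might look like the sketch below. The mapping table is abbreviated to three entries purely to keep the listing short, and the explicit priority on the Status_Code template is there because a pattern with a predicate gets default priority 0.5 while a plain name test gets 0. Treat it as an untested sketch under those assumptions rather than a drop-in stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

  <!-- Abbreviated mapping table; the full list of renames goes here. -->
  <xsl:param name="rename-map">
    <element rename="true" curName="Entry_Number" new="EntryNumber"/>
    <element rename="true" curName="Status_Code" new="StatusCode"/>
    <element rename="true" curName="Line_Quantity" new="LineQuantity"/>
  </xsl:param>

  <xsl:key name="map-ref" match="element[@rename = 'true']" use="@curName"/>

  <!-- Identity template, standing in for xsl:mode on-no-match="shallow-copy". -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Deletion: an empty template drops the element and its content. -->
  <xsl:template match="Line_Order"/>

  <!-- Addition: let the rename template run first, then emit the new element. -->
  <xsl:template match="Status_Code" priority="5">
    <xsl:next-match/>
    <SentSequence>1</SentSequence>
  </xsl:template>

  <!-- Rename: any element whose name has an entry in the mapping table. -->
  <xsl:template match="*[key('map-ref', local-name(), $rename-map)]">
    <xsl:element name="{key('map-ref', local-name(), $rename-map)/@new}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Both xsl:next-match and the three-argument form of key() are available in XSLT 2.0, so nothing else from the XSLT 3 version needs to change.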

If you want to take the additions and deletions from that mapping data as well, you can use

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="3.0">

  <xsl:param name="rename-map">
        <element rename="true" curName="Entry_Number" new="EntryNumber"/>
        <element rename="true" curName="Traffic_File" new="TrafficFile"/>
        <element rename="true" curName="Status_Code" new="StatusCode"/>
        <element add="true" curName="Status_Code" new="SentSequence"/>
        <element rename="true" curName="Header_Country_of_Origin" new="CountryOriginCode"/>
        <element rename="true" curName="Importer_or_Owner" new="ImporterOrOwner"/>
        <element rename="true" curName="Entry_Total_Additions_to_Value" new="TotalAdditions"/>
        <element rename="true" curName="Entry_IRS_Excise_Tax" new="IRS_ExciseTax"/>
        <element rename="true" curName="Entry_AD_Duties" new="AD_Duties"/>
        <element rename="true" curName="Entry_CV_Duties" new="CV_Duties"/>
        <element rename="true" curName="Header_Parties" new="HeaderParties"/>
        <element rename="true" curName="Header_Party" new="HeaderParty"/>
        <element rename="true" curName="Header_Party_Type" new="CBP_PartyType"/>
        <element rename="true" curName="Invoice_Lines" new="InvoiceLines"/>
        <element rename="true" curName="Invoice_Line" new="InvoiceLine"/>
        <element rename="true" curName="Invoice_Order" new="InvoiceOrder"/>
        <element rename="true" curName="Line_Order" new=""/>
        <element rename="true" curName="Line_Quantity" new="LineQuantity"/>
        <element rename="true" curName="Entry_Lines" new="EntryLines"/>
        <element rename="true" curName="Entry_Line" new="EntryLine"/>
        <element rename="true" curName="CBP7501_Line" new="CBP7501Line"/>
  </xsl:param>

  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>

  <xsl:key name="map-ref" match="element[@rename = 'true']" use="@curName"/>
  <xsl:key name="new-ref" match="element[@add = 'true']" use="@curName"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="*[key('map-ref', local-name(), $rename-map)[@new = '']]" priority="5"/>

  <!-- No hard-coded Status_Code template is needed here: SentSequence is
       generated from the add="true" mapping entry by the mode="new" template below. -->

  <xsl:template match="*[key('map-ref', local-name(), $rename-map)]">
      <xsl:element name="{key('map-ref', local-name(), $rename-map)/@new}">
          <xsl:apply-templates/>
      </xsl:element>
      <xsl:apply-templates select=".[key('new-ref', local-name(), $rename-map)]" mode="new"/>
  </xsl:template>

  <xsl:template match="*" mode="new">
      <xsl:element name="{key('new-ref', local-name(), $rename-map)/@new}">1</xsl:element>
  </xsl:template>

</xsl:stylesheet>

https://xsltfiddle.liberty-development.net/bFukv8t/1
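
As an alternative to listing all 300 renames, the four renaming rules from the question could be computed by a function. The following XSLT 2.0 sketch is one possible reading of those rules, not a tested solution: the `my` namespace URI and the function name `my:new-name` are hypothetical, and rule 1 is taken to mean "remove the underscore unless there is an uppercase letter on both sides". XPath 2.0 regular expressions have no lookaround, so rule 1 is done in two passes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:rename"
    exclude-result-prefixes="xs my"
    version="2.0">

  <!-- Hypothetical helper: applies renaming rules 1 to 4 to one element name. -->
  <xsl:function name="my:new-name" as="xs:string">
    <xsl:param name="old" as="xs:string"/>
    <!-- Rule 4: drop 'Header_' unless the name is Header_Parties or Header_Party. -->
    <xsl:variable name="r4"
        select="if ($old = ('Header_Parties', 'Header_Party'))
                then $old
                else replace($old, 'Header_', '')"/>
    <!-- Rule 2: remove '_of_' and '_to_Value'. -->
    <xsl:variable name="r2" select="replace($r4, '_of_|_to_Value', '')"/>
    <!-- Rule 3: '_or_' becomes '_Or_'. -->
    <xsl:variable name="r3" select="replace($r2, '_or_', '_Or_')"/>
    <!-- Rule 1: remove an underscore that follows a non-uppercase character,
         then one that precedes a non-uppercase character. -->
    <xsl:sequence
        select="replace(replace($r3, '([^A-Z])_', '$1'), '_([^A-Z])', '$1')"/>
  </xsl:function>

  <!-- Identity template: copies attributes, text, comments and so on. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Every element is rebuilt under its computed name; the explicit
       priority avoids a conflict with node() in the identity template. -->
  <xsl:template match="*" priority="1">
    <xsl:element name="{my:new-name(local-name())}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

With these rules, Header_Country_of_Origin becomes CountryOrigin, Header_Party_Type becomes PartyType and CHB_File is left alone. Entry_IRS_Excise_Tax becomes EntryIRS_ExciseTax, though, because dropping a leading 'Entry_' is not one of the four stated rules. Names like that, along with the additions and deletions, would still need explicit templates or mapping entries with a higher priority than the generic rename template.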

Upvotes: 1
