ABach

Reputation: 3738

Optimization of XSLT using Identity Transform

I recently ran across the following stock ticker XML feed:

<?xml version="1.0" encoding="utf-8"?>
<BloombergOutput>
  <BloombergOutput CreatedUtc="2011-08-11T20:40:50.8851936Z">
    <Instruments>
      <Instrument Symbol="BLL">
        <Fields>
          <Field1 Name="LastPrice">
            <Value>35.550000</Value>
          </Field1>
          <Field2 Name="NetChangeOneDay">
            <Value>+1.550000</Value>
          </Field2>
          <Field3 Name="LastCloseDate">
            <Value>08/11/2011</Value>
          </Field3>
          <Field4 Name="LastClosePrice">
            <Value>35.550000</Value>
          </Field4>
          <Field5 Name="UpdateDate">
            <Value>08/11/2011</Value>
          </Field5>
          <Field6 Name="UpdateTime">
            <Value>16:15:03</Value>
          </Field6>
          <Field7 Name="LongName">
            <Value>Ball Corp</Value>
          </Field7>
          <Field8 Name="Name">
            <Value>BALL CORP</Value>
          </Field8>
          <Field9 Name="PriceSource">
            <Value>US</Value>
          </Field9>
          <Field10 Name="SymbolType">
            <Value>Common Stock</Value>
          </Field10>
        </Fields>
      </Instrument>
    </Instruments>
  </BloombergOutput>
</BloombergOutput>

I wanted to use XSLT to transform this feed into something without the unnecessary tag nesting, with more descriptive element names, and with overly long numbers truncated to two digits after the decimal point. Here's the XSLT I came up with:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="no" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity Transform, modified to begin at the Instruments element -->
  <xsl:template match="BloombergOutput/BloombergOutput/Instruments/@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- For each instrument, we grab the Symbol attribute and work on each child element -->
  <xsl:template match="Instrument">
    <Instrument>
      <Symbol><xsl:value-of select="@Symbol" /></Symbol>
      <xsl:apply-templates select="Fields/*" mode="fields" />
    </Instrument>
  </xsl:template>

  <!-- For each child field, we create a newly-named one and give it a value -->
  <xsl:template match="node()" mode="fields">

    <xsl:variable
      name="FieldName"
      select="@Name" />
    <xsl:variable
        name="Value"
        select="Value" />

    <xsl:element name="{$FieldName}">
      <xsl:choose>
        <!-- For these fields, we only want to preserve two digits after the decimal point -->
        <xsl:when test="$FieldName='LastPrice' or $FieldName='NetChangeOneDay' or $FieldName='LastClosePrice'">
          <xsl:value-of select="concat(substring-before($Value, '.'), '.', substring(substring-after($Value, '.'), 1, 2))" />
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$Value" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

...which produces this output:

<?xml version="1.0"?>
<BloombergOutput>
  <BloombergOutput>2011-08-11T20:40:50.8851936Z
    <Instruments>
      <Instrument>
        <Symbol>BLL</Symbol>
        <LastPrice>35.55</LastPrice>
        <NetChangeOneDay>+1.55</NetChangeOneDay>
        <LastCloseDate>08/11/2011</LastCloseDate>
        <LastClosePrice>35.55</LastClosePrice>
        <UpdateDate>08/11/2011</UpdateDate>
        <UpdateTime>16:15:03</UpdateTime>
        <LongName>Ball Corp</LongName>
        <Name>BALL CORP</Name>
        <PriceSource>US</PriceSource>
        <SymbolType>Common Stock</SymbolType>
      </Instrument>
    </Instruments>
  </BloombergOutput>
</BloombergOutput>

Although this is nearly what I want, there are some issues:

  1. The redundant BloombergOutput elements at the top are retained; additionally, the CreatedUtc attribute is emitted in a rather strange way (as bare text inside the element). My original intent was to remove the unnecessary BloombergOutput tags altogether.
  2. I successfully emitted my own Instrument element. However, Instruments is retained without my expressly asking for it. I get that the Identity Transform brought it along because I didn't tell it to go away, but what if I wanted a different opening element (say, StockQuote)?
  3. My intention was to get good at using the Identity Transform. However, I'm not certain that modifying its match pattern is the correct way to accomplish what I'm doing.

Overall, I'm looking for your expert advice on how to improve this. Feel free to tell me that I'm trying to jam in a design pattern where it doesn't belong. :)

Thanks so much.

Upvotes: 3

Views: 921

Answers (2)

Dimitre Novatchev

Reputation: 243579

Good question, +1.

Here is a simpler and shorter solution (no variables, no xsl:choose / xsl:when / xsl:otherwise, no substring()):

<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>

     <xsl:template match="node()|@*">
         <xsl:copy>
           <xsl:apply-templates select="node()|@*"/>
         </xsl:copy>
     </xsl:template>

     <xsl:template match="BloombergOutput | Fields" priority="2">
      <xsl:apply-templates/>
     </xsl:template>

     <xsl:template match="*[starts-with(name(),'Field')]">
       <xsl:element name="{@Name}">
         <xsl:apply-templates/>
       </xsl:element>
     </xsl:template>

     <xsl:template match="Value">
      <xsl:apply-templates/>
     </xsl:template>

     <xsl:template match=
      "*[contains('|LastPrice|LastClosePrice|NetChangeOneDay|',
                  concat('|', @Name, '|')
                  )
        ]
          /Value
        ">

        <xsl:value-of select=
          "format-number(translate(.,'+', ''), '##0.00')"/>
     </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<BloombergOutput>
  <BloombergOutput CreatedUtc="2011-08-11T20:40:50.8851936Z">
    <Instruments>
      <Instrument Symbol="BLL">
        <Fields>
          <Field1 Name="LastPrice">
            <Value>35.550000</Value>
          </Field1>
          <Field2 Name="NetChangeOneDay">
            <Value>+1.550000</Value>
          </Field2>
          <Field3 Name="LastCloseDate">
            <Value>08/11/2011</Value>
          </Field3>
          <Field4 Name="LastClosePrice">
            <Value>35.550000</Value>
          </Field4>
          <Field5 Name="UpdateDate">
            <Value>08/11/2011</Value>
          </Field5>
          <Field6 Name="UpdateTime">
            <Value>16:15:03</Value>
          </Field6>
          <Field7 Name="LongName">
            <Value>Ball Corp</Value>
          </Field7>
          <Field8 Name="Name">
            <Value>BALL CORP</Value>
          </Field8>
          <Field9 Name="PriceSource">
            <Value>US</Value>
          </Field9>
          <Field10 Name="SymbolType">
            <Value>Common Stock</Value>
          </Field10>
        </Fields>
      </Instrument>
    </Instruments>
  </BloombergOutput>
</BloombergOutput>

the wanted result is produced, except that the leading + of NetChangeOneDay is dropped by translate():

<Instruments>
   <Instrument Symbol="BLL">
      <LastPrice>35.55</LastPrice>
      <NetChangeOneDay>1.55</NetChangeOneDay>
      <LastCloseDate>08/11/2011</LastCloseDate>
      <LastClosePrice>35.55</LastClosePrice>
      <UpdateDate>08/11/2011</UpdateDate>
      <UpdateTime>16:15:03</UpdateTime>
      <LongName>Ball Corp</LongName>
      <Name>BALL CORP</Name>
      <PriceSource>US</PriceSource>
      <SymbolType>Common Stock</SymbolType>
   </Instrument>
</Instruments>
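
A possible follow-up, offered as a sketch rather than a drop-in replacement: the output above has no leading + on NetChangeOneDay, because translate() removes it before format-number() runs (XPath 1.0 number conversion does not accept a leading +). Note also that format-number() rounds, whereas the substring() approach in the question truncates, so the two can disagree when more digits follow. Assuming the sign should be kept, the last template can re-emit it; the rest of the stylesheet is the same as in this answer, with comments added.

```xml
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy anything that no other template claims -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unwrap these containers. Priority 2 is needed because "Fields"
       also satisfies the starts-with(name(),'Field') rule below -->
  <xsl:template match="BloombergOutput | Fields" priority="2">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Rename FieldN to the value of its Name attribute -->
  <xsl:template match="*[starts-with(name(),'Field')]">
    <xsl:element name="{@Name}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Drop the Value wrapper, keep its text -->
  <xsl:template match="Value">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Numeric fields: two decimals, re-emitting a leading + if present
       (assumption: the sign is wanted in the output) -->
  <xsl:template match=
    "*[contains('|LastPrice|LastClosePrice|NetChangeOneDay|',
                concat('|', @Name, '|'))]/Value">
    <xsl:if test="starts-with(., '+')">
      <xsl:text>+</xsl:text>
    </xsl:if>
    <xsl:value-of select=
      "format-number(translate(., '+', ''), '##0.00')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample feed, NetChangeOneDay should come out as +1.55. Negative values are unaffected, since translate() leaves a leading - in place.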

Upvotes: 2

Vincent Biragnet

Reputation: 2998

I think this is the mechanism you're looking for:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="no" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Get rid of the BloombergOutput, Instruments elements-->
    <xsl:template match="BloombergOutput|Instruments">
        <xsl:apply-templates/>
    </xsl:template>
    <!-- Identity Transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- For each instrument, we grab the Symbol attribute and work on each child element -->
    <xsl:template match="Instrument">
        <Instrument>
            <Symbol><xsl:value-of select="@Symbol" /></Symbol>
            <xsl:apply-templates select="Fields/*" />
        </Instrument>
    </xsl:template>

    <!-- For each child field, we create a newly-named one and give it a value -->
    <xsl:template match="*[starts-with(name(),'Field')]">

        <xsl:variable
            name="FieldName"
            select="@Name" />
        <xsl:variable
            name="Value"
            select="Value" />

        <xsl:element name="{$FieldName}">
            <xsl:choose>
                <!-- For these fields, we only want to preserve two digits after the decimal point -->
                <xsl:when test="$FieldName='LastPrice' or $FieldName='NetChangeOneDay' or $FieldName='LastClosePrice'">
                    <xsl:value-of select="concat(substring-before($Value, '.'), '.', substring(substring-after($Value, '.'), 1, 2))" />
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="$Value" />
                </xsl:otherwise>
            </xsl:choose>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>

Note that you don't have to change the identity template. Its purpose is to say: whenever no other template knows what to do with a node, copy it as it is.

Beyond that, you don't need a mode in your case. You just need to:

  1. For elements like Instruments or BloombergOutput: continue processing their children without creating any output structure.
  2. Do the specific processing for elements whose names start with Field.

The result is:

<?xml version="1.0" encoding="utf-8"?>
<Instrument>
   <Symbol>BLL</Symbol>
   <LastPrice>35.55</LastPrice>
   <NetChangeOneDay>+1.55</NetChangeOneDay>
   <LastCloseDate>08/11/2011</LastCloseDate>
   <LastClosePrice>35.55</LastClosePrice>
   <UpdateDate>08/11/2011</UpdateDate>
   <UpdateTime>16:15:03</UpdateTime>
   <LongName>Ball Corp</LongName>
   <Name>BALL CORP</Name>
   <PriceSource>US</PriceSource>
   <SymbolType>Common Stock</SymbolType>
</Instrument>

One more remark: if the input has two Instrument elements, the result of the transform won't be well-formed, because the output would then have more than one root element.
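
One way around that, sketched under a few assumptions: create a single root element in a template that matches the document root, so that any number of Instrument elements ends up inside it. The names StockQuotes and StockQuote are made up for illustration (the question only suggested StockQuote), and the decimal truncation is left out to keep the sketch short.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="no" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity Transform: fallback for anything not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Exactly one root element, however many Instrument elements exist.
       StockQuotes is a hypothetical name -->
  <xsl:template match="/">
    <StockQuotes>
      <xsl:apply-templates/>
    </StockQuotes>
  </xsl:template>

  <!-- Unwrap the containers from the feed -->
  <xsl:template match="BloombergOutput|Instruments">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Emit a differently named element per instrument
       (StockQuote is the name suggested in the question) -->
  <xsl:template match="Instrument">
    <StockQuote>
      <Symbol><xsl:value-of select="@Symbol"/></Symbol>
      <xsl:apply-templates select="Fields/*"/>
    </StockQuote>
  </xsl:template>

  <!-- Rename each FieldN to its Name attribute and copy the value as is -->
  <xsl:template match="*[starts-with(name(),'Field')]">
    <xsl:element name="{@Name}">
      <xsl:value-of select="Value"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Because the wrapper is produced from the document root rather than from Instruments, the output stays well-formed even if the feed contains several Instruments containers.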

Upvotes: 3
