sparty02

Reputation: 606

Mapping processing instructions in the current scope in XSL

We have a process in which a graphical tool is used to define xml->xml mapping transforms. The tool emits very verbose (not necessarily elegant) xsl transforms that make heavy use of for-each and copy-of, changing fields as needed within the for-each scope of each element.

A generated XSL might look something like:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />

  <xsl:template match="/root">
    <root>
      <xsl:for-each select="data">
        <data>
          <xsl:copy-of select="id" />
          <xsl:copy-of select="comment" />
          <detail>
            <xsl:for-each select="detail">
              <xsl:copy-of select="name" />
              <preference>
                <xsl:value-of select="favorite" />
              </preference>
            </xsl:for-each>
          </detail>
        </data>
      </xsl:for-each>
    </root>
  </xsl:template>

</xsl:stylesheet>

Given this transform (see http://xsltransform.net/a9GPgr/1), the following source xml:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <data>
    <id>123</id>
    <comment>hello</comment>
    <comment>goodbye</comment>
    <text>bar</text>
    <detail>
      <name>bob</name>
      <favorite>cake</favorite>
    </detail>
  </data>
</root>

will result in this target xml:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <data>
    <id>123</id>
    <comment>hello</comment>
    <comment>goodbye</comment>
    <detail>
      <name>bob</name>
      <preference>cake</preference>
    </detail>
  </data>
</root>

Now, the source xml has evolved to include processing instructions (which are used downstream to decide how to further transform the xml). The source xml now looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <data>
    <?format-number id?>
    <id>123</id>
    <?xml-multiple comment?>
    <?format-string comment?>
    <comment>hello</comment>
    <?format-string comment?>
    <comment>goodbye</comment>
    <?format-string text?>
    <text>bar</text>
    <detail>
      <?format-string name?>
      <name>bob</name>
      <?format-string favorite?>
      <favorite>cake</favorite>
    </detail>
    <?xml-multiple colors?>
  </data>
</root>

Some insight:

Now, the challenge is that these processing instructions aren't carried through when running the xml transform above. Instead of making mapping changes in the tool, I have a program that rewrites the original xsl and can add new content to it in well-known locations (i.e. when it sees that it's using for-each and friends).

So I need to figure out what the "new" xsl should look like in order to preserve these processing instructions. My mental model for what needs to change is:

  1. When copying over content, also copy over all processing instructions between the "previous" element and the "current" element. In the following snippet, the <id> element is annotated with the <?format-number id?> processing instruction, and the first <comment> element is annotated with two processing instructions (<?xml-multiple comment?> and <?format-string comment?>).
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <data>
    <?format-number id?>
    <id>123</id>
    <?xml-multiple comment?>
    <?format-string comment?>
    <comment>hello</comment>
    <?format-string comment?>
    <comment>goodbye</comment>
  </data>
</root>
  2. Ensure that processing instructions that don't have an associated target element are also copied over. In the following snippet, the <id> element is annotated with the <?format-number id?> processing instruction, and there is an orphaned processing instruction at the end that doesn't annotate anything (<?xml-multiple colors?>).
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <data>
    <?format-number id?>
    <id>123</id>
    <?xml-multiple colors?>
  </data>
</root>

Also note that the orphaned xml-multiple doesn't have to be at the end; it could sit between elements, as in:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <data>
    <?format-number id?>
    <id>123</id>
    <?xml-multiple colors?>
    <?format-string text?>
    <text>bar</text>
  </data>
</root>

An XSL might look something like the following (notice that I can "inject" some xsl at each logical place where an element is being processed in scope):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />

  <xsl:template match="/root">
    <root>
      <xsl:for-each select="data">
        <data>
          <!-- what would I do here? some usage of preceding-sibling and processing instruction maybe? -->
          <xsl:copy-of select="???" />
          <xsl:copy-of select="id" />
          <!-- same -->
          <xsl:copy-of select="???" />
          <xsl:copy-of select="comment" />
          <detail>
            <!-- same -->
            <xsl:copy-of select="???" />
            <xsl:for-each select="detail">
              <!-- same -->
              <xsl:copy-of select="???" />
              <xsl:copy-of select="name" />
              <!-- same -->
              <xsl:copy-of select="???" />
              <preference>
                <xsl:value-of select="favorite" />
              </preference>
            </xsl:for-each>
          </detail>
        </data>
      </xsl:for-each>
    </root>
  </xsl:template>

</xsl:stylesheet>

Based on the transform, the result should be:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <data>
    <?format-number id?>
    <id>123</id>
    <?xml-multiple comment?>
    <?format-string comment?>
    <comment>hello</comment>
    <?format-string comment?>
    <comment>goodbye</comment>
    <detail>
      <?format-string name?>
      <name>bob</name>
      <?format-string preference?>
      <preference>cake</preference>
    </detail>
    <?xml-multiple colors?>
  </data>
</root>
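
One possible way to fill in the ??? placeholders, as a minimal XSLT 1.0 sketch rather than a tested solution. It assumes that a processing instruction always annotates the next element sibling, and that an instruction with no following element sibling is an orphan. The key name leading-pi is made up for illustration. Note that repeated elements such as comment have to move from copy-of to for-each, otherwise their instructions cannot be interleaved.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />

  <!-- Hypothetical key: index each PI by the generated id of the first
       element sibling that follows it. Orphaned PIs (no following element
       sibling) are deliberately left out of the index. -->
  <xsl:key name="leading-pi"
           match="processing-instruction()[following-sibling::*]"
           use="generate-id(following-sibling::*[1])" />

  <xsl:template match="/root">
    <root>
      <xsl:for-each select="data">
        <data>
          <!-- injected: PIs that annotate id -->
          <xsl:copy-of select="key('leading-pi', generate-id(id))" />
          <xsl:copy-of select="id" />
          <!-- repeated element: for-each keeps each PI next to its comment -->
          <xsl:for-each select="comment">
            <xsl:copy-of select="key('leading-pi', generate-id())" />
            <xsl:copy-of select="." />
          </xsl:for-each>
          <detail>
            <xsl:for-each select="detail">
              <!-- injected: PIs that annotate name -->
              <xsl:copy-of select="key('leading-pi', generate-id(name))" />
              <xsl:copy-of select="name" />
              <!-- renamed element: rebuild each PI with the new name as data -->
              <xsl:for-each select="key('leading-pi', generate-id(favorite))">
                <xsl:processing-instruction name="{name()}">preference</xsl:processing-instruction>
              </xsl:for-each>
              <preference>
                <xsl:value-of select="favorite" />
              </preference>
              <!-- injected: orphaned PIs at the end of detail -->
              <xsl:copy-of select="processing-instruction()[not(following-sibling::*)]" />
            </xsl:for-each>
          </detail>
          <!-- injected: orphaned PIs at the end of data -->
          <xsl:copy-of select="processing-instruction()[not(following-sibling::*)]" />
        </data>
      </xsl:for-each>
    </root>
  </xsl:template>

</xsl:stylesheet>
```

A limitation of this sketch: an orphaned instruction that sits between two elements cannot be told apart from an annotation of the next element, so it is dropped whenever that next element (text in the example) is not mapped.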

Upvotes: 1

Views: 56

Answers (2)

Michael Kay

Reputation: 163322

I would suggest a different approach. Start with a hand-written XSLT stylesheet that converts the XML to a more conventional form. Then use the mapping tool to generate a transformation on this XML; then post-process the result of that mapping tool with another XSLT stylesheet that reinstates the processing instructions.

Trying to automate the conversion of generated XSLT code so it does a different transformation from the one it was designed to do seems -- let's say -- challenging.

(Personally, my experience of using mapping tools to generate XSLT is very negative; it's more trouble than it's worth, and you'd be better off writing the XSLT code by hand. But your mileage may vary...)
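
As a rough illustration of the pre-processing and post-processing stages described above, here is a minimal sketch. It assumes the "more conventional form" is a placeholder element per processing instruction; the element name pi-marker and its target attribute are invented for this example, and the mapping tool would have to be told to carry those elements through. Because the source contains no pi-marker elements and the mapped result contains no processing instructions, the same stylesheet can serve both stages.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" />
  <xsl:strip-space elements="*" />

  <!-- Identity template: copy everything that has no more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <!-- Before mapping: turn each PI into an ordinary element.
       pi-marker and target are hypothetical names.
       The explicit priority avoids a tie with node() in the identity template. -->
  <xsl:template match="processing-instruction()" priority="1">
    <pi-marker target="{name()}">
      <xsl:value-of select="." />
    </pi-marker>
  </xsl:template>

  <!-- After mapping: turn each placeholder element back into a PI. -->
  <xsl:template match="pi-marker">
    <xsl:processing-instruction name="{@target}">
      <xsl:value-of select="." />
    </xsl:processing-instruction>
  </xsl:template>

</xsl:stylesheet>
```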

Upvotes: 0

Yitzhak Khabinsky

Reputation: 22177

Here is an example using the Identity Transform pattern.

The idea is very simple: you write templates only for the nodes that need to be removed or modified. Everything else is copied to the output automatically, without referencing nodes one by one.

Input XML

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <data>
        <?format-number id?>
        <id>123</id>
        <?xml-multiple comment?>
        <?format-string comment?>
        <comment>hello</comment>
        <?format-string comment?>
        <comment>goodbye</comment>
        <?format-string text?>
        <text>bar</text>
        <detail>
            <?format-string name?>
            <name>bob</name>
            <?format-string favorite?>
            <favorite>cake</favorite>
        </detail>
        <?xml-multiple colors?>
    </data>
</root>

XSLT

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--remove text element-->
    <xsl:template match="text"/>

    <!--remove text element processing-instruction-->
    <xsl:template match="processing-instruction('format-string')
                    [contains(., 'text')]"/>

    <!--change favorite element to preference-->
    <xsl:template match="favorite">
        <preference>
            <xsl:value-of select="."/>
        </preference>
    </xsl:template>
</xsl:stylesheet>

Output

<?xml version='1.0' encoding='utf-8' ?>
<root>
  <data>
    <?format-number id?>
    <id>123</id>
    <?xml-multiple comment?>
    <?format-string comment?>
    <comment>hello</comment>
    <?format-string comment?>
    <comment>goodbye</comment>
    <detail>
      <?format-string name?>
      <name>bob</name>
      <?format-string favorite?>
      <preference>cake</preference>
    </detail>
    <?xml-multiple colors?>
  </data>
</root>
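
Note that this output keeps <?format-string favorite?>, whereas the expected result in the question shows <?format-string preference?>. If the instruction should follow the renamed element, one more template could handle it. A minimal sketch, assuming the instruction data is exactly the element name; it is shown with the identity template so that it runs on its own, but only the second template would need to be added to the stylesheet above:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!--identity template: copy everything not matched elsewhere-->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--rewrite the format-string instruction that refers to favorite-->
    <xsl:template match="processing-instruction('format-string')
                    [normalize-space(.) = 'favorite']">
        <xsl:processing-instruction name="format-string">preference</xsl:processing-instruction>
    </xsl:template>
</xsl:stylesheet>
```

The predicate gives this template a higher default priority than the identity template, so no explicit priority attribute should be needed.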

Upvotes: 1
