joesch

Reputation: 347

Duplication of nodes when applying an XSLT

I'm trying to use XSLT 1.0 to transform a node: copy its value to a new node, then use one of its attributes as its new value. Instead of overwriting one node and creating one additional node, my stylesheet creates multiple nodes. How do I fix this?

Below is a sample of the data; the node I'm interested in is 'unitid'. Its value should be copied to a new node, 'physloc', and the 'unitid' value replaced with the value of its 'id' attribute.

Existing input:

<c03 level="subseries">
              <did>
                 <unitid encodinganalog="isadg311" id="cnda.94.02.02">D1148/2/2</unitid>
                 <unittitle encodinganalog="isadg312 marc245">Bundle</unittitle>
                 <unitdate encodinganalog="isadg313 marc260" normal="19350701/19750108">1 Jul 1935-8 Jan 1975; n.d.</unitdate>
                 <physdesc encodinganalog="isadg315 marc300">
                    <extent>1 bundle; 11 items.</extent>
                    <genreform />
                    <physfacet />
                 </physdesc>
              </did>

Desired result:

<c03 level="subseries">
          <did>
             <unitid encodinganalog="isadg311">cnda.94.02.02</unitid>
             <unittitle encodinganalog="isadg312 marc245">Bundle</unittitle>
             <unitdate encodinganalog="isadg313 marc260" normal="19350701/19750108">1 Jul 1935-8 Jan 1975; n.d.</unitdate>
             <physdesc encodinganalog="isadg315 marc300">
                <extent>1 bundle; 11 items.</extent>
                <genreform />
                <physfacet />
             </physdesc>
             <physloc>D1148/2/2</physloc>
          </did>

A few complications: not all 'unitid' nodes have an 'id' attribute, and where that is the case the 'unitid' value should be preserved. Also, in some records the 'id' attribute is on the 'did' node instead, e.g.

<did id="uni.07.01.06.14">
  <unitid encodinganalog="isadg311">P4A/5/9</unitid>

In this case the 'id' attribute from the 'did' node should become the value of 'unitid', and the original 'unitid' value should go into 'physloc', e.g.

<did>
  <unitid encodinganalog="isadg311">uni.07.01.06.14</unitid>
  <physloc>P4A/5/9</physloc>

My current stylesheet is this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:output method="xml" version="1.0" encoding="ISO-8859-1" indent="yes"/>
<xsl:template match="node() | @*">
    <xsl:copy>
        <xsl:apply-templates select="node() | @*" />
    </xsl:copy>
</xsl:template>
<xsl:template match="did">
    <xsl:param name="physid" select="unitid"/>
    <xsl:param name="eadid">
        <xsl:choose>
            <xsl:when test="@id">
                <xsl:value-of select="@id" />
            </xsl:when>
            <xsl:when test="unitid/@id">
                <xsl:value-of select="unitid/@id"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="unitid"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:param>
    <xsl:copy>
        <xsl:copy-of select="*" />
        <physloc><xsl:value-of select="$physid" /></physloc>
        <unitid><xsl:value-of select="$eadid" /></unitid>
    </xsl:copy>
</xsl:template>
</xsl:stylesheet>

For records where there is no 'id' attribute in the 'unitid' the result is this:

<did>
  <repository encodinganalog="marc852">
    <corpname>
           University of Liverpool,
           <subarea>Special Collections and Archives</subarea>      </corpname>
  </repository>
  <unitid countrycode="GB" repositorycode="141" encodinganalog="isadg311">CNDA</unitid>
  <physloc>CNDA</physloc>
  <unitid>CNDA</unitid>
  <physloc>CNDA</physloc>
  <unitid>CNDA</unitid>

Where there is an 'id' in the 'unitid':

<did>
          <unitid encodinganalog="isadg311" id="cnda.94.01.17">D1148/1/17</unitid>
          <unittitle encodinganalog="isadg312 marc245">Poem</unittitle>
          <unitdate encodinganalog="isadg313 marc260" normal="19000101/19591231">n.d.</unitdate>
          <physdesc encodinganalog="isadg315 marc300">
            <extent>1 item; 2 pieces.</extent>
            <genreform/>
            <physfacet/>
          </physdesc>
          <physloc>D1148/1/17</physloc>
          <unitid>cnda.94.01.17</unitid>
          <physloc>D1148/1/17</physloc>
          <unitid>cnda.94.01.17</unitid>
        </did>

And where there was an 'id' in the 'did' node:

 <did>
          <unitid encodinganalog="isadg311">D1148/2/2</unitid>
          <unittitle encodinganalog="isadg312 marc245">Bundle</unittitle>
          <unitdate encodinganalog="isadg313 marc260" normal="19350701/19750108">1 Jul 1935-8 Jan 1975; n.d.</unitdate>
          <physdesc encodinganalog="isadg315 marc300">
            <extent>1 bundle; 11 items.</extent>
            <genreform/>
            <physfacet/>
          </physdesc>
          <physloc>D1148/2/2</physloc>
          <unitid>cnda.94.02.02</unitid>
          <physloc>D1148/2/2</physloc>
          <unitid>D1148/2/2</unitid>
        </did>

How do I change my stylesheet so that I get only one 'unitid' and one 'physloc', each with the correct value? The XML files I'm working with are nested, and the 'unitid' nodes can appear at multiple levels.

Upvotes: 0

Views: 43

Answers (2)

Tomalak

Reputation: 338396

Your transformation breaks down into these basic rules:

  1. everything should be copied as is, except
  2. the unitid/@id attribute should be removed
  3. the new unitid text value should be taken from the @id attribute
  4. a new physloc element should be created and receive the old unitid text value

Each of these rules can be converted into a separate template that carries it out.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="ISO-8859-1" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- 1. identity template copies everything as is -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*" />
        </xsl:copy>
    </xsl:template>

    <!-- 2. if there is an id attribute, delete it -->
    <xsl:template match="unitid/@id" />

    <!-- 3. move the value of the id attribute into the element value -->
    <xsl:template match="unitid[@id]/text()">
        <xsl:value-of select="../@id" />
    </xsl:template>

    <!-- 4. create a <physloc> element inside <did> -->
    <xsl:template match="did[not(physloc)]">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*" />
            <physloc><xsl:value-of select="unitid" /></physloc>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

With this stylesheet, your input is transformed into the desired output:

<c03 level="subseries">
   <did>
      <unitid encodinganalog="isadg311">cnda.94.02.02</unitid>
      <unittitle encodinganalog="isadg312 marc245">Bundle</unittitle>
      <unitdate encodinganalog="isadg313 marc260" normal="19350701/19750108">1 Jul 1935-8 Jan 1975; n.d.</unitdate>
      <physdesc encodinganalog="isadg315 marc300">
         <extent>1 bundle; 11 items.</extent>
         <genreform/>
         <physfacet/>
      </physdesc>
      <physloc>D1148/2/2</physloc>
   </did>
</c03>

Note how templates 2 and 3 only match the specific case where there is an @id attribute. If there is none, the <unitid> element is simply copied as is. In other words, if the input is already in the desired format, nothing happens.
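
The four rules do not cover the records where the id sits on the <did> element rather than on <unitid>. A minimal sketch of how that case might be handled follows. It reuses the templates from the stylesheet above and adds two more. The two extra templates are an assumption, not part of the stylesheet above, and they assume that a <did>-level id should only be moved when <unitid> has no id of its own.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="ISO-8859-1" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- rule 1: identity template copies everything as is -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*" />
        </xsl:copy>
    </xsl:template>

    <!-- rules 2 and 3: the id is on <unitid> itself -->
    <xsl:template match="unitid/@id" />
    <xsl:template match="unitid[@id]/text()">
        <xsl:value-of select="../@id" />
    </xsl:template>

    <!-- assumed extra rule: the id is on <did> and <unitid> has none, so drop it from <did> -->
    <xsl:template match="did[unitid[not(@id)]]/@id" />

    <!-- assumed extra rule: use the <did> id as the new <unitid> text -->
    <xsl:template match="did[@id]/unitid[not(@id)]/text()">
        <xsl:value-of select="../../@id" />
    </xsl:template>

    <!-- rule 4: <physloc> receives the original <unitid> text from the input tree -->
    <xsl:template match="did[not(physloc)]">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*" />
            <physloc><xsl:value-of select="unitid" /></physloc>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Under those assumptions, a <did id="uni.07.01.06.14"> whose <unitid> contains P4A/5/9 should come out as a <did> without the id attribute, with a <unitid> containing uni.07.01.06.14 and a <physloc> containing P4A/5/9. Because the replacement templates match text() nodes, a <unitid> with no text content would be left empty rather than filled in, and <xsl:value-of select="unitid" /> only picks up the first <unitid> if a <did> has several.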

Upvotes: 1

joesch

Reputation: 347

Had another look at my stylesheet, and it looks like the cause was the <xsl:copy> element in the 'did' template. Removing it has done the trick.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output method="xml" version="1.0" encoding="ISO-8859-1" indent="yes"/>
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*" />
        </xsl:copy>
    </xsl:template>
    <xsl:template match="did">
        <xsl:param name="physid" select="unitid"/>
        <xsl:param name="eadid">
            <xsl:choose>
                <xsl:when test="@id">
                    <xsl:value-of select="@id" />
                </xsl:when>
                <xsl:when test="unitid/@id">
                    <xsl:value-of select="unitid/@id"/>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="unitid"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:param>
        <xsl:copy-of select="*" />
        <physloc><xsl:value-of select="$physid" /></physloc>
        <unitid><xsl:value-of select="$eadid" /></unitid>
    </xsl:template>
</xsl:stylesheet>

Upvotes: 1
