atif

Reputation: 1147

Sweeping XML file content and creating new elements

I have an XML file:

<xml>
  <head>
    <title>Test</title>
  </head>
  <body>
    <para>
     This is a body text meta data 1234, this is a external link R12345.  This is a test para.
    </para>
  </body>
</xml>

I need a script that sweeps the body/para text for the phrases "meta data" and "external link", picks up the number that follows each phrase, and adds a corresponding link element to the head section:

<xml>
 <head>
 <title>Test</title>
 <link name="meta data" id="1234"/>
 <link name="external link" id="R1234"/>
  </head>
 <body>
     <para>
     This is a body text meta data 1234, this is a external link R12345. This is a test para. 
     </para>
   </body>
 </xml>

I have done this with a C# program, but I want to do it in XSLT 1.0, because I have a few other transformations that will run on the same file.

Upvotes: 1

Views: 189

Answers (1)

Tim C

Reputation: 70638

There is a slight inconsistency in your question: you say you want to find the number that follows each tag, but in your example the id for the external link is R1234, which obviously contains a letter!

However, I came up with the following template, which can be used to 'sweep' for your tag:

<xsl:template name="sweeper">
  <xsl:param name="text"/>
  <xsl:param name="tag"/>

  <xsl:variable name="search" select="normalize-space(concat(substring-after($text, $tag), '.'))"/>
  <xsl:variable name="delimiter" select="substring(translate($search, 'R1234567890', ''), 1, 1)"/>
  <xsl:variable name="match" select="substring-before($search, $delimiter)"/>
  <xsl:if test="$match != ''">
     <link name="{$tag}" id="{$match}"/>
  </xsl:if>
</xsl:template>

(Where text is the text to search, and tag is the tag to search for.)

The template first gets the text after the tag you are searching for, appends a '.' so that a delimiter is always present, and normalizes the whitespace. It then removes all digits from a copy of this string, as well as R to cope with your requirements (if other letters are valid, add them to the translate call). The first character left in that stripped copy is the delimiter, and the text that occurs before the delimiter in the search string should be the number you want. Note that only the first occurrence of the tag in the text is examined.
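To make the three steps concrete, here is a small throwaway stylesheet that prints the intermediate values for a hard-coded sample string. It is only a sketch for illustration: the sample text is taken from the question's para, the text output method and the bracketed labels are my own choices, and it can be run against any XML document since it ignores the input.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>

   <xsl:template match="/">
      <!-- Hard-coded sample; in the real template these arrive as parameters -->
      <xsl:variable name="text" select="'this is a external link R12345.  This is a test para.'"/>
      <xsl:variable name="tag" select="'external link'"/>

      <!-- Step 1: text after the tag, plus a trailing '.', whitespace normalized -->
      <xsl:variable name="search" select="normalize-space(concat(substring-after($text, $tag), '.'))"/>
      <!-- Step 2: drop R and all digits; the first character left is the delimiter -->
      <xsl:variable name="delimiter" select="substring(translate($search, 'R1234567890', ''), 1, 1)"/>
      <!-- Step 3: everything in the search string before that delimiter -->
      <xsl:variable name="match" select="substring-before($search, $delimiter)"/>

      <xsl:value-of select="concat('search=[', $search, ']&#10;')"/>
      <xsl:value-of select="concat('delimiter=[', $delimiter, ']&#10;')"/>
      <xsl:value-of select="concat('match=[', $match, ']&#10;')"/>
   </xsl:template>
</xsl:stylesheet>
```

For this sample the search string is "R12345. This is a test para..", the delimiter is "." and the match is "R12345". If the tag is missing from the text, the search string is just "." and the match is empty, so no link is written.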

Here is the full XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="head">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
         <xsl:apply-templates select="//para" mode="sweep"/>
      </xsl:copy>
   </xsl:template>

   <xsl:template match="para" mode="sweep">
      <xsl:call-template name="sweeper">
         <xsl:with-param name="text" select="."/>
         <xsl:with-param name="tag" select="'meta data'"/>
      </xsl:call-template>
      <xsl:call-template name="sweeper">
         <xsl:with-param name="text" select="."/>
         <xsl:with-param name="tag" select="'external link'"/>
      </xsl:call-template>
   </xsl:template>

   <xsl:template name="sweeper">
      <xsl:param name="text"/>
      <xsl:param name="tag"/>

      <xsl:variable name="search" select="normalize-space(concat(substring-after($text, $tag), '.'))"/>
      <xsl:variable name="delimiter" select="substring(translate($search, 'R1234567890', ''), 1, 1)"/>
      <xsl:variable name="match" select="substring-before($search, $delimiter)"/>
      <xsl:if test="$match != ''">
         <link name="{$tag}" id="{$match}"/>
      </xsl:if>
   </xsl:template>

   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>

When applied to your input XML, the following is output:

<xml>
<head>
<title>Test</title>
<link name="meta data" id="1234" />
<link name="external link" id="R12345" />
</head>
<body>
<para> 
     This is a body text meta data 1234, this is a external link R12345.  This is a test para. 
     </para>
</body>
</xml>
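One limitation of the sweeper template is that it only picks up the first occurrence of each tag in a para. The question does not say whether a tag can appear more than once, but if it can, one possible extension is to recurse on the remaining text. The following is a sketch under that assumption; the template name sweep-all is made up for this example, and the extraction logic is otherwise the same as above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="head">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
         <xsl:apply-templates select="//para" mode="sweep"/>
      </xsl:copy>
   </xsl:template>

   <xsl:template match="para" mode="sweep">
      <xsl:call-template name="sweep-all">
         <xsl:with-param name="text" select="."/>
         <xsl:with-param name="tag" select="'meta data'"/>
      </xsl:call-template>
      <xsl:call-template name="sweep-all">
         <xsl:with-param name="text" select="."/>
         <xsl:with-param name="tag" select="'external link'"/>
      </xsl:call-template>
   </xsl:template>

   <!-- Hypothetical recursive variant of sweeper: one link per occurrence of the tag -->
   <xsl:template name="sweep-all">
      <xsl:param name="text"/>
      <xsl:param name="tag"/>

      <!-- The empty-tag guard prevents endless recursion, since contains($text, '') is always true -->
      <xsl:if test="$tag != '' and contains($text, $tag)">
         <xsl:variable name="rest" select="substring-after($text, $tag)"/>
         <xsl:variable name="search" select="normalize-space(concat($rest, '.'))"/>
         <xsl:variable name="delimiter" select="substring(translate($search, 'R1234567890', ''), 1, 1)"/>
         <xsl:variable name="match" select="substring-before($search, $delimiter)"/>
         <xsl:if test="$match != ''">
            <link name="{$tag}" id="{$match}"/>
         </xsl:if>

         <!-- Continue with the text after this occurrence; it is strictly shorter, so this terminates -->
         <xsl:call-template name="sweep-all">
            <xsl:with-param name="text" select="$rest"/>
            <xsl:with-param name="tag" select="$tag"/>
         </xsl:call-template>
      </xsl:if>
   </xsl:template>

   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>
```

On the sample input this should produce the same two link elements as before, because each tag occurs only once. An occurrence of the tag that is not followed by an id (for example "meta data is") yields an empty match and is skipped, and the recursion simply moves on to the next occurrence.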

Upvotes: 1
