Bilzac

Reputation: 555

Remove special characters from XML via XSLT only for specific tags

I am having an issue with special characters in my XML. Basically, I am splitting one XML document into multiple XML files using the Xalan processor.

When splitting the document, I use the value of each name tag as the name of the generated file. The problem is that the names contain characters that aren't recognized by the XML processor, such as ™ (TM) and ® (R). I want to remove those characters ONLY when naming the files.

<xsl:template match="products">
    <redirect:write select="concat('..\\xml\\product\\en\\',translate(string(name),'&lt;/&gt; ',''),'.xml')">

The above is the XSL code I have written to split the XML into multiple XML files. As you can see, I am using the translate function to remove '/', '<', '>' and spaces from the name. I was hoping I could do the same with ™ (TM) and ® (R), but it doesn't seem to work. Please advise me on how I could do that.

Thanks for your help in advance.

Upvotes: 2

Views: 4553

Answers (2)

user357812

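Following Dimitre's answer: if you are not sure which special characters could appear in name, maybe you should instead keep only the characters you consider legal in a document name. The inner translate() deletes the legal characters, leaving just the unwanted ones, and the outer translate() then deletes those from the original string.

For example: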

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="text()">
   <xsl:value-of select="translate(.,
                                   translate(.,
                                             'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ',
                                             ''),
                                   '')"/>
 </xsl:template>
</xsl:stylesheet> 

With input:

<t>XXX™ My > Trademark®</t>

Result:

XXX My  Trademark
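
Applied to the file naming in the question, the same idea might look like the sketch below. I have not run it on Xalan. The redirect namespace declaration, the `vAllowed` and `vUnwanted` variable names, the contents of the whitelist, and the `xsl:copy-of` body are all assumptions, since the question shows only the opening tags.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:redirect="http://xml.apache.org/xalan/redirect"
 extension-element-prefixes="redirect">

 <!-- Assumed whitelist of characters allowed in a file name; adjust as needed. -->
 <xsl:variable name="vAllowed"
  select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'"/>

 <xsl:template match="products">
   <!-- Everything in the name that is NOT on the whitelist. -->
   <xsl:variable name="vUnwanted" select="translate(name, $vAllowed, '')"/>

   <!-- Delete the unwanted characters only when building the file name. -->
   <redirect:write select="concat('..\\xml\\product\\en\\',
                                  translate(name, $vUnwanted, ''),
                                  '.xml')">
     <!-- Assumed body: copy the product unchanged into its own file. -->
     <xsl:copy-of select="."/>
   </redirect:write>
 </xsl:template>
</xsl:stylesheet>
```

Because only the select expression is filtered, the name element written into each file keeps its ™ and ® characters.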

Upvotes: 2

Dimitre Novatchev

Reputation: 243569

I don't have Xalan, but with 8 other XSLT processors this transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="text()">
   <xsl:value-of select="translate(., '&lt;/&gt;™®', '')"/>
   ===================
   <xsl:value-of select="translate(., '&lt;/&gt;&#x2122;&#xAE;', '')"/>
 </xsl:template>
</xsl:stylesheet>

when applied on this XML document:

<t>XXX™ My Trademark®</t>

produces the wanted result:

XXX My Trademark
   ===================
   XXX My Trademark

I suggest that you try one of the two expressions above. At least the second may work, since it uses numeric character references (&#x2122; for ™ and &#xAE; for ®) instead of the literal characters.
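
For the file-naming case in the question, the second expression could be dropped into the select of redirect:write roughly as follows. This is only a sketch and, as said, untested on Xalan: the redirect namespace declaration and the `xsl:copy-of` body are assumptions, because the question shows only the opening tags.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:redirect="http://xml.apache.org/xalan/redirect"
 extension-element-prefixes="redirect">

 <xsl:template match="products">
   <!-- Same translate() as in the question, extended with
        U+2122 (TM) and U+00AE (R) as numeric character references. -->
   <redirect:write select="concat('..\\xml\\product\\en\\',
                                  translate(string(name), '&lt;/&gt; &#x2122;&#xAE;', ''),
                                  '.xml')">
     <!-- Assumed body: copy the product unchanged into its own file. -->
     <xsl:copy-of select="."/>
   </redirect:write>
 </xsl:template>
</xsl:stylesheet>
```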

Upvotes: 3
