Reputation: 59
I am looking for a way to test a given element in the source XML and remove characters that are not valid. Basically, I have a list of allowed characters and need a way to replace any not in that list. Can this be done in XSLT?
To clarify: I am using XSLT to process a valid and complete source XML file so that it can be sent to a consuming system.
The consuming system defines what characters are allowed in certain elements and will reject the XML payload if it contains characters that are not valid. For example: they have provided the following "rule" for valid characters for a specific field:
([0-9a-zA-Z/\-\?:\(\)\.,'\+ \r\n]+)
So what I am looking to do is remove any character that does not match the rule above, i.e. replace it with an empty string. Right now the main cause of rejection is underscores in the field. I know I can use replace to remove that one character, but I was hoping to define a single replace rule that removes every character not covered by the rule above.
Upvotes: 0
Views: 999
Reputation: 2183
You could use translate() or replace() (the latter is XSLT 2.0 only). However, if the characters are invalid in the sense that the XML is no longer well-formed, then you can't use XSLT at all, as it requires at least a well-formed XML document as input.
Using translate(), removing all characters except those specified in a list goes like this:
translate($string, translate($string,'0123456789',''),'')
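The inner translate() strips the allowed characters from $string, leaving only the unwanted ones; the outer translate() then removes exactly those unwanted characters from $string. The net effect is to remove everything not in the set 0123456789.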
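Applied to the rule from the question, the same double-translate() idiom might look like the sketch below. It assumes XSLT 1.0, a hypothetical element name `field` for the restricted element, and an identity template for everything else. Because translate() has no notion of ranges, the allowed characters are spelled out in full in a variable named `allowed` (also just an illustrative name).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Every allowed character, listed explicitly. The list ends with a
       space, a carriage return (&#13;) and a line feed (&#10;). -->
  <xsl:variable name="allowed">0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/-?:().,'+ &#13;&#10;</xsl:variable>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- "field" is a hypothetical element name; substitute the real one.
       The inner translate() yields the disallowed characters found in the
       text, and the outer translate() deletes them. -->
  <xsl:template match="field/text()">
    <xsl:value-of select="translate(., translate(., $allowed, ''), '')"/>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, a `field` containing `AB_12#x` would come out as `AB12x`. Keeping the list in a variable avoids having to quote the apostrophe inside an XPath string literal.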
The other answer shows a way of doing it using replace() and a regular expression.
If you have control over whatever generates the XML, I would look there for a solution.
Upvotes: 2
Reputation: 26
You can use replace() as hinted above. Using your regular expression for valid characters, you could try this:
replace($string,"[^0-9a-zA-Z/\-\?:\(\)\.,'\+ \r\n]+","")
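You can see that your regular expression is almost as it was, except that ^ has been added at the start of the character class to turn the set of valid characters into its complement (and the outer parentheses, which are not needed here, have been dropped).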
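For context, here is a rough sketch of how that replace() call could sit inside a stylesheet. It assumes an XSLT 2.0 processor and, again, a hypothetical element name `field`. The one thing to watch is quoting: the pattern contains an apostrophe, so in this sketch the select attribute is delimited with single quotes, the XPath string literals use double quotes, and the apostrophe itself is written as `&apos;`.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- "field" is a hypothetical element name; substitute the real one.
       Any run of characters outside the allowed class is replaced by an
       empty string. &apos; is resolved to ' by the XML parser before the
       XPath expression is evaluated. -->
  <xsl:template match="field/text()">
    <xsl:value-of
        select='replace(., "[^0-9a-zA-Z/\-\?:\(\)\.,&apos;\+ \r\n]+", "")'/>
  </xsl:template>

</xsl:stylesheet>
```

Compared with the translate() approach, this keeps the consuming system's rule recognisable in the stylesheet, at the cost of requiring XSLT 2.0.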
Upvotes: 1