Reputation: 2430
I have written a C# application that loads XML files, parses them and uses the information to run SQL queries and send the results to email distribution lists.
These XML files are usually created by END users.
Currently I have them replacing > and < with &gt; and &lt; in the SQL, but of course, being END users, they sometimes forget. In fact they ALWAYS forget. I'd prefer to keep the query in an XML file. So, is there ANY way to force/allow the use of these special characters in XML files?
Right now my user must type this:
<?xml version="1.0" encoding="utf-8" ?>
<report>
<queries>
<query>
SELECT * FROM THETABLE WHERE THEVALUE &gt; 100
</query>
</queries>
</report>
I'd like them to be able to type this:
<?xml version="1.0" encoding="utf-8" ?>
<report>
<queries>
<query>
SELECT * FROM THETABLE WHERE THEVALUE > 100
</query>
</queries>
</report>
Upvotes: 1
Views: 133
Reputation: 39055
You can preprocess the file with a regular expression that looks for < and > characters that don't belong to a tag, and replace them accordingly.
You can use this regex:
(?sx)
\s*
(?:<\?.*?\?>)(?:\s*)
(?:
(?:<[^\s]*?>)\s*
|(?:[^<>]*\s)
|(?<lt><)
|(?<gt>>)
)*
\s*
(Be aware that you must use the single-line and ignore-pattern-whitespace options, as established by (?sx).)
This expression captures the less-than and greater-than symbols that don't belong to tags in the lt and gt groups.
You can then replace the captured characters.
If you want to know how it works, this captures everything in named groups:
(?sx)
\s*
(?<head><\?.*?\?>)(?:\s*)
(?:
(?<tag><[^\s]*?>)\s*
|(?<others>[^<>]*\s)
|(?<lt><)
|(?<gt>>)
)*
\s*
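As a simpler illustration of the same preprocessing idea, here is a minimal Python sketch (the asker's application is C#, so treat it as a sketch of the approach, not drop-in code). It swaps in a different technique from the regex above: instead of classifying every < and > in the document, it escapes only the body of each <query> element before the file reaches the XML parser. It assumes SQL appears only inside <query> elements, as in the question's sample; the helper name escape_query_bodies is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# Assumption: SQL only ever appears between <query> and </query>,
# and the SQL itself never contains the literal text "</query>".
QUERY_BODY = re.compile(r"(<query>)(.*?)(</query>)", re.DOTALL)

def escape_query_bodies(raw_xml: str) -> str:
    """Escape &, < and > inside every <query> element (hypothetical helper)."""
    def fix(match):
        open_tag, body, close_tag = match.groups()
        # Unescape first so bodies that users already escaped by hand
        # (&lt;, &gt;, &amp;) are not escaped a second time.
        return open_tag + escape(unescape(body)) + close_tag
    return QUERY_BODY.sub(fix, raw_xml)

raw = """<report>
<queries>
<query>
SELECT * FROM THETABLE WHERE THEVALUE < 100
</query>
</queries>
</report>"""

fixed = escape_query_bodies(raw)
root = ET.fromstring(fixed)
sql = root.find("./queries/query").text.strip()
```

Because the helper unescapes before escaping, running it on a file that is already correctly escaped should leave that file unchanged.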
Upvotes: 0
Reputation: 16623
Use CDATA. So:
<query><![CDATA[SELECT * FROM THETABLE WHERE THEVALUE > 100]]></query>
The parser does not interpret the text inside a CDATA section as markup; it is passed through as literal character data.
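To make the difference concrete, here is a small Python sketch using the standard library's xml.etree.ElementTree (an illustration only; the asker's C# parser is not shown in this thread). It parses the same query with and without CDATA. Note that XML parsers accept a bare > in element text, so it is the bare < (and &) that actually breaks parsing.

```python
import xml.etree.ElementTree as ET

bare_lt = "<query>SELECT * FROM THETABLE WHERE THEVALUE < 100</query>"
bare_gt = "<query>SELECT * FROM THETABLE WHERE THEVALUE > 100</query>"
wrapped = "<query><![CDATA[SELECT * FROM THETABLE WHERE THEVALUE < 100]]></query>"

# A bare '<' in element text is not well-formed XML, so parsing fails.
try:
    ET.fromstring(bare_lt)
    bare_lt_parsed = True
except ET.ParseError:
    bare_lt_parsed = False

# A bare '>' in element text is accepted.
gt_sql = ET.fromstring(bare_gt).text

# Inside CDATA, '<' is returned as ordinary text.
cdata_sql = ET.fromstring(wrapped).text
```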
Upvotes: 1
Reputation: 2458
You would need to surround the text with CDATA so it looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<report>
<queries>
<query>
<![CDATA[SELECT * FROM THETABLE WHERE THEVALUE > 100]]>
</query>
</queries>
</report>
This tells the parser that everything between <![CDATA[ and ]]> should be treated as text and not interpreted as markup.
Upvotes: 1
Reputation: 20693
Use CDATA; text inside CDATA is not parsed as markup. Something like this:
<query><![CDATA[SELECT * FROM THETABLE WHERE THEVALUE > 100]]></query>
Upvotes: 1
Reputation: 726987
You can wrap your queries in CDATA:
<?xml version="1.0" encoding="utf-8" ?>
<report>
<queries>
<query><![CDATA[
SELECT * FROM THETABLE WHERE THEVALUE > 100
]]></query>
</queries>
</report>
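One thing to watch with this multi-line layout: the line breaks right after <![CDATA[ and before ]]> become part of the query text. A minimal Python sketch (illustrative only, since the asker's loader is C#) that reads every <query> and trims that whitespace might look like this; the second query is a made-up example added to show a < inside CDATA.

```python
import xml.etree.ElementTree as ET

doc = """<report>
<queries>
<query><![CDATA[
SELECT * FROM THETABLE WHERE THEVALUE > 100
]]></query>
<query><![CDATA[
SELECT * FROM THETABLE WHERE THEVALUE < 5
]]></query>
</queries>
</report>"""

root = ET.fromstring(doc)

# The raw text keeps the newlines that surround the SQL inside CDATA.
first_raw = root.find("./queries/query").text

# Trim leading and trailing whitespace before handing the SQL to the database.
queries = [q.text.strip() for q in root.iter("query")]
```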
Upvotes: 4