carny666

Reputation: 2430

Is there any way to FORCIBLY allow the usage of '<' and/or '>' in XML files?

I have written a C# application that loads XML files, parses them and uses the information to run SQL queries and send the results to email distribution lists.

These XML files are usually created by END users.

Currently I have them replacing > and < with &gt; and &lt; in the SQL; of course, being END users, they sometimes forget. In fact they ALWAYS forget. I'd prefer to keep the query in an XML file. So, is there ANY way to force/allow the use of these special characters in XML files?

Right now my user must type this:

<?xml version="1.0" encoding="utf-8" ?>
<report>
  <queries>
    <query>
       SELECT * FROM THETABLE WHERE THEVALUE &gt; 100
    </query>
  </queries>
</report>

I'd like them to be able to type this:

<?xml version="1.0" encoding="utf-8" ?>
<report>
  <queries>
    <query>
       SELECT * FROM THETABLE WHERE THEVALUE > 100
    </query>
  </queries>
</report>

Upvotes: 1

Views: 133

Answers (5)

JotaBe

Reputation: 39055

You can preprocess the file with a regular expression that looks for < and > characters that don't belong to a tag, and replace them accordingly.

You can use this regex:

    (?sx)
    \s*
    (?:<\?.*?\?>)(?:\s*)
    (?:
     (?:<[^\s]*?>)\s*
     |(?:[^<>]*\s)
     |(?<lt><)
     |(?<gt>>)
    )*
    \s*

(Be aware that you must use the single-line and ignore-whitespace options, as established by (?sx).)

This expression captures the less-than and greater-than symbols that don't belong to tags in the lt and gt groups.

You can replace the matches.

If you want to know how it works, this captures everything in named groups:

    (?sx)
    \s*
    (?<head><\?.*?\?>)(?:\s*)
    (?:
     (?<tag><[^\s]*?>)\s*
     |(?<others>[^<>]*\s)
     |(?<lt><)
     |(?<gt>>)
    )*
    \s*
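As a rough illustration of the preprocessing idea, here is a minimal Python sketch. It is not the regex above (which relies on .NET named groups and repeated captures); it swaps in a simpler technique: find tag-like tokens, leave them alone, and escape any < or > found between them. The function name `escape_stray_brackets` and the `TAG` pattern are hypothetical, and the pattern assumes tags have no attributes or inner whitespace, the same assumption the `<[^\s]*?>` part of the answer's regex makes.

```python
import re

# Tag-like tokens: the XML declaration, or <name>, </name>, <name/>
# with no whitespace or attributes inside (an assumption of this sketch).
TAG = re.compile(r"<\?.*?\?>|</?[A-Za-z_][\w.\-]*/?>", re.S)


def _escape(chunk):
    # Only the angle brackets are touched, so existing entities such as
    # &gt; are left as they are.
    return chunk.replace("<", "&lt;").replace(">", "&gt;")


def escape_stray_brackets(text):
    """Escape < and > that are not part of a tag-like token."""
    out = []
    pos = 0
    for match in TAG.finditer(text):
        out.append(_escape(text[pos:match.start()]))  # text between tags
        out.append(match.group(0))                    # the tag, unchanged
        pos = match.end()
    out.append(_escape(text[pos:]))                   # trailing text
    return "".join(out)


sample = "<query>\n  SELECT * FROM THETABLE WHERE THEVALUE > 100\n</query>"
print(escape_stray_brackets(sample))
```

Like any regex-based approach, this is a heuristic: SQL that happens to contain something shaped like a tag (for example `a <b> c` written without spaces) would be left unescaped, so it is a fallback for sloppy input and not a replacement for well-formed XML.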

Upvotes: 0

Omar

Reputation: 16623

Use CDATA. So:

<query><![CDATA[SELECT * FROM THETABLE WHERE THEVALUE > 100]]></query>

The text inside a CDATA section is not interpreted as markup by the parser; it is passed through as literal character data.

Upvotes: 1

kufi

Reputation: 2458

You would need to surround the text with CDATA so it looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<report>
  <queries>
    <query>
       <![CDATA[SELECT * FROM THETABLE WHERE THEVALUE > 100]]>
    </query>
  </queries>
</report>

This tells the parser that everything between <![CDATA[ and ]]> should be treated as text and should not be interpreted as markup.
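To see that effect concretely, here is a small Python sketch using the standard library's `xml.etree.ElementTree` (the question's application is C#, but the behaviour is a property of XML itself, so any conforming parser should hand back the same text). The `load_queries` helper and the two sample documents are made up for illustration. It parses one document that uses CDATA and one that uses `&gt;`, and shows that both yield the same SQL string once the surrounding whitespace is trimmed.

```python
import xml.etree.ElementTree as ET

# The CDATA form from the answer above.
CDATA_DOC = """<report>
  <queries>
    <query>
       <![CDATA[SELECT * FROM THETABLE WHERE THEVALUE > 100]]>
    </query>
  </queries>
</report>"""

# The entity-escaped form from the question.
ESCAPED_DOC = """<report>
  <queries>
    <query>
       SELECT * FROM THETABLE WHERE THEVALUE &gt; 100
    </query>
  </queries>
</report>"""


def load_queries(xml_text):
    """Return the trimmed text of every <query> element."""
    root = ET.fromstring(xml_text)
    # Whitespace around the CDATA section is part of the element text,
    # so it is stripped before the SQL is used.
    return [(q.text or "").strip() for q in root.iter("query")]


print(load_queries(CDATA_DOC))
print(load_queries(ESCAPED_DOC))
```

Both calls produce the same one-element list, so code that consumes the queries does not need to know which form the user typed.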

Upvotes: 1

Antonio Bakula

Reputation: 20693

Use CDATA; text inside a CDATA section is not parsed as markup. Something like this:

<query><![CDATA[SELECT * FROM THETABLE WHERE THEVALUE > 100]]></query>

Upvotes: 1

Sergey Kalinichenko

Reputation: 726987

You can wrap your queries in CDATA:

<?xml version="1.0" encoding="utf-8" ?>
<report>
  <queries>
    <query><![CDATA[
       SELECT * FROM THETABLE WHERE THEVALUE > 100
    ]]></query>
  </queries>
</report>

Upvotes: 4
