User27854

Reputation: 884

How to avoid repeated parsing of XML special characters in JAVA

I am trying to escape XML special characters in a String. The escaping is taken care of by a static method, as shown below.

public static String escapeXml10(String response) {
    return StringEscapeUtils.escapeXml10(response);
}

The issue with this implementation is that the string I receive may or may not already have been parsed, which leads to inconsistent output.

For example:

  1. & -> &amp; (This string has not been parsed; it follows a different flow.)
  2. &amp; -> &amp;amp; (This string was parsed once by another flow and is now parsed again by my code.)

To get a proper response, I am planning to introduce a check in the static method, using an if condition as follows.

public static String escapeXml10(String response) {
    if (response.contains("&amp;")
            || response.contains("&lt;")
            || response.contains("&gt;")
            || response.contains("&apos;")
            || response.contains("&quot;")) {
        return response;
    } else {
        return StringEscapeUtils.escapeXml10(response);
    }
}

Is this a correct way to implement it? If not, please suggest an alternative.

Upvotes: 1

Views: 797

Answers (1)

Little Santi

Reputation: 8783

  • 1st: What you are doing is escaping, not parsing.
  • 2nd: @JBNizet is right: you have a design issue here. You must know what kind of data you will receive in your input parameters: whether it is already-escaped, valid XML or unescaped text.
  • 3rd: As a general rule, all user data should be processed by the program in its plain (unformatted) form, and with the most specific datatypes: int for integer numbers, float or double for decimals, String for texts, etc. Formatting should then be applied only when serializing that data. For example, before serializing to XML, nodes and attributes must be properly placed to form a specific data structure, and user data must be escaped so that no special characters occur unescaped within it. Conversely, when reading an XML document (= parsing), the data must be unescaped (but this is already done by the parser).
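To illustrate the third point, here is a minimal sketch that assumes the text is kept in plain form and only escaped by the serializer. The class name XmlWriteSketch, the method toXml and the element name "response" are made up for this example; they are not taken from the question.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper for illustration only; names are not from the question.
class XmlWriteSketch {

    // Accepts plain, unescaped text and lets the StAX writer escape it
    // at the moment of serialization.
    static String toXml(String plainText) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("response");
        writer.writeCharacters(plainText); // special characters are escaped here, once
        writer.writeEndElement();
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // The ampersand in the plain text should appear as &amp; in the output.
        System.out.println(toXml("Tom & Jerry"));
    }
}
```

Because the method only ever sees plain text, there is no need to guess whether the input was escaped before, so the double escaping from the question cannot happen here.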

Conclusion: You shouldn't even need to care about escaping if you use standard XML parsers (DocumentBuilderFactory, SAXParser, XMLInputFactory) and serializers (TransformerFactory, XMLOutputFactory). Neither should your client apps.
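As a rough sketch of the reading side, assuming the input is a small, trusted XML string: a standard parser hands back the text already unescaped. The class name XmlReadSketch and the method readText are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; names are not from the question.
class XmlReadSketch {

    // Parses an XML string and returns the text content of its root element.
    // The parser resolves entities such as &amp; on its own.
    static String readText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // prints: Tom & Jerry
        System.out.println(readText("<response>Tom &amp; Jerry</response>"));
    }
}
```

The calling code never unescapes anything by hand; it works with the plain value that the parser returns.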

Upvotes: 1
