Eddy

Reputation: 133

Escape valid XML characters in Java

I have a String that looks like this:

<tag1><tag2>Text</tag2> > AnotherText</tag1>

I am using XMLOutputFactoryImpl to write that XML into a String. However, I also need the single "greater than" sign (right before AnotherText) to be escaped, even though it is legal to leave it unescaped there.

Do you have any ideas on how I need to configure my OutputFactory to get this working?

Upvotes: 0

Views: 497

Answers (3)

Evgeniy Dorofeev

Reputation: 135992

I cannot reproduce your problem. Here is my code (I use the default StAX implementation from rt.jar):

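    import javax.xml.stream.XMLOutputFactory;
    import javax.xml.stream.XMLStreamWriter;

    // Inside a method that declares `throws XMLStreamException`:
    XMLOutputFactory of = XMLOutputFactory.newInstance();
    System.out.println(of.getClass()); // shows which StAX implementation is in use
    XMLStreamWriter ow = of.createXMLStreamWriter(System.out);
    ow.writeStartElement("tag1");
    ow.writeStartElement("tag2");
    ow.writeCharacters("Text");
    ow.writeEndElement(); // closes tag2
    ow.writeCharacters("> AnotherText"); // text content containing the lone '>'
    ow.writeEndElement(); // closes tag1
    ow.close();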

Output:

<tag1><tag2>Text</tag2>&gt; AnotherText</tag1>

Upvotes: 1

topcat3

Reputation: 2642

You can use the Apache Commons Lang library to escape a string:

    import org.apache.commons.lang.StringEscapeUtils;

    String escapedXml = StringEscapeUtils.escapeXml("the data might contain & or ! or % or ' or # etc");

Updated answer:

The best solution is to fix the program generating your text input. The easiest such fix would involve an escape utility, as the other answers suggested. If that's not an option, I'd use a regular expression like

</?[a-zA-Z]+ */?>

to match the expected tags, and then split the string up into tags (which you want to pass through unchanged) and the text between tags (to which you want to apply an escape method).

I wouldn't count on an XML parser to be able to do it for you because what you're dealing with isn't valid XML. It is possible for the existing lack of escaping to produce ambiguities, so you might not be able to do a perfect job either.
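The tag/text split described above could be sketched roughly as follows. This is only an illustration under some assumptions: the class name `TagAwareEscaper` and the helpers `escapeText` and `escapeBetweenTags` are hypothetical, and the tag pattern here additionally allows digits after the first letter so that names such as `tag1` are recognized. It handles only simple tags without attributes, and it would double-escape text that already contains entities such as `&amp;`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagAwareEscaper {

    // Assumed variant of the tag pattern: opening, closing, or self-closing
    // tags without attributes; digits are allowed after the first letter.
    private static final Pattern TAG =
            Pattern.compile("</?[a-zA-Z][a-zA-Z0-9]* */?>");

    // Hypothetical helper: escapes the XML special characters in text content.
    public static String escapeText(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Copies recognized tags through unchanged and escapes everything between them.
    public static String escapeBetweenTags(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            out.append(escapeText(input.substring(last, m.start()))); // text before the tag
            out.append(m.group());                                    // the tag itself
            last = m.end();
        }
        out.append(escapeText(input.substring(last)));                // trailing text
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(
                escapeBetweenTags("<tag1><tag2>Text</tag2> > AnotherText</tag1>"));
    }
}
```

For the string from the question, this passes the four tags through and turns the lone `>` into `&gt;`. Anything the pattern does not recognize as a tag is treated as text and escaped, which is where the ambiguities mentioned above can show up.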

Upvotes: -1

Aravind Yarram

Reputation: 80166

If you are using an XML API (DOM, StAX, or JAXB), then the content will be escaped for you automatically. You can also use a CDATA section for this.
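As a minimal sketch of the CDATA option with StAX (the class name `CDataExample` and the method `writeWithCData` are made up for illustration), the text can be written with `XMLStreamWriter.writeCData` into a `StringWriter`, so the result ends up in a String:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class CDataExample {

    // Hypothetical helper: wraps the given text in a CDATA section inside <tag1>.
    public static String writeWithCData(String text) {
        StringWriter sw = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(sw);
            w.writeStartElement("tag1");
            w.writeCData(text);   // written as a CDATA section rather than as escaped text
            w.writeEndElement();
            w.close();            // does not close the underlying StringWriter
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeWithCData("> AnotherText"));
    }
}
```

Note that a CDATA section leaves the `>` as a literal character instead of turning it into `&gt;`, so this only helps if the consumer accepts CDATA. Text that itself contains `]]>` cannot go into a single CDATA section and would need to be split.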

Upvotes: 1
