neptune

Reputation: 1420

How to unescape non-standard characters in XML in Java?

I realize a similar question has been asked before, and the solution there was to use StringEscapeUtils.unescapeXml(). However, per the method description:

Supports only the five basic XML entities (gt, lt, quot, amp, apos). Does not support DTDs or external entities.

I have a bunch of XML files with escaped characters like &blank; and &hyph;. How can I unescape these? They are defined in the DTD provided. Is there a method like StringEscapeUtils, but with DTD support?

Upvotes: 2

Views: 1046

Answers (1)

Nathan Ryan

Reputation: 13041

Hmm, it's been a long time, but I think an implementation of EntityResolver2 (org.xml.sax.ext, included in the Java SDK) lets the parser load externally defined entities, such as the ones declared in your DTD. It is part of the SAX2 extensions.
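To make that a bit more concrete, here is a minimal sketch of letting the parser do the unescaping. It uses the plain org.xml.sax.EntityResolver interface rather than EntityResolver2 to keep things short. The class name, the "entities.dtd" system ID, and the two entity definitions (including the code points chosen for them) are assumptions made up for illustration; in practice the resolver would return your real DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DtdEntityDemo {
    // Hypothetical stand-in for the real DTD; the code points are assumptions.
    static final String DTD =
        "<!ENTITY blank \"&#x2423;\">\n" +
        "<!ENTITY hyph \"&#x2010;\">\n";

    // Wraps the escaped text in a tiny document whose DOCTYPE points at the DTD,
    // then lets the parser expand the entity references.
    static String unescape(String escaped) throws Exception {
        String xml = "<!DOCTYPE root SYSTEM \"entities.dtd\">\n"
                   + "<root>" + escaped + "</root>";

        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Serve the DTD from memory instead of fetching the system ID.
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader(DTD)));

        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(unescape("a&blank;b&hyph;c"));
    }
}
```

If your files already carry a DOCTYPE that references the DTD, you can skip the wrapping step and parse them directly; the resolver is then only needed when the DTD does not live at the location the system ID points to.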

Upvotes: 0
