Reputation: 14317
I am reading a document which may contain XML entities like &#160;.
Since I need to export a txt file, I have to convert the entities from XML to text manually, as you can see below.
reader = new BufferedReader(new InputStreamReader(is, "utf-8"));
while ((s = reader.readLine()) != null) {
    if (s.equals("&#160;"))
        s = " ";
}
Since there are many XML entities and I want to convert them all to text (for example &#160; -> space), and I would prefer to avoid a chain of if statements, is there a generic way to do it?
Upvotes: 0
Views: 674
Reputation: 43867
I believe what you're talking about is called HTML (not XML) decoding. There is a URLDecoder class, but it decodes URL (percent) encoding rather than entities, so it only helps if that is what you're actually decoding. There is also a more general class in Apache Commons for HTML decoding (specified in this question).
Edit: I was unaware of the difference between HTML and XML escapes/entities, thanks for the clarification. It appears from this question that Apache Commons has a library for decoding XML entities but the standard Java library does not.
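If pulling in Apache Commons is not an option (there, the class usually used for this is StringEscapeUtils with its unescapeXml method), a generic decoder can be sketched with the standard library alone. The class name XmlEntityDecoder and its decode method below are made up for illustration. The sketch assumes only the five entities predefined by XML plus decimal and hexadecimal numeric references need to be handled, and it leaves anything else untouched.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class XmlEntityDecoder {
    // The five entities predefined by XML itself.
    private static final Map<String, String> NAMED = Map.of(
            "amp", "&", "lt", "<", "gt", ">", "quot", "\"", "apos", "'");

    // Matches &#160; (decimal), &#xA0; (hex) or &name;
    private static final Pattern REF =
            Pattern.compile("&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);");

    static String decode(String input) {
        Matcher m = REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one.
            out.append(input, last, m.start());
            out.append(resolve(m.group(1), m.group()));
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    // Returns the decoded text, or the original reference if it is unknown.
    private static String resolve(String body, String original) {
        if (body.charAt(0) != '#') {
            return NAMED.getOrDefault(body, original);
        }
        boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
        try {
            int cp = hex ? Integer.parseInt(body.substring(2), 16)
                         : Integer.parseInt(body.substring(1), 10);
            return Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : original;
        } catch (NumberFormatException e) {
            return original; // number does not fit in an int
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("1 &lt; 2 &amp;&amp; &#x41;&#160;B"));
    }
}
```

Note that &#160; decodes to the no-break space U+00A0, not to an ordinary space. If the exported txt file should contain ordinary spaces, something like decode(s).replace('\u00A0', ' ') would be needed afterwards.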
Upvotes: 1
Reputation: 2354
When you extract the number from &#160;, you can do this:
(new String(new byte[]{(byte)160}, "ISO-8859-1")).
Here are the entity mappings: HTML ISO-8859-1 Reference
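The single-byte trick above only covers codes 0 to 255, since that is all ISO-8859-1 can represent. As a rough sketch of the whole step, the hypothetical NumericReference class below (names invented for this example) extracts the decimal number from one reference and converts it both ways; the Character.toChars variant also works for larger code points such as &#8364;. It assumes decimal references only, so &#xA0; would be rejected.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class NumericReference {
    // Extracts the decimal number from a reference such as "&#160;".
    static int codeOf(String ref) {
        if (!ref.startsWith("&#") || !ref.endsWith(";")) {
            throw new IllegalArgumentException("not a numeric reference: " + ref);
        }
        return Integer.parseInt(ref.substring(2, ref.length() - 1));
    }

    // The approach from the answer: one byte interpreted as ISO-8859-1.
    static String viaLatin1(int code) {
        if (code < 0 || code > 255) {
            throw new IllegalArgumentException("outside ISO-8859-1: " + code);
        }
        return new String(new byte[] {(byte) code}, StandardCharsets.ISO_8859_1);
    }

    // Treats the number as a Unicode code point, so it is not limited to 255.
    static String viaCodePoint(int code) {
        return new String(Character.toChars(code));
    }

    public static void main(String[] args) {
        int code = codeOf("&#160;");
        // Both conversions agree for codes in the ISO-8859-1 range.
        System.out.println(viaLatin1(code).equals(viaCodePoint(code)));
    }
}
```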
Upvotes: 2