Reputation: 230
I have a problem writing an XML file as UTF-8 in Java. Problem: I have a file whose name contains an interpunct (middot, ·). When I write the filename inside an XML tag from Java code, I get a junk number in the filename instead of ·
OutputStreamWriter osw =new OutputStreamWriter(file_output_stream,"UTF8");
Above is the Java code I use to write the XML file. Can anybody help me understand why this happens and how to fix it? Thanks in advance.
Upvotes: 0
Views: 2862
Reputation: 46876
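Java strings are UTF-16 internally, but javac reads source files in the platform default encoding unless told otherwise (UTF-8 is the default since JDK 18). If your character might not survive that encoding, use a Unicode escape: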
String a = "\u00b7";
Or tell your compiler to use UTF-8 (for example, javac -encoding UTF-8) and simply write the character in the code as-is.
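To check that the escape really yields the middot and that the writer from the question encodes it as UTF-8, a minimal sketch like the following can help. The class and method names are made up for illustration, and a ByteArrayOutputStream stands in for the question's file_output_stream so the bytes can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
class MiddotEscapeDemo {

    // Pushes the text through an OutputStreamWriter configured for UTF-8
    // and returns the raw bytes that would end up in the file.
    static byte[] encodeUtf8(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // The Unicode escape keeps the source file pure ASCII,
        // so the compiler's source encoding no longer matters.
        String middot = "\u00b7";
        for (byte b : encodeUtf8(middot)) {
            System.out.printf("%02X ", b & 0xFF); // prints: C2 B7
        }
        System.out.println();
    }
}
```

If the bytes are C2 B7, the writer side is fine and any remaining corruption comes from how the string was produced or from how the file is read back.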
Upvotes: 2
Reputation: 425198
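That character is code point 183 (decimal), U+00B7, which is outside the ASCII range, so you can escape the character as the numeric character reference &#183;. Here is a demonstration: if I type "&#183;" into this answer, I get "·".
The browser prints your character because the page's markup resolves numeric character references the same way an XML parser does.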
There are utility methods that can do this for you, such as StringEscapeUtils.escapeXml() in the Apache commons-lang library, which will correctly and safely escape the entire input.
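To illustrate what such escaping produces, here is a rough sketch using only core Java. The escapeForXml helper is hypothetical and is not the commons-lang implementation; it only covers the five XML special characters plus non-ASCII code points, and it does not reject control characters that XML forbids.

```java
// Hypothetical demo class; a library routine is preferable in real code.
class XmlCharRefDemo {

    // Replaces XML special characters with named entities and every
    // non-ASCII code point with a decimal numeric character reference.
    static String escapeForXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp); // step over surrogate pairs as one unit
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    if (cp > 0x7F) {
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a&#183;b &amp; c.txt
        System.out.println(escapeForXml("a\u00b7b & c.txt"));
    }
}
```

The output stays pure ASCII, so it reads back correctly regardless of the encoding declared in the XML file.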
Upvotes: 1
Reputation: 269797
Don't try to create XML by hand. Use a library for the purpose. You are just scratching the surface of the heap of special cases that will break a hand-made solution.
One way, using core Java classes, is to create a DOM and then serialize it with a no-op XSL transform that writes to a StreamResult. (If your document is large, you can do something similar by driving a SAX event handler.)
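The DOM approach might look roughly like the sketch below. The element name "file", the class name, and both method names are assumptions for illustration, and a ByteArrayOutputStream again stands in for the real file stream. The readBack method parses the result to show that the filename survives a round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class showing DOM building plus an identity transform.
class DomWriteDemo {

    // Builds <file>NAME</file> as a DOM and serializes it as UTF-8 bytes.
    static byte[] buildXml(String fileName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("file");
            root.setTextContent(fileName); // the serializer handles escaping
            doc.appendChild(root);

            // newTransformer() with no stylesheet gives an identity (no-op) transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            identity.transform(new DOMSource(doc), new StreamResult(bytes));
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses the bytes again and returns the root element's text.
    static String readBack(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = buildXml("report\u00b7final.txt");
        System.out.println(new String(xml, StandardCharsets.UTF_8));
        System.out.println(readBack(xml).equals("report\u00b7final.txt"));
    }
}
```

Passing a stream rather than a Writer to StreamResult lets the transformer apply the encoding itself, so the XML declaration and the actual bytes cannot disagree.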
There are many third-party libraries that will help you do the same thing very easily.
Upvotes: 0
Reputation: 109597
In general it is a good idea to use UTF-8 everywhere.
The editor has to know that the source is in UTF-8. You could use the free programmer's editor jEdit, which can deal with many encodings.
The javac compiler has to know that the Java source is in UTF-8. In Java you can use the solution of @OndraŽižka.
This makes for two settings in your IDE.
Upvotes: 0