Barun

Reputation: 1915

Problem reading a Unicode-encoded file in Java (JDK 1.3)

I am using JDK 1.3 for the BlackBerry platform. I am facing a problem when trying to read a Unicode-encoded XML file.

My code:

java.io.BufferedReader br = new java.io.BufferedReader(new java.io.InputStreamReader(new java.io.FileInputStream(path),"UTF16"));
br.readLine();

Error:

sun.io.MalformedInputException: Missing byte-order mark
    at sun.io.ByteToCharUnicode.convert(ByteToCharUnicode.java:123)
    at java.io.InputStreamReader.convertInto(InputStreamReader.java:137)
    at java.io.InputStreamReader.fill(InputStreamReader.java:186)
    at java.io.InputStreamReader.read(InputStreamReader.java:249)
    at java.io.BufferedReader.fill(BufferedReader.java:139)
    at java.io.BufferedReader.readLine(BufferedReader.java:299)
    at java.io.BufferedReader.readLine(BufferedReader.java:362)

Thanks

Upvotes: 0

Views: 1133

Answers (2)

ilalex

Reputation: 3078

Try this code:

java.io.BufferedReader br = new java.io.BufferedReader(new java.io.InputStreamReader(new java.io.FileInputStream(path),"Windows-1256"));
br.readLine();

Upvotes: 0

Mat

Reputation: 206699

Your XML file is missing a byte order mark (BOM).

In JDK 1.3, the byte order mark is mandatory if you use UTF-16. Try UTF-16LE or UTF-16BE instead if you know the endianness in advance.

(The BOM is not mandatory in 1.4.2 and above.)
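
As a rough illustration of that suggestion, here is a minimal sketch that decodes BOM-less UTF-16 by naming the byte order explicitly. The class name `Utf16NoBom` and the helper `firstLine` are made up for this example, and it reads from an in-memory byte array rather than a file so it is self-contained. It assumes your platform accepts the charset names "UTF-16LE" and "UTF-16BE"; an old or embedded JDK may use different names, so check its documentation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class Utf16NoBom {
    // Hypothetical helper: reads the first line of a UTF-16 stream that has
    // no byte order mark. The caller must state the byte order.
    public static String firstLine(InputStream in, boolean bigEndian) throws IOException {
        // Assumption: the platform supports these charset names.
        String charset = bigEndian ? "UTF-16BE" : "UTF-16LE";
        BufferedReader br = new BufferedReader(new InputStreamReader(in, charset));
        try {
            return br.readLine();
        } finally {
            br.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // The text "<a/>" encoded as little-endian UTF-16, without a BOM.
        byte[] le = {0x3C, 0x00, 0x61, 0x00, 0x2F, 0x00, 0x3E, 0x00};
        System.out.println(firstLine(new ByteArrayInputStream(le), false));
    }
}
```

For a real file you would pass `new java.io.FileInputStream(path)` instead of the byte array. If you guess the wrong byte order, you will not get an exception, just garbage characters.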

Of course, if your file is not UTF-16 at all, use the correct encoding. Your JDK's list of supported character encodings shows the names it accepts. Apart from a small set of core encodings, the encodings actually supported are implementation-defined, so you'll need to check the docs for your particular JDK.

The encoding of the file is supposed to be stated in the XML declaration at the top of the file, something like:

<?xml version="1.0" encoding="THIS IS THE ENCODING YOU NEED TO USE"?>

If the file is in a single-byte character encoding, or UTF-8 (without a BOM), you can try reading the first line as plain US-ASCII; the declaration shouldn't contain any data outside that range. Parse the encoding field, then re-open the file with the deduced encoding.

Obviously, this will only work if the actual encoding is supported by your platform.
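
The two-pass approach described above could be sketched roughly like this. The class `XmlEncodingSniffer`, its methods, and the `fallback` parameter are all hypothetical names for illustration. The example works on a byte array to stay self-contained; with a file you would open a new `FileInputStream(path)` for each pass. The parsing is deliberately naive and only handles a declaration that sits entirely on the first line.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class XmlEncodingSniffer {
    // Returns the value of encoding="..." from an XML declaration line,
    // or null if the line is not a declaration or names no encoding.
    public static String declaredEncoding(String firstLine) {
        if (firstLine == null || !firstLine.startsWith("<?xml")) return null;
        int key = firstLine.indexOf("encoding");
        if (key < 0) return null;
        int eq = firstLine.indexOf('=', key);
        if (eq < 0) return null;
        int i = eq + 1;
        while (i < firstLine.length() && firstLine.charAt(i) == ' ') i++;
        if (i >= firstLine.length()) return null;
        char quote = firstLine.charAt(i);
        if (quote != '"' && quote != '\'') return null;
        int end = firstLine.indexOf(quote, i + 1);
        if (end < 0) return null;
        return firstLine.substring(i + 1, end);
    }

    // Pass 1: read the first line as US-ASCII and look for the declared encoding.
    // Pass 2: re-open the same bytes with that encoding (or the fallback).
    public static BufferedReader open(byte[] xml, String fallback) throws IOException {
        BufferedReader probe = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(xml), "US-ASCII"));
        String enc;
        try {
            enc = declaredEncoding(probe.readLine());
        } finally {
            probe.close();
        }
        if (enc == null) enc = fallback;
        // Throws UnsupportedEncodingException if the platform lacks this encoding.
        return new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(xml), enc));
    }
}
```

Note that this does not help with a UTF-16 file: its bytes do not read as ASCII, so the first pass finds no declaration and the fallback is used.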

BTW: JDK 1.3 is ancient. Are you sure that's your version? (It doesn't change the problem anyway, except for the BOM part.)

Upvotes: 2
