Dragon Creature

Reputation: 1995

Read string from binary file, different encodings

I'm trying to read a binary file in Java (Android) that was created by a C# program, but I have run into a problem. By default, C# encodes strings in a binary file as UTF-7, while Java uses UTF-8. This of course means that the strings don't get loaded properly.

I was wondering how to read the strings as UTF-7 instead of UTF-8. I also noticed a similar problem with floats. Do C# and Java handle them differently, and if so, how do I read them correctly in Java?

Edit: I'm using the BinaryWriter class in the C# program and the DataInputStream class in Java.

Upvotes: 0

Views: 1008

Answers (1)

SLaks

Reputation: 887837

C#'s BinaryWriter uses UTF-8 encoding unless another encoding is specified, not UTF-7.

EDIT: The documentation here is incorrect.
Looking at the source, BinaryWriter writes the string length as a 7-bit encoded integer, using the following code:

    protected void Write7BitEncodedInt(int value) {
        // Write out an int 7 bits at a time.  The high bit of the byte, 
        // when on, tells reader to continue reading more bytes. 
        uint v = (uint) value;   // support negative numbers
        while (v >= 0x80) { 
            Write((byte) (v | 0x80));
            v >>= 7;
        }
        Write((byte)v); 
    }

You will need to port this code to Java in order to find out how many bytes to read.
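A minimal sketch of such a port, for illustration only. The class name `DotNetBinaryReader` and its method names are hypothetical, and the sketch assumes the file was written with BinaryWriter's default UTF-8 encoding. It reads the length prefix by reversing the loop above (low-order 7 bits first), then reads that many bytes and decodes them. It also includes a float reader, on the basis that BinaryWriter writes multi-byte values in little-endian order while DataInputStream reads them big-endian, which would explain the float problem mentioned in the question.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads values written by a .NET BinaryWriter.
final class DotNetBinaryReader {
    private final DataInputStream in;

    DotNetBinaryReader(InputStream source) {
        this.in = new DataInputStream(source);
    }

    // Reads an int stored 7 bits at a time, low-order group first.
    // A set high bit means another byte follows.
    int read7BitEncodedInt() throws IOException {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.readUnsignedByte();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("Malformed 7-bit encoded int");
    }

    // Reads a length-prefixed string; the prefix counts bytes, not chars.
    // Assumes the writer used UTF-8 (the BinaryWriter default).
    String readString() throws IOException {
        int length = read7BitEncodedInt();
        if (length < 0) {
            throw new IOException("Negative string length: " + length);
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Reads a 4-byte little-endian float by swapping the byte order
    // of the big-endian int that DataInputStream returns.
    float readFloat() throws IOException {
        return Float.intBitsToFloat(Integer.reverseBytes(in.readInt()));
    }
}
```

Usage would look like `new DotNetBinaryReader(new FileInputStream(path)).readString()`, where `path` is whatever file the C# program produced. `StandardCharsets` needs Java 7 or, on Android, API level 19; on older versions `new String(bytes, "UTF-8")` does the same job.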

Upvotes: 1
