PSSGCSim

Reputation: 1287

Equivalent of C#'s GetBytes function in Java

I have a problem converting a string to bytes in Java while porting my C# library to it. The string is converted, but the resulting byte array is not the same.

I use this code in C#:

string input = "Test ěščřžýáíé 1234";
Encoding encoding = Encoding.UTF8;
byte[] data = encoding.GetBytes(input);

And this code in Java:

String input = "Test ěščřžýáíé 1234";
String encoding = "UTF8";
byte[] data = input.getBytes(encoding);

The left one is the Java output and the right one is the C# output. How can I make the Java output the same as the C# one?

[image: Java output (left) and C# output (right)]

Upvotes: 1

Views: 1358

Answers (2)

Douglas

Reputation: 54907

In all likelihood, the byte arrays are the same. However, if you're formatting them to a string representation (e.g. to view through a debugger), they will appear different, because the byte data type is unsigned in C# (values 0 to 255) but signed in Java (values -128 to 127). Refer to this question and my answer for an explanation.

Edit: Based on this answer, you can print unsigned values in Java using:

byte b = -60;
System.out.println((short)(b & 0xFF));   // output: 196
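
To check this against the string from the question, here is a minimal sketch. The class name UnsignedView and the helper toUnsigned are made up for illustration, and it assumes Java 7 or later for StandardCharsets. The string is written with Unicode escapes only so that the source file's own encoding cannot affect the result.

```java
import java.nio.charset.StandardCharsets;

public class UnsignedView {
    // Hypothetical helper: reinterpret each signed Java byte as a value
    // in 0..255, which is how C# shows its unsigned byte type.
    public static int[] toUnsigned(byte[] data) {
        int[] out = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i] & 0xFF; // the mask keeps only the low 8 bits
        }
        return out;
    }

    public static void main(String[] args) {
        // "Test ěščřžýáíé 1234" spelled with Unicode escapes
        String input =
            "Test \u011b\u0161\u010d\u0159\u017e\u00fd\u00e1\u00ed\u00e9 1234";
        byte[] data = input.getBytes(StandardCharsets.UTF_8);
        int[] unsigned = toUnsigned(data);
        // Print the Java (signed) view next to the C#-style (unsigned) view
        for (int i = 0; i < data.length; i++) {
            System.out.println(data[i] + " -> " + unsigned[i]);
        }
    }
}
```

If the right-hand column matches what C# shows, the two arrays hold identical bytes and only the display differs.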

Upvotes: 3

fge

Reputation: 121820

These arrays are very probably the same.

You are hit by a big difference between C# and Java: in Java, byte is signed, whereas in C# it is unsigned.

In order to dump, try this:

public void dumpBytesToStdout(final byte[] array)
{
    for (final byte b: array)
        System.out.printf("%02X\n", b);
}

And write an equivalent dump method in C# (no idea how, I don't do C#).
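
One possible way to compare both sides directly: C#'s BitConverter.ToString(byte[]) returns uppercase hex pairs separated by hyphens (for example "C4-9B"), so the Java side can be made to print the same shape. The sketch below is an assumption about how one might do that, not something from the question; HexDump and toHex are hypothetical names.

```java
public class HexDump {
    // Hypothetical helper: format bytes as uppercase hex pairs joined by
    // hyphens, the same shape C#'s BitConverter.ToString(byte[]) returns.
    public static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append('-');
            }
            // Mask to 0..255 so negative bytes format as two hex digits
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = { (byte) 0xC4, (byte) 0x9B, 0x54 };
        System.out.println(toHex(sample)); // prints C4-9B-54
    }
}
```

The two strings can then be compared with a plain text diff, which avoids any signed versus unsigned confusion in a debugger view.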

Alternatively, if your dump function involves integer types larger than byte, for instance an int, do:

i & 0xff

to clear the sign-extended high bits. Note that if you cast the byte -1, whose bits are:

1111 1111

to an int, this will NOT give:

0000 0000 0000 0000 0000 0000 1111 1111

but:

1111 1111 1111 1111 1111 1111 1111 1111

i.e., the sign bit is extended into the upper bits (otherwise, casting would yield the int value 255, which is not -1).
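
The sign extension described above can be seen with a small sketch. SignExtension, widen and widenUnsigned are illustrative names, and Byte.toUnsignedInt assumes Java 8 or later.

```java
public class SignExtension {
    // Widening a byte to int copies the sign bit into the upper 24 bits.
    public static int widen(byte b) {
        return b;
    }

    // Masking with 0xFF clears those upper bits, leaving a value in 0..255.
    public static int widenUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = -1;
        // prints 11111111111111111111111111111111 (32 ones)
        System.out.println(Integer.toBinaryString(widen(b)));
        // prints 11111111 (leading zeros are not shown)
        System.out.println(Integer.toBinaryString(widenUnsigned(b)));
        // Java 8+ built-in that does the same masking; prints 255
        System.out.println(Byte.toUnsignedInt(b));
    }
}
```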

Upvotes: 2
