mbehzad

Reputation: 3878

Does an array of bytes with negative values lose information when converted to String?

I have code like the one below. To encode, I convert the letters to bytes, flip them with the unary bitwise complement operator ~, and finally convert the result to a String.

After that I want to decrypt it with a similar method. The problem is that two similar (but not identical) input Strings give me the same encoded String with the same hash code.

Does the String(byte[]) constructor lose information because the bytes are negative, or can I retrieve it somehow without changing the encryption part?

Thanks.

static String encrypt(String s) {
    byte[] bytes = s.getBytes();
    byte[] enc = new byte[bytes.length];

    for (int i = 0; i < bytes.length; i++) {
        enc[i] = (byte) ~bytes[i];
    }

    return new String(enc);
}

static String decrypt(String s) {
    ...

Upvotes: 2

Views: 2361

Answers (2)

Jon Skeet

Reputation: 1502786

You should never use new String(...) to encode arbitrary binary data. That's not what it's there for.

Additionally, you should only very rarely use the default platform encoding, which is what you get when you call String.getBytes() and new String(byte[]) without specifying an encoding.
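As a rough illustration of naming the encoding explicitly, here is a minimal sketch. The class name ExplicitCharsetDemo and its helper methods are made up for this example; only getBytes(Charset), new String(byte[], Charset) and StandardCharsets come from the JDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
class ExplicitCharsetDemo {
    // Text -> bytes with an explicitly named charset (no platform default involved).
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes that really are UTF-8 text -> String, again with the charset named.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "h\u00e9llo"; // "héllo", escaped so the source file encoding does not matter
        byte[] utf8 = toUtf8(original);
        System.out.println(utf8.length);                     // 6: the accented letter takes two bytes
        System.out.println(fromUtf8(utf8).equals(original)); // true
    }
}
```

Written this way, the result should not depend on which machine or locale the code runs on.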

In general, encryption converts binary data to binary data. The normal process of encrypting a string to a string is therefore:

  • Convert the string into bytes with a known encoding (e.g. UTF-8)
  • Encrypt the binary data
  • Convert the encrypted binary data back into a string using base64.

Base64 is used to encode arbitrary binary data as ASCII data in a lossless fashion. Decryption is just a matter of reversing the steps:

  • Convert the base64 text back to a byte array
  • Decrypt the byte array
  • Decode the decrypted byte array as a string using UTF-8
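The two lists of steps above could be sketched roughly as follows, assuming Java 8 or later for java.util.Base64 and keeping the bit flip from the question as the stand-in "encryption" step. The class name ObfuscationDemo and the helper flip are invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; the bit flip is the question's transformation, not real encryption.
class ObfuscationDemo {
    // Binary -> binary step: complement every byte. Applying it twice restores the input.
    static byte[] flip(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (byte) ~in[i];
        }
        return out;
    }

    static String encrypt(String s) {
        byte[] plain = s.getBytes(StandardCharsets.UTF_8);     // 1. string -> bytes, known encoding
        byte[] transformed = flip(plain);                      // 2. transform the binary data
        return Base64.getEncoder().encodeToString(transformed); // 3. binary -> ASCII text
    }

    static String decrypt(String s) {
        byte[] transformed = Base64.getDecoder().decode(s);    // 1. base64 text -> bytes
        byte[] plain = flip(transformed);                      // 2. undo the transformation
        return new String(plain, StandardCharsets.UTF_8);      // 3. bytes -> string as UTF-8
    }

    public static void main(String[] args) {
        String encoded = encrypt("hello");
        System.out.println(encoded);
        System.out.println(decrypt(encoded)); // hello
    }
}
```

Because base64 is lossless, different inputs yield different encoded strings, and decrypt(encrypt(s)) should give back s for any valid String.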

(Note that what you've got currently is not really encryption - it's obfuscation at best.)

Upvotes: 7

Joachim Sauer

Reputation: 308159

You're effectively converting arbitrary byte data into a String.

That's not what that constructor is for.

The String constructor that takes a byte[] is meant to convert text in the platform default encoding into a String. Since what you have is not text, the behaviour will be "bad".

If, for example, your platform default encoding is an 8-bit encoding (such as ISO-8859-*), then you'll "only" get random characters.

If your platform default encoding is UTF-8 you'll probably get random characters and some replacement characters for malformed byte sequences.
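To make the difference between the two cases concrete, here is a small sketch that passes the charset explicitly instead of relying on the platform default. The class name LossyDecodeDemo and its helpers are invented for this example. Flipping an ASCII byte always produces a byte in the range 0x80 to 0xFF, and a lone byte such as 0x9E is not a valid UTF-8 sequence, so the decoder substitutes U+FFFD.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
class LossyDecodeDemo {
    // Same bit flip as in the question, but with an explicit charset for getBytes.
    static byte[] flipped(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = (byte) ~bytes[i];
        }
        return out;
    }

    // Interpret the flipped bytes as if they were text in the given charset.
    static String decodeFlipped(String text, Charset charset) {
        return new String(flipped(text), charset);
    }

    public static void main(String[] args) {
        // UTF-8: both flipped bytes are malformed, so both become the replacement character.
        String a = decodeFlipped("a", StandardCharsets.UTF_8);
        String b = decodeFlipped("b", StandardCharsets.UTF_8);
        System.out.println(a.equals(b));        // true
        System.out.println(a.equals("\uFFFD")); // true

        // ISO-8859-1: every byte maps to exactly one char, so the bytes can be recovered.
        String latin = decodeFlipped("a", StandardCharsets.ISO_8859_1);
        byte[] back = latin.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(back, flipped("a"))); // true
    }
}
```

This matches the symptom in the question: under UTF-8, distinct inputs collapse into the same String and the original bytes cannot be recovered from it. The ISO-8859-1 round trip happens to preserve the bytes, but the resulting String is still not meaningful text.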

To summarize: don't do that. I can't tell you what to do instead, since it's not obvious what you're trying to achieve.

Upvotes: 4
