denBelg

Reputation: 343

byte[] to string and back to byte[]

I have a problem interpreting a file. The file is built as follows:

"name"-@-"date"-@-"author"-@-"signature"

The signature is a byte array. When I read the file back in, I convert it to a String and split it:

myFileInputStream.read(fileContent);
String[] data = new String(fileContent).split("-@-");

If I look at the variable fileContent, I see that the bytes are all good. But when I try to get the signature byte array:

byte[] signature = data[3].getBytes();

Sometimes I get wrong values of 63. I tried a few solutions with:

new String(fileContent, "UTF-8")

But no luck. Can someone help? The signature does not have a fixed length, so I cannot hard-code it...

Some extra info:

Original signature:

[48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112, 75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81, 51, 69]

fileContent (variable after reading):

... 48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112, 75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81, 51, 69]

signature (after split and getBytes()):

[48, 45, 2, 21, 0, -123, -3, -5, 63, 84, -86, 26, -124, 63, 75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81, 51, 69]

Upvotes: 2

Views: 2877

Answers (4)

Daniel A.A. Pelsmaeker

Reputation: 50316

Edit: I think I finally understand what you are doing.

You have four parts: name, date, author, signature. The name and author are strings, the date is a date and the signature is a hashed or encrypted array of bytes. You want to store them as text in a file, separated by -@-. To do this, you first need to convert each to a valid string. Name and author are already strings. Converting a date to string is easy. Converting an array of bytes to string is not easy.

You can use Base64 encoding to convert a byte array to a string. Use javax.xml.bind.DatatypeConverter.printBase64Binary() for encoding and javax.xml.bind.DatatypeConverter.parseBase64Binary() for decoding.

For example, if you have a name denBelg, date 2013-03-19, author Virtlink and this signature:

30 2D 02 15 00 85 FD FB 8D 54 AA 1A 84 90 4B F6 FF C8 28 0D D2 06 78 C8 64 02 14
 42 A4 F8 30 A8 65 39 38 14 7D E0 CF 85 49 60 4C AE 51 33 45

Then, after Base64-encoding the signature and concatenating the parts, the resulting string would be, for example:

denBelg-@-20130319-@-Virtlink-@-MC0CFQCF/fuNVKoahJBL9v/IKA3SBnjIZAIUQqT4MKhlOTgUfeDPhUlgTK5RM0U=

Later, when you split the string on -@- you can decode the base64 signature part and get back an array of bytes.
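To make the round trip concrete, here is a minimal sketch. It assumes Java 8 or later and uses java.util.Base64 in place of the DatatypeConverter methods named above; the class and method names (SignatureRecord, encode, decodeSignature) are made up for illustration, and only the first 14 bytes of the signature are used to keep it short.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class SignatureRecord {
    static final String SEPARATOR = "-@-";

    // Build one record line; the signature is Base64-encoded so it survives as text.
    static String encode(String name, String date, String author, byte[] signature) {
        String sig = Base64.getEncoder().encodeToString(signature);
        return String.join(SEPARATOR, name, date, author, sig);
    }

    // Split a record line and recover the raw signature bytes from the fourth field.
    static byte[] decodeSignature(String line) {
        String[] data = line.split(SEPARATOR);
        return Base64.getDecoder().decode(data[3]);
    }

    public static void main(String[] args) {
        // First bytes of the signature from the question (shortened for the example).
        byte[] signature = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112};
        String line = encode("denBelg", "20130319", "Virtlink", signature);

        // The whole line is plain ASCII now, so turning it into bytes and back is lossless.
        byte[] fileContent = line.getBytes(StandardCharsets.US_ASCII);
        String readBack = new String(fileContent, StandardCharsets.US_ASCII);

        System.out.println(Arrays.equals(signature, decodeSignature(readBack))); // prints true
    }
}
```

The standard Base64 alphabet contains only letters, digits, +, / and =, so the encoded signature can never contain the -@- separator itself.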

Note that if the name or author can contain -@-, the split will produce the wrong fields. For example, if I set the name to den-@-Belg, your code would fail.
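A small sketch of that failure, plus one possible guard. The guard (rejecting fields that contain the separator) is just one option and not something from the question; the names DelimiterCollision, fields and requireNoSeparator are hypothetical.

```java
import java.util.Arrays;

class DelimiterCollision {
    static final String SEPARATOR = "-@-";

    // Split a record line on the separator used in the question.
    static String[] fields(String line) {
        return line.split(SEPARATOR);
    }

    // Hypothetical guard: refuse to write a text field that contains the separator.
    static String requireNoSeparator(String field) {
        if (field.contains(SEPARATOR)) {
            throw new IllegalArgumentException("field must not contain " + SEPARATOR);
        }
        return field;
    }

    public static void main(String[] args) {
        String ok = "denBelg-@-20130319-@-Virtlink-@-MC0C";
        String broken = "den-@-Belg-@-20130319-@-Virtlink-@-MC0C";

        System.out.println(fields(ok).length);               // 4 fields
        System.out.println(fields(broken).length);           // 5 fields: the name was split in two
        System.out.println(Arrays.toString(fields(broken))); // data[3] is now the author, not the signature
    }
}
```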


Original post:

Java's String.getBytes() uses the platform default encoding for the string. An encoding is the way string characters are mapped to byte values. So, depending on the platform, the resulting bytes may be different.

Fix the encoding to UTF-8 and read it with the same encoding, and your problems will go away.

byte[] signature = data[3].getBytes("UTF-8");

String sigdata = new String(signature, "UTF-8");

0-???����T�?��K���( �?x�d??B��0�e98?}�υI`L�Q3E

Your example represents some garbled mess of characters (is it encrypted or something?), but the bytes you highlighted show the problem:

You start with a byte value of -115. The minus sign indicates it is a byte value above 0x7F, whose character representation depends heavily on the encoding used. Let's assume extended US-ASCII; then your byte represents (according to this table) the character ì (with an accent). Now, when you decode it, the decoder (depending on the encoding you use) might not understand the byte value 0x8D and represent it with a question mark ? instead. Note that the question mark is US-ASCII character 63, and that's where your 63 came from.
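To see the effect in isolation, here is a minimal sketch that pushes the first bytes of the signature through a decode/encode round trip. It assumes the platform default charset was windows-1252; that is a guess, though it fits the question's output, because 0x8D (-115) and 0x90 (-112) are among the few byte values that charset leaves undefined. The names LossyRoundTrip and roundTrip are made up for illustration.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class LossyRoundTrip {
    // Decode the bytes to a String, then encode the String back with the same charset.
    static byte[] roundTrip(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // First 15 bytes of the signature from the question.
        byte[] raw = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112, 75};

        // Assumption: the default charset on the asker's machine was windows-1252.
        Charset cp1252 = Charset.forName("windows-1252");

        System.out.println(Arrays.toString(raw));
        System.out.println(Arrays.toString(roundTrip(raw, cp1252)));
    }
}
```

With windows-1252 this should give 63 at exactly the two positions that changed in the question (-115 and -112), while every other byte survives. Switching to UTF-8 does not help either, because a lone byte such as -115 is not a valid UTF-8 sequence and is replaced during decoding as well.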

So make sure you use your encodings consistently and don't rely on the system's default.


Also, never use string encoding to decode byte arrays that do not represent strings (e.g. hashes or other cryptographic content).

According to your comment, you are trying to read encrypted data (which are bytes) and convert them to a string using a decoder? That will never work in any way you expect it to. After you've encrypted something, you have an array of bytes which you should store as-is. When you read them back, you have to put the bytes through a decrypter to regain the unencrypted bytes. Only if those decrypted bytes represent a string can you use an encoding to decode the string.
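If the file does not have to be human-readable, one way to store the bytes as-is is a small binary layout instead of a separator-delimited string. The sketch below is only an illustration under that assumption: it writes the three text fields with writeUTF and the signature as a length-prefixed byte block, using in-memory streams in place of the real file streams. The layout and the names BinaryRecord, write and readSignature are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class BinaryRecord {
    // Write the text fields as strings and the signature as length-prefixed raw bytes.
    static byte[] write(String name, String date, String author, byte[] signature) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(name);
            out.writeUTF(date);
            out.writeUTF(author);
            out.writeInt(signature.length); // the length prefix handles variable-size signatures
            out.write(signature);           // raw bytes, never passed through a String
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read the record back and return only the signature bytes.
    static byte[] readSignature(byte[] fileContent) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileContent))) {
            in.readUTF(); // name
            in.readUTF(); // date
            in.readUTF(); // author
            byte[] signature = new byte[in.readInt()];
            in.readFully(signature);
            return signature;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] signature = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112};
        byte[] fileContent = write("denBelg", "20130319", "Virtlink", signature);
        System.out.println(Arrays.equals(signature, readSignature(fileContent))); // prints true
    }
}
```

For a real file, the same two methods would wrap a FileOutputStream and FileInputStream instead of the in-memory buffers.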

Upvotes: 2

Joeri Hendrickx

Reputation: 17435

Sounds like an encoding issue to me.

First you need to know what encoding your file is using, and use that when reading the file.

Secondly, you say your signature is a byte array, but Java strings are always Unicode. If you want a different encoding (I'm guessing you want ASCII), you need to call getBytes("US-ASCII").

Of course, if your input was ASCII, it would be strange for this to cause encoding issues.

Upvotes: 0

Nathaniel Waisbrot

Reputation: 24483

You're making extra work for yourself by converting these bytes into Strings by hand. Why aren't you doing it using the classes intended for this?

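import java.io.BufferedReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

// get the file /logs/access.log
Path path = FileSystems.getDefault().getPath("/logs", "access.log");
// open it, decoding UTF-8
BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
// read a line of text, properly decoded
String line = reader.readLine();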

Or, if you're in Java 6:

BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("/logs/access.log"), "UTF-8"));
String line = reader.readLine();

Upvotes: 0

Michaël

Reputation: 3671

You can't access data[4] because the array holds only 4 strings, so the valid indices are 0 to 3.

data[0] = name

data[1] = date

data[2] = author

data[3] = signature

The solution:

byte[] signature = data[3].getBytes();

Upvotes: 3
