Reputation: 343
I have a problem interpreting a file. The file is built as follows:
"name"-@-"date"-@-"author"-@-"signature"
The signature is a byte array. When I read the file back in, I convert it to a String and split it:
myFileInpuStream.read(fileContent);
String[] data = new String(fileContent).split("-@-");
If I look at the variable fileContent, I see that the bytes are all correct. But when I try to get the signature byte array:
byte[] signature= data[3].getBytes();
Some of the bytes come back wrong, with the value 63. I tried a few solutions, such as:
new String(fileContent, "UTF-8")
But no luck. Can someone help? The signature does not have a fixed length, so I cannot hard-code it.
Some extra info:
Original signature:
[48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112, 75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81, 51, 69]
fileContent (variable after reading):
... 48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112, 75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81, 51, 69]
signature (after split and getBytes()):
[48, 45, 2, 21, 0, -123, -3, -5, 63, 84, -86, 26, -124, 63, 75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81, 51, 69]
Upvotes: 2
Views: 2877
Reputation: 50316
Edit: I think I finally understand what you are doing.
You have four parts: name, date, author, signature. The name and author are strings, the date is a date and the signature is a hashed or encrypted array of bytes. You want to store them as text in a file, separated by -@-. To do this, you first need to convert each to a valid string. Name and author are already strings. Converting a date to a string is easy. Converting an array of bytes to a string is not easy.
You can use base64 encoding to convert a byte array to a string. Use javax.xml.bind.DatatypeConverter.printBase64Binary() for encoding and javax.xml.bind.DatatypeConverter.parseBase64Binary() for decoding.
For example, if you have a name denBelg, date 2013-03-19, author Virtlink and this signature:
30 2D 02 15 00 85 FD FB 8D 54 AA 1A 84 90 4B F6 FF C8 28 0D D2 06 78 C8 64 02 14 42 A4 F8 30 A8 65 39 38 14 7D E0 CF 85 49 60 4C AE 51 33 45
Then, after base64-encoding the signature and concatenating the parts, the resulting string would be, for example:
denBelg-@-20130319-@-Virtlink-@-MC0CFQCF/fuNVKoahJBL9v/IKA3SBnjIZAIUQqT4MKhlOTgUfeDPhUlgTK5RM0U=
Later, when you split the string on -@- you can decode the base64 signature part and get back an array of bytes.
Note that if the name or author can contain -@-, they will break your parsing. For example, if I set the name to den-@-Belg then your code would fail.
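The approach above could be sketched as follows. This is a minimal illustration, not the asker's actual code: the class name SignatureRecord and its two helper methods are made up for the example, and it uses java.util.Base64 (available since Java 8) in place of javax.xml.bind.DatatypeConverter, which is no longer bundled with the JDK from Java 11 onwards. It assumes the name, date and author never contain the separator.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, named only for this example.
class SignatureRecord {
    static final String SEP = "-@-";

    // Build one text record; the signature is base64-encoded so it is plain ASCII.
    static String encode(String name, String date, String author, byte[] signature) {
        String sig = Base64.getEncoder().encodeToString(signature);
        return name + SEP + date + SEP + author + SEP + sig;
    }

    // Recover the raw signature bytes from a record produced by encode().
    static byte[] decodeSignature(String record) {
        String[] data = record.split(SEP);
        return Base64.getDecoder().decode(data[3]);
    }

    public static void main(String[] args) {
        // The signature from the question.
        byte[] signature = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26,
                -124, -112, 75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2,
                20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123,
                73, 96, 76, -82, 81, 51, 69};

        String record = encode("denBelg", "20130319", "Virtlink", signature);
        System.out.println(record);

        byte[] restored = decodeSignature(record);
        System.out.println(Arrays.equals(signature, restored)); // true
    }
}
```

The standard base64 alphabet contains only letters, digits, +, / and =, so the encoded signature itself can never contain the -@- separator.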
Original post:
Java's String.getBytes() uses the platform default encoding for the string. An encoding is the way a string's characters are mapped to byte values. So, depending on the platform, the resulting bytes may be different.
Fix the encoding to UTF-8 and read it with the same encoding, and the platform-dependent differences will go away (provided the bytes really are encoded text).
byte[] signature = data[3].getBytes("UTF-8");
String sigdata = new String(signature, "UTF-8");
0-???����T�?��K���( �?x�d??B��0�e98?}�υI`L�Q3E
Your example represents some garbled mess of characters (is it encrypted or something?), but the bytes you highlighted show the problem:
You start with a byte value of -115. The minus sign indicates it is a byte value above 0x7F, whose character representation depends heavily on the encoding used. Let's assume extended US-ASCII; then your byte represents (according to this table) the character ì (with an accent). Now when you decode it, the decoder (depending on the encoding you use) might not understand the byte value 0x8D and instead represent it with a question mark ?. Note that the question mark is US-ASCII character 63, and that's where your 63 came from.
So make sure you use your encodings consistently and don't rely on the system's default.
Also, never use string encoding to decode byte arrays that do not represent strings (e.g. hashes or other cryptographic content).
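To illustrate why this goes wrong, here is a small sketch that pushes four bytes from the question's signature (including the two that turned into 63) through a decode/encode round trip. The class LossyRoundTrip and its roundTrip method are hypothetical names for this example. US-ASCII is used because it is guaranteed to exist on every JVM; the pattern in the question, where only 0x8D and 0x90 were lost, would be consistent with a default charset such as windows-1252 in which those two bytes are unassigned, but that is only a guess.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, named only for this example.
class LossyRoundTrip {
    // Decode bytes to a String and encode them again with the same charset.
    static byte[] roundTrip(byte[] raw, Charset cs) {
        return new String(raw, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        byte[] raw = {84, -115, -112, 75};

        // US-ASCII has no characters above 0x7F: both bytes decode to U+FFFD,
        // which is then encoded as '?' (63).
        System.out.println(Arrays.toString(roundTrip(raw, StandardCharsets.US_ASCII)));
        // prints [84, 63, 63, 75]

        // ISO-8859-1 maps every byte value to a character, so nothing is lost.
        System.out.println(Arrays.toString(roundTrip(raw, StandardCharsets.ISO_8859_1)));
        // prints [84, -115, -112, 75]

        // UTF-8: 0x8D and 0x90 are not valid on their own, so the result differs from raw.
        System.out.println(Arrays.equals(raw, roundTrip(raw, StandardCharsets.UTF_8)));
        // prints false
    }
}
```

ISO-8859-1 survives only because it happens to assign a character to all 256 byte values. That is a property of that one charset, not something to rely on for binary data in general.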
According to your comment, you are trying to read encrypted data (which are bytes) and convert them to a string using a decoder? That will never work in any way you expect it to. After you've encrypted something, you have an array of bytes which you should store as-is. When you read them back, you have to put the bytes through a decrypter to regain the unencrypted bytes. Only if those decrypted bytes represent a string can you use an encoding to decode the string.
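One way to store the bytes as-is, sketched under assumptions: instead of a text file with a separator, write a small binary record in which the signature is preceded by its length, so a variable-length signature needs no delimiter. The class BinaryRecord and its methods are invented for this example, and it writes to an in-memory buffer to stay self-contained; for a real file you would wrap a FileOutputStream and FileInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper class, named only for this example.
class BinaryRecord {
    // Write the text fields, then the signature as a length followed by the raw bytes.
    static byte[] write(String name, String date, String author, byte[] signature) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(name);
            out.writeUTF(date);
            out.writeUTF(author);
            out.writeInt(signature.length);
            out.write(signature);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Read the signature back without ever turning it into a String.
    static byte[] readSignature(byte[] record) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            in.readUTF(); // name
            in.readUTF(); // date
            in.readUTF(); // author
            byte[] signature = new byte[in.readInt()];
            in.readFully(signature);
            return signature;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] signature = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112, 75};
        byte[] record = write("den-@-Belg", "20130319", "Virtlink", signature);
        System.out.println(Arrays.equals(signature, readSignature(record))); // true
    }
}
```

Because every field carries its own length, a name such as den-@-Belg no longer causes trouble. The trade-off is that the file is no longer human-readable text.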
Upvotes: 2
Reputation: 17435
Sounds like an encoding issue to me.
First you need to know what encoding your file is using, and use that when reading the file.
Secondly, you say your signature is a byte array, but Java strings are always Unicode. If you want a different encoding (I'm guessing you want ASCII), you need to call getBytes("US-ASCII").
Of course, if your input was ASCII, it would be strange that this could cause encoding issues.
Upvotes: 0
Reputation: 24483
You're making extra work for yourself by converting these bytes into Strings by hand. Why aren't you doing it using the classes intended for this?
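import java.io.BufferedReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
// get the file /logs/access.log
Path path = Paths.get("/logs", "access.log");
// open it, decoding UTF-8
BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
// read a line of text, properly decoded
String line = reader.readLine();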
Or, if you're in Java 6:
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("/logs/access.log"), "UTF-8"));
String line = reader.readLine();
Upvotes: 0
Reputation: 3671
You can't access data[4] because you have 4 Strings in your array. So you can access data from index 0 to 3.
data[0] = name
data[1] = date
data[2] = author
data[3] = signature
The solution:
byte[] signature = data[3].getBytes();
Upvotes: 3