Reputation: 21
I have a byte array bytes of UTF-8 encoded strings which I want to convert to a String.
bytes.length is about 130000
String str = new String(bytes, StandardCharsets.UTF_8);
should do the job.
However, str gets the value '<Unreadable>'.
Converting bytes line by line and printing it out works nicely.
However, appending the lines to a StringBuilder fails as well. Again, the content of the StringBuilder r is '<Unreadable>'.
So I thought there might be an unreadable byte in the array.
But r.substring(60000, r.length()) works well, and r.substring(1, 60000), too.
Is there any problem with the size of the byte array? The maximum length of a String/StringBuilder is 2^31 - 1 (Integer.MAX_VALUE), so there should be no problem.
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
// Name the charset explicitly; without it the platform default charset is used.
InputStreamReader reader = new InputStreamReader(bais, StandardCharsets.UTF_8);
BufferedReader in = new BufferedReader(reader);
// String readBuf = in.lines().collect(Collectors.joining()); gives '<Unreadable>'
String readed;
StringBuilder r = new StringBuilder();
while ((readed = in.readLine()) != null) {
    System.out.println(readed); // works fine
    r.append(readed); // note: readLine() strips the line terminator
}
After the loop, r.toString() is '<Unreadable>'. Any ideas why I cannot convert the byte array to a String/StringBuilder?
Upvotes: 2
Views: 537
Reputation: 718768
I tried to reproduce this behavior, and I have not been able to. We need a proper minimal reproducible example to make any real progress on this. And full details of the Java version and vendor, and any other tools that may be implicated.
However, I do have one definite thing to report. I have copies of the OpenJDK source code for Java 6, 7, 8, 11 and 17 in a searchable form. When I search the source code for Unreadable, NONE of the hits I get are relevant. (Indeed, they are all in the respective test trees!) This is very odd.
My tentative conclusion is that this <Unreadable> string you are seeing is NOT coming from OpenJDK / Oracle Java. Either you are using a different vendor's Java, or it is coming from a tool such as your IDE.
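One way to test this conclusion without relying on what a debugger displays is to decode the bytes strictly and inspect the result from code. The following is only a sketch: the class name StrictUtf8Check, the helper decodeStrict and the sample data are made up for illustration, and the real bytes array would take the place of the generated one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictUtf8Check {
    // Decodes strictly: malformed input raises an exception instead of
    // being silently replaced with U+FFFD as new String(bytes, UTF_8) does.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        // Hypothetical stand-in for the real data: repeated UTF-8 lines,
        // comfortably larger than the 130000 bytes mentioned in the question.
        StringBuilder sample = new StringBuilder();
        for (int i = 0; i < 20_000; i++) {
            sample.append("gr\u00fc\u00dfe\n");
        }
        byte[] bytes = sample.toString().getBytes(StandardCharsets.UTF_8);

        String str = decodeStrict(bytes); // throws if any byte sequence is malformed
        System.out.println("bytes: " + bytes.length + ", chars: " + str.length());
        System.out.println("same as lenient decode: "
                + str.equals(new String(bytes, StandardCharsets.UTF_8)));
    }
}
```

If this runs without an exception and the lengths look plausible, the String itself is fine and the <Unreadable> text is most likely produced by whatever tool is displaying it.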
UPDATE
Looking at the Netbeans code that @BrainStorm.exe found, the problem occurs because the debugger is getting unexpected (non-character) data from the debug agent while trying to retrieve the string's characters.
It is difficult to figure out what caused this, but I have a couple of ideas:
One possibility is that the String object is being mutated or deleted in the target JVM. The former seems unlikely, and I would have thought that the String would be reachable ... by virtue of the fact that the debug agent has a reference to it. (But maybe not.)
The second possibility is that there is an internal incompatibility between the representation of String in the target JVM and Netbeans' assumptions about it.
Netbeans seems to be assuming that the value field of String is a char[]. Note that at line 178 in the above, it is expecting get(i) to have returned a CharValue; i.e. a JDI wrapper for a primitive char.
If you look at the evolution of the String class over time, the type of the private String.value field changed from char[] to byte[] when they introduced compressed strings in Java 9.
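The field type can be checked on whichever JVM is being debugged with a few lines of reflection. This is a sketch that relies on an implementation detail (a private field named value in java.lang.String, as in the OpenJDK sources); the class name StringValueFieldType is made up for illustration. Reading the declared type does not require making the field accessible.

```java
import java.lang.reflect.Field;

class StringValueFieldType {
    // Returns the declared type of the private field String.value.
    // Assumes the running JDK names the backing array field "value".
    static Class<?> valueFieldType() throws NoSuchFieldException {
        Field field = String.class.getDeclaredField("value");
        return field.getType();
    }

    public static void main(String[] args) throws NoSuchFieldException {
        System.out.println("Java " + System.getProperty("java.version")
                + ": String.value is declared as " + valueFieldType().getSimpleName());
    }
}
```

On a Java 8 runtime this should report char[], and on Java 9 or later byte[], which is the mismatch described here.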
So if Netbeans is expecting value to be a char[] when it is actually a byte[], any "long" strings will be displayed as <Unreadable>.
That is a Netbeans bug! And indeed there is an issue for it. And indeed someone else has come up with the same theory as I did.
Upvotes: 1
Reputation: 31
I had exactly the same problem. The String was "<Unreadable>" if it was made from a file bigger than 50000 bytes. But String.length() showed 50000+ characters! I found that the text "<Unreadable>" was returned by the IDE (Netbeans 11) in its debug tools. So the String was OK, but Netbeans didn't show the right content. It showed "<Unreadable>" instead.
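To look at such a long String without going through the debugger's variable view, its length and its two ends can be printed from code. A minimal sketch, where the class name LongStringPreview, the helper preview and the generated sample data are all hypothetical:

```java
import java.nio.charset.StandardCharsets;

class LongStringPreview {
    // Returns the first and last n chars of s joined by " ... ",
    // or s itself when it is short enough to show in full.
    // Note: substring works on UTF-16 chars, so a cut may split a surrogate pair.
    static String preview(String s, int n) {
        if (s.length() <= 2 * n) {
            return s;
        }
        return s.substring(0, n) + " ... " + s.substring(s.length() - n);
    }

    public static void main(String[] args) {
        // Hypothetical data: more than 50000 bytes, the size at which
        // the problem was observed in the debugger.
        StringBuilder sample = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            sample.append("line ").append(i).append(' ');
        }
        byte[] bytes = sample.toString().getBytes(StandardCharsets.UTF_8);

        String str = new String(bytes, StandardCharsets.UTF_8);
        System.out.println("length: " + str.length());
        System.out.println(preview(str, 40));
    }
}
```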
Upvotes: 1