HoWei

Reputation: 21

Java StringBuilder / String is '<Unreadable>'

I have a byte array bytes containing UTF-8 encoded text which I want to convert to a String. bytes.length is about 130000.

String str = new String(bytes, StandardCharsets.UTF_8); should do the job. However, str gets the value '<Unreadable>'.

Converting the bytes line by line and printing each line works nicely. However, appending the lines to a StringBuilder fails as well: the content of the StringBuilder r is again '<Unreadable>'. So I thought there might be an unreadable byte in the array. But r.substring(60000, r.count) works well, and so does r.substring(1, 60000). Is there a problem with the size of the byte array? The maximum length of a String/StringBuilder is 2^31 - 1 (Integer.MAX_VALUE), so there should be no problem.

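      import java.io.BufferedReader;
      import java.io.ByteArrayInputStream;
      import java.io.InputStreamReader;
      import java.nio.charset.StandardCharsets;

      ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
      // Decode explicitly as UTF-8 instead of relying on the platform default charset
      InputStreamReader reader = new InputStreamReader(bais, StandardCharsets.UTF_8);
      BufferedReader in = new BufferedReader(reader);
      // String readBuf = in.lines().collect(Collectors.joining()); gives '<Unreadable>'
      String line;
      StringBuilder r = new StringBuilder();
      while ((line = in.readLine()) != null) {
          System.out.println(line); // works fine
          r.append(line); // append() returns the same builder; readLine() drops line terminators
      }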

After the loop, r.toString() is '<Unreadable>'. Any ideas why I cannot convert the byte array to a String/StringBuilder?

Upvotes: 2

Views: 537

Answers (2)

Stephen C

Reputation: 718768

I tried to reproduce this behavior, and I have not been able to. We need a proper minimal reproducible example to make any real progress on this. And full details of the Java version and vendor, and any other tools that may be implicated.

However I do have one definite thing to report. I have copies of the OpenJDK source code for Java 6, 7, 8, 11 and 17 in a searchable form. When I search the source code for Unreadable, NONE of the hits I get are relevant. (Indeed, they are all in the respective test trees!) This is very odd.

My tentative conclusion is that this <Unreadable> string you are seeing is NOT coming from OpenJDK / Oracle Java. Either you are using a different vendor's Java, or it is coming from a tool such as your IDE.


UPDATE

Looking at the Netbeans code that @BrainStorm.exe found, the problem occurs because the debugger is getting unexpected (non-character) data from the debug agent while trying to retrieve the string's characters.

It is difficult to figure out what caused this, but I have a couple of ideas:

  • One possibility is that the String object is being mutated or deleted in the target JVM. The former seems unlikely, and I would have thought that the String would be reachable ... by virtue of the fact that the debug agent has a reference to it. (But maybe not.)

  • The second possibility is that there is an internal incompatibility between the representation of String in the target JVM and Netbeans' assumptions about it.

    • Netbeans seems to be assuming that the value field of String is a char[]. Note that at line 178 in the above, it is expecting get(i) to have returned a CharValue; i.e. a JDI wrapper for a primitive char.

    • If you look at the evolution of the String class over time, the type of the private String.value field changed from char[] to byte[] when they introduced compressed strings in Java 9.

    So if Netbeans is expecting value to be a char[] when it is actually a byte[], any "long" strings will be displayed as <Unreadable>.

    That is a Netbeans bug! And indeed there is an issue for it. And indeed someone else has come up with the same theory as I did.
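If you want to check which representation your own JVM uses, here is a minimal sketch. It assumes an OpenJDK-derived runtime in which the private field is named value (an implementation detail, not part of the public API). The class name StringValueFieldCheck and the method valueFieldType are made up for this illustration. It only inspects the declared type of the field, so it should not need --add-opens.

```java
import java.lang.reflect.Field;

public class StringValueFieldCheck {

    // Returns the declared type of the private String.value field.
    // Assumption: the field is called "value", as in OpenJDK sources.
    static Class<?> valueFieldType() throws NoSuchFieldException {
        Field value = String.class.getDeclaredField("value");
        // Only the declared type is read; the field's contents are never accessed.
        return value.getType();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("java.version      = " + System.getProperty("java.version"));
        System.out.println("String.value type = " + valueFieldType().getSimpleName());
    }
}
```

On a pre-Java 9 runtime I would expect this to report char[], and byte[] on Java 9 and later, which is the mismatch described above.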

Upvotes: 1

Sergiusz Brzeziński

Reputation: 31

I had exactly the same problem. The String was shown as "<Unreadable>" if it was made from a file bigger than 50000 bytes. But String.length() reported 50000+ characters! I found that the text "<Unreadable>" came from the IDE (Netbeans 11) in its debug tools. So the String was OK, but Netbeans didn't show the right content. It showed "<Unreadable>" instead.
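One way to convince yourself that the String itself is fine, without trusting the debugger's display, is to check it in code. This is only a rough sketch: the class name RoundTripCheck, the method survivesRoundTrip and the generated test data are invented for the example and are not from the original program. It decodes the bytes as UTF-8, encodes the result again, and compares the two byte arrays.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripCheck {

    // Decodes the bytes as UTF-8 and encodes the String again.
    // Returns true only if the re-encoded bytes equal the input,
    // i.e. the input was valid UTF-8 and nothing was lost.
    static boolean survivesRoundTrip(byte[] bytes) {
        String str = new String(bytes, StandardCharsets.UTF_8);
        return Arrays.equals(bytes, str.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Hypothetical test data: many short lines, well over 50000 bytes in total.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20000; i++) {
            sb.append("line \u00e4 ").append(i).append('\n');
        }
        byte[] bytes = sb.toString().getBytes(StandardCharsets.UTF_8);

        String str = new String(bytes, StandardCharsets.UTF_8);
        System.out.println("bytes = " + bytes.length + ", chars = " + str.length());
        System.out.println("round trip ok: " + survivesRoundTrip(bytes));
        System.out.println("starts with: " + str.substring(0, 20));
    }
}
```

If the round trip succeeds and the printed prefix looks right, the "<Unreadable>" text is only a display problem in the debugger view.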

Upvotes: 1
