AbGator

Reputation: 59

Reading file to String in Java results in invisible characters

I'm having trouble reading a text file into a String in Java. I have a text file (created in Eclipse, if that matters) that contains a short amount of text, approximately 98 characters. Reading that file into a String via several methods results in a String that is considerably longer: 1621 characters. All but the relevant 98 are invisible in the debugger/console.

I've tried the following methods to load the String:

Apache commons-io:

FileUtils.readFileToString(new File(path));

FileUtils.readFileToString(new File(path), "UTF-8");

byte[] b = FileUtils.readFileToByteArray(new File(path));
new String(b, "UTF-8");

byte[] b = FileUtils.readFileToByteArray(new File(path));
Charset.defaultCharset().decode(ByteBuffer.wrap(b)).toString();

NIO:

new String(Files.readAllBytes(Paths.get(path)));

And so on.

Is there a method to strip away these control chars? Is there a way to read files to strings that doesn't have this issue?


As noted in the comments below, this behavior is due to a corrupted(?) file generated by Eclipse. I'd still be interested in hearing any strategies for trimming away control characters from Strings, though!
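Before stripping anything, it can help to find out what the invisible characters actually are. The following is a minimal sketch, not part of the original question: the class name InvisibleCharReport and the method countControlChars are made up for illustration. It counts every ISO control character in a String other than newline, carriage return, and tab, keyed by code point.

```java
import java.util.Map;
import java.util.TreeMap;

public class InvisibleCharReport {

    // Hypothetical helper: counts ISO control characters (U+0000-U+001F, U+007F-U+009F),
    // ignoring the usual whitespace controls \n, \r and \t.
    public static Map<Integer, Integer> countControlChars(String s) {
        Map<Integer, Integer> counts = new TreeMap<>();
        s.codePoints()
         .filter(cp -> Character.isISOControl(cp) && cp != '\n' && cp != '\r' && cp != '\t')
         .forEach(cp -> counts.merge(cp, 1, Integer::sum));
        return counts;
    }

    public static void main(String[] args) {
        // Sample input standing in for the file contents; the real String would come from the file.
        String sample = "abc\u0000\u0000\u0000def\n";
        countControlChars(sample)
            .forEach((cp, n) -> System.out.printf("U+%04X x %d%n", cp, n));
    }
}
```

If the report shows a single code point such as U+0000 repeated many times, the file itself most likely contains padding bytes rather than the reader adding anything.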

Upvotes: 0

Views: 4048

Answers (2)

Bohemian

Reputation: 424973

If you want to strip out all non-printable characters, try this:

str = str.replaceAll("[^\\p{Graph}\n\r\t ]", "");

The regex matches all "invisible" characters, except ones we want to keep; in this case newline chars, tabs and spaces.

\p{Graph} is a POSIX character class for all printable/visible characters. To negate a POSIX character class, we can use a capital P, i.e. \P{Graph} (all non-printable/invisible characters). However, we must not strip newlines, tabs, and spaces, so we need the negated set [^\\p{Graph}\n\r\t ] instead.
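As a rough illustration, the regex above can be wrapped in a small helper. The class and method names here are hypothetical. One caveat: in Java, \p{Graph} covers only ASCII unless the UNICODE_CHARACTER_CLASS flag is set, so the pattern also removes accented letters and other non-ASCII text. The second method is an alternative under that assumption, not part of the answer above: it removes only ASCII control characters and leaves everything else alone.

```java
import java.util.regex.Pattern;

public class ControlCharStripper {

    // The pattern from the answer: drop everything that is not a visible ASCII
    // character, a newline, a carriage return, a tab, or a space.
    private static final Pattern NON_PRINTABLE =
            Pattern.compile("[^\\p{Graph}\n\r\t ]");

    // Alternative (an assumption, not from the answer): drop only ASCII control
    // characters (U+0000-U+001F and U+007F), keeping \n, \r and \t.
    private static final Pattern CONTROL_ONLY =
            Pattern.compile("[\\p{Cntrl}&&[^\n\r\t]]");

    public static String stripNonPrintable(String input) {
        return NON_PRINTABLE.matcher(input).replaceAll("");
    }

    public static String stripControlChars(String input) {
        return CONTROL_ONLY.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String dirty = "Hello\u0000, \u0007world\u0000\u0000!\n";
        String clean = stripNonPrintable(dirty);
        System.out.println(dirty.length() + " chars before, " + clean.length() + " after");
        System.out.print(clean);
    }
}
```

Precompiling the pattern avoids recompiling the regex on every call, which str.replaceAll does internally; for a one-off cleanup the one-liner is just as good.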

Upvotes: 4

barak manos

Reputation: 30126

Read it line by line into a StringBuilder, and then convert it to a String:

StringBuilder sb = new StringBuilder();
// try-with-resources closes the reader even if readLine() throws
try (BufferedReader file = new BufferedReader(new FileReader(fileName)))
{
    String line;
    while ((line = file.readLine()) != null)
    {
        // readLine() strips the line terminator, so add one back
        sb.append(line).append('\n');
    }
}
return sb.toString();

Upvotes: 0
