Anthony

Reputation: 12387

Java Scanner newline recognition

I can't find the documentation that specifies how a Scanner treats newline patterns by default. I want to read a file line by line and have the scanner be able to handle \r, \n or \r\n line endings regardless of the system the program is actually running on.

If I declare a scanner like so:

Scanner scanner = new Scanner(reader);

what is the default behaviour? Will it handle all three kinds as described above or do I have to tell it explicitly to do it?

Upvotes: 2

Views: 10985

Answers (2)

Stephen C

Reputation: 718678

It is not documented (in Java 1.6), but the JDK code uses this regex to match a line break:

"\r\n|[\n\r\u2028\u2029\u0085]"

Here's a link to the source code: http://cr.openjdk.java.net/~briangoetz/7012540/webrev/src/share/classes/java/util/Scanner.java.html

IMO, this ought to be specified, since Scanner's behavior with respect to line separators differs from (for example) BufferedReader's. (I've lodged a bug report ...)
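To illustrate the difference, here is a minimal sketch that counts lines in the same string with both classes. The class and method names (LineBreakComparison, countWithBufferedReader, countWithScanner) are made up for this example. It relies on BufferedReader.readLine() treating only \n, \r and \r\n as line terminators, while Scanner uses the regex quoted above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

// Hypothetical demo class, not part of the JDK.
class LineBreakComparison {

    // Counts lines as BufferedReader.readLine() sees them (\n, \r and \r\n only).
    static int countWithBufferedReader(String text) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            // Not expected for an in-memory StringReader.
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Counts lines as Scanner.nextLine() sees them with its default settings.
    static int countWithScanner(String text) {
        int count = 0;
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Separated only by Unicode "line separator" and "next line" characters.
        String text = "first\u2028second\u0085third";
        System.out.println("BufferedReader: " + countWithBufferedReader(text)); // 1
        System.out.println("Scanner: " + countWithScanner(text));               // 3
    }
}
```

So if the input may contain \u2028, \u2029 or \u0085 inside a line, the two classes will not agree on where lines end.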

Upvotes: 3

David

Reputation: 1521

Looking at the source code for Sun JDK 1.6, the pattern used is:

"\r\n|[\n\r\u2028\u2029\u0085]"

That is, it matches "\r\n", or any one of \n, \r, or the Unicode characters "line separator" (\u2028), "paragraph separator" (\u2029) and "next line" (\u0085).
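So for the case in the question, a default Scanner should already cope with all three endings without any extra configuration. A minimal sketch to check this, using a StringReader as a stand-in for the real reader (the class and method names are made up for this example):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class, not part of the JDK.
class ScannerLineEndings {

    // Reads every line of the text using Scanner's default line handling.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Mixed endings in one input: \r, \n and \r\n.
        System.out.println(readLines("one\rtwo\nthree\r\nfour"));
        // prints [one, two, three, four]
    }
}
```

Note that nextLine() returns each line without its terminator, so the caller never sees which kind of ending was used.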

Upvotes: 5
