Reputation: 12387
I can't find the documentation that specifies how a Scanner treats newline patterns by default. I want to read a file line by line and have the scanner be able to handle \r, \n or \r\n line endings regardless of the system the program is actually running on.
If I declare a scanner like so:
Scanner scanner = new Scanner(reader);
what is the default behaviour? Will it handle all three kinds as described above or do I have to tell it explicitly to do it?
Upvotes: 2
Views: 10985
Reputation: 718678
It is not documented (in Java 1.6) but the JDK code uses this regex to match a line break:
"\r\n|[\n\r\u2028\u2029\u0085]"
Here's a link to the source code: http://cr.openjdk.java.net/~briangoetz/7012540/webrev/src/share/classes/java/util/Scanner.java.html
IMO, this ought to be specified, since Scanner's behavior with respect to line separators is different from (for example) BufferedReader's. (I've lodged a bug report ...)
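To illustrate the difference, here is a minimal sketch that reads the same text with both classes. The class name LineBreakComparison, the helper methods, and the sample string are made up for this example; the only assumption about the JDK is that Scanner uses the regex quoted above, while BufferedReader.readLine() recognises only \n, \r and \r\n.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LineBreakComparison {

    // Collects lines using Scanner.nextLine() with the default settings.
    static List<String> scannerLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    // Collects lines using BufferedReader.readLine().
    static List<String> readerLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory StringReader.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample: three pieces separated by "next line" and "line separator".
        String text = "a\u0085b\u2028c";
        System.out.println("Scanner lines:        " + scannerLines(text).size());
        System.out.println("BufferedReader lines: " + readerLines(text).size());
    }
}
```

With this input the Scanner should report three lines and the BufferedReader one, because the Unicode separators are line breaks only to the Scanner.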
Upvotes: 3
Reputation: 1521
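Looking at the source code for Sun JDK 1.6, the pattern used is "\r\n|[\n\r\u2028\u2029\u0085]", which matches either "\r\n" or any one of \n, \r, or the Unicode characters "line separator" (\u2028), "paragraph separator" (\u2029) and "next line" (\u0085). So a default Scanner reading with nextLine() handles all three of \r, \n and \r\n without being told to.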
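A quick way to see what that regex does is to apply it directly. The sketch below copies the pattern from the JDK 1.6 source as quoted above; the class name LineBreakPattern, the splitLines helper, and the sample text are hypothetical. It is not how Scanner is implemented internally, only a demonstration of the pattern's matching rules.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LineBreakPattern {

    // The line-break regex quoted from the JDK 1.6 Scanner source.
    // The CRLF alternative comes first, so a Windows line ending counts as one break, not two.
    static final Pattern LINE_BREAK = Pattern.compile("\r\n|[\n\r\u2028\u2029\u0085]");

    // Splits text on any line break the pattern recognises.
    static List<String> splitLines(String text) {
        return Arrays.asList(LINE_BREAK.split(text));
    }

    public static void main(String[] args) {
        // Mixed old Mac (CR), Unix (LF) and Windows (CRLF) endings in one string.
        System.out.println(splitLines("one\rtwo\nthree\r\nfour"));
        // prints [one, two, three, four]
    }
}
```

Note that there is no empty element between "three" and "four": the \r\n pair is consumed as a single separator.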
Upvotes: 5