jeojavi

Reputation: 886

Detect line breaks in a `char[]`

I usually use the following method to detect whether a character is whitespace:

Character.isWhitespace(char ch)

Now I need to detect all the variants of line breaks (\n, \r, etc.) for all platforms (Linux, Windows, Mac OSX, etc.). Is there any similar way to detect if a character is a line break? If there is not, how can I detect all the possible variants?


Edit from comments: I didn't know that line breaks can be represented by several characters, so I'm adding some context to the question.

I'm implementing the write(char[] buffer, int offset, int length) method in a Writer (see Javadoc). In addition to other operations, I need to detect line breaks inside the buffer. I'm trying to avoid creating a String from the buffer to save memory, as I've seen that the buffer is sometimes very large (several MB).

Is there any way to detect line breaks without creating a String?

Upvotes: 12

Views: 15014

Answers (3)

You can get the OS dependent line separator using

System.getProperty("line.separator")

This will return a string.

But since you are working with a char, checking whether it is '\n' or '\r' is correct.

if(yourChar == '\r' || yourChar == '\n')
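As a minimal sketch of how that check could be applied to the buffer from the question without building a String, the loop below walks the char[] range directly and treats "\r\n" as a single line break. The class and method names are hypothetical, chosen only for illustration.

```java
public class LineBreakScanner {

    /**
     * Counts line breaks in buffer[offset, offset + length).
     * "\r\n" is counted once; a lone '\r' or '\n' is counted once each.
     */
    public static int countLineBreaks(char[] buffer, int offset, int length) {
        int count = 0;
        int end = offset + length;
        for (int i = offset; i < end; i++) {
            char c = buffer[i];
            if (c == '\n') {
                count++;
            } else if (c == '\r') {
                count++;
                // Skip the '\n' of a "\r\n" pair so it is not counted twice.
                if (i + 1 < end && buffer[i + 1] == '\n') {
                    i++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        char[] data = "a\r\nb\nc\rd".toCharArray();
        System.out.println(countLineBreaks(data, 0, data.length)); // prints 3
    }
}
```

One caveat with this sketch: if one write call ends with '\r' and the next begins with '\n', the pair is counted twice. A Writer that cares about this would need to remember whether the previous call ended in '\r'.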

Upvotes: 0

Bohemian

Reputation: 424953

Use regex to do the work for you:

if (!String.valueOf(character).matches("."))

Without the DOTALL switch, the dot matches all characters except line terminators, which according to the documentation are:

  • A newline (line feed) character ('\n'),
  • A carriage-return character followed immediately by a newline character ("\r\n"),
  • A standalone carriage-return character ('\r'),
  • A next-line character ('\u0085'),
  • A line-separator character ('\u2028'), or
  • A paragraph-separator character ('\u2029').

Note that line break sequences exist, e.g. \r\n, but you asked about individual characters. The check above is only meaningful for single-character input: any two-character string, whether "\r\n" or "ab", also fails to match ".".
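The single-character terminators in the list above can also be tested with a plain char comparison, which avoids creating a String and running a regex for every character. This is only a sketch of that alternative, not the regex approach itself, and the class and method names are hypothetical.

```java
public class LineTerminators {

    /** Returns true if c is one of the single-character line terminators listed above. */
    public static boolean isLineTerminator(char c) {
        switch (c) {
            case '\n':      // line feed
            case '\r':      // carriage return
            case '\u0085':  // next line
            case '\u2028':  // line separator
            case '\u2029':  // paragraph separator
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLineTerminator('\n')); // prints true
        System.out.println(isLineTerminator('a'));  // prints false
    }
}
```

The two-character sequence "\r\n" still has to be handled by the caller, for example by looking at the next character after a '\r'.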

Upvotes: 13

Martin

Reputation: 3058

As I posted in my comments, the line separator is not always a single character; depending on the platform, it can be a sequence of characters. A platform-independent version would look like this:

public String[] splitLines(String input) {
    return input.split("(\r\n|\r|\n)");
}

Based on this answer:

Match linebreaks - \n or \r\n?

However, this means regex matching on a String, not char matching. Getting a String out of the buffer should be achievable, though.
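If copying the buffer into a String is the concern, one possible workaround is to wrap the array in a java.nio.CharBuffer, which implements CharSequence and does not copy the data, and run the same regex over that view. The sketch below assumes the write(char[], int, int) arguments from the question; the class and method names are hypothetical.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BufferLineBreaks {

    // Same alternatives as in splitLines: "\r\n" first so the pair matches as one break.
    private static final Pattern LINE_BREAK = Pattern.compile("\r\n|\r|\n");

    /** Counts line breaks in buffer[offset, offset + length) without copying it into a String. */
    public static int countLineBreaks(char[] buffer, int offset, int length) {
        // CharBuffer.wrap creates a view backed by the given array; no characters are copied.
        CharSequence view = CharBuffer.wrap(buffer, offset, length);
        Matcher matcher = LINE_BREAK.matcher(view);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        char[] data = "x\r\ny\nz".toCharArray();
        System.out.println(countLineBreaks(data, 0, data.length)); // prints 2
    }
}
```

Positions reported by matcher.start() are relative to the start of the view, so offset has to be added to map them back to indices in the original array.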

Upvotes: 1
