AndroidDev

Reputation: 21247

Java Regex to Match Blank Line

I'm trying to extract a sequence of unpredictable text from the middle of a formatted string. Here is an example of what my string might look like:

THIS PART NEVER CHANGES
Payload
UppErAndLowerCaseLetters
andDigitsNotPredictable
ButDoesIncludeLineBreaks

OtherStuffThatIDon'tWant

Note that there are line breaks here that must be preserved. In this example, I want to capture in a String variable this text:

Payload
UppErAndLowerCaseLetters
andDigitsNotPredictable
ButDoesIncludeLineBreaks

So, my "delimiters" are the header part THIS PART NEVER CHANGES at the beginning and the double line break at the end. That's the tricky part. How do I write my regular expression to identify a double line break, but exclude a single line break? Here is what I have:

String payload = "THIS PART NEVER CHANGES" + 
        System.getProperty("line.separator") +
        "(.+?)" + 
        System.getProperty("line.separator") +
        System.getProperty("line.separator");

BufferedFileReader bfr = new BufferedFileReader();
String file_contents = bfr.readFileToString(myFile);

Pattern pattern = Pattern.compile(payload);
Matcher matcher = pattern.matcher(file_contents);

while (matcher.find()) 
    System.out.println(matcher.group());

This almost works. If I take out the last System.getProperty("line.separator") from the payload string, I get the first line from the payload. When I leave it in, I get nothing.

Can anyone tell me what I am doing wrong? Thanks!

Upvotes: 1

Views: 2478

Answers (3)

Mike Samuel

Reputation: 120586

The regex

(?m:^(?=[\r\n]|\z))

will match at the start of a blank line because the m flag makes ^ match at the beginning of every line rather than only at the beginning of input, and the lookahead (?=[\r\n]|\z) requires that the next character be a line break or that the input end there.
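To see where that pattern matches, here is a minimal sketch. The class name BlankLineDemo, the helper firstBlankLine and the sample text are made up for illustration. The match is zero-width, so the useful result is the position of the blank line, not matched text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BlankLineDemo {
    // The pattern from the answer: a zero-width match at the start of an empty line.
    static final Pattern BLANK_LINE = Pattern.compile("(?m:^(?=[\\r\\n]|\\z))");

    // Hypothetical helper: index of the first blank line in text, or -1 if there is none.
    static int firstBlankLine(String text) {
        Matcher m = BLANK_LINE.matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String text = "HEADER\nline one\nline two\n\nother";
        // The blank line starts right after the line break that ends "line two".
        System.out.println(firstBlankLine(text)); // 25
    }
}
```

Everything before that index is the text up to and including the line break that precedes the blank line.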

As to the root cause of your problem, Reimeus is right about DOTALL.

Upvotes: 3

Mena

Reputation: 48444

Why don't you use a specific quantifier for your line break?

For instance:

Pattern p = Pattern.compile("\n{2,}");
String line = "\n\n";
System.out.println(p.matcher(line).find());

Output

true

If you want to use the escaped representation of your system line separator (instead of manually adding the escaped String, whether \n or \r\n), take a look at this SO thread.
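One possible way to combine the quantifier with the system line separator (not necessarily what the linked thread suggests) is to wrap the separator in Pattern.quote and repeat it as a group. This is only a sketch; the class name LineBreakRun and the helper doubleBreak are made up for illustration.

```java
import java.util.regex.Pattern;

class LineBreakRun {
    // Hypothetical helper: matches two or more consecutive copies of the given separator.
    // Pattern.quote turns the separator into a literal, so "\r\n" is repeated as one unit.
    static Pattern doubleBreak(String separator) {
        return Pattern.compile("(?:" + Pattern.quote(separator) + "){2,}");
    }

    public static void main(String[] args) {
        String sep = System.lineSeparator();
        Pattern p = doubleBreak(sep);
        System.out.println(p.matcher("a" + sep + sep + "b").find()); // true
        System.out.println(p.matcher("a" + sep + "b").find());       // false
    }
}
```

Note that this only helps when the text really uses the platform's separator; a file written on another OS may not.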

Upvotes: 1

Reimeus

Reputation: 159864

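You need to use the DOTALL flag so that . also matches the newline characters. Without it, . stops at every line terminator, so (.+?) can never span more than one line of the payload: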

Pattern pattern = Pattern.compile(payload, Pattern.DOTALL);
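For reference, a self-contained sketch of the whole extraction with DOTALL applied. Two things here are my own assumptions and not part of the question: the line break is written as \r?\n instead of the system line.separator, so that both Unix and Windows files work, and the payload is read with group(1), because group() would also return the header and the trailing line breaks. The class name PayloadExtractor and the helper extract are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PayloadExtractor {
    // Header, one line break, lazily captured payload, then two consecutive line breaks.
    // DOTALL lets . cross line breaks, so the capture can span several lines.
    static final Pattern PAYLOAD = Pattern.compile(
            "THIS PART NEVER CHANGES\\r?\\n(.+?)\\r?\\n\\r?\\n", Pattern.DOTALL);

    // Hypothetical helper: the payload without header or trailing breaks, or null if absent.
    static String extract(String contents) {
        Matcher m = PAYLOAD.matcher(contents);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String contents = "THIS PART NEVER CHANGES\n"
                + "Payload\n"
                + "UppErAndLowerCaseLetters\n"
                + "andDigitsNotPredictable\n"
                + "ButDoesIncludeLineBreaks\n"
                + "\n"
                + "OtherStuffThatIDon'tWant\n";
        // Prints the four payload lines with their inner line breaks preserved.
        System.out.println(extract(contents));
    }
}
```

The lazy quantifier matters here: it stops at the first double line break, so later blank lines in the file do not extend the capture.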

Upvotes: 4
