Reputation: 21247
I'm trying to extract a sequence of unpredictable text from the middle of a formatted string. Here is an example of what my string might look like:
THIS PART NEVER CHANGES
Payload
UppErAndLowerCaseLetters
andDigitsNotPredictable
ButDoesIncludeLineBreaks

OtherStuffThatIDon'tWant
Note that there are line breaks here that must be preserved. In this example, I want to capture in a String variable this text:
Payload
UppErAndLowerCaseLetters
andDigitsNotPredictable
ButDoesIncludeLineBreaks
So, my "delimiters" are the header part THIS PART NEVER CHANGES at the beginning and the double line break at the end. That's the tricky part. How do I write my regular expression to identify a double line break, but exclude a single line break? Here is what I have:
String payload = "THIS PART NEVER CHANGES" +
System.getProperty("line.separator") +
"(.+?)" +
System.getProperty("line.separator") +
System.getProperty("line.separator");
BufferedFileReader bfr = new BufferedFileReader();
String file_contents = bfr.readFileToString(myFile);
Pattern pattern = Pattern.compile(payload);
Matcher matcher = pattern.matcher(file_contents);
while (matcher.find())
System.out.println(matcher.group());
This almost works. If I take out the last System.getProperty("line.separator") from the payload string, I get the first line of the payload. When I leave it in, I get nothing.
Can anyone tell me what I am doing wrong? Thanks!
Upvotes: 1
Views: 2478
Reputation: 120586
The regex
(?m:^(?=[\r\n]|\z))
will match at the start of a blank line because the m flag causes ^ to match at the beginning of each line instead of only at the beginning of input, and (?=[\r\n]|\z) looks ahead to a line break or the end of input.
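As a rough illustration of that regex in Java, here is a minimal sketch. The class name BlankLineDemo and the helper firstBlankLine are made up for this example; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlankLineDemo {
    // The regex from this answer, with backslashes doubled for a Java string literal.
    static final Pattern BLANK_LINE = Pattern.compile("(?m:^(?=[\\r\\n]|\\z))");

    // Hypothetical helper: index where the first blank line starts, or -1 if none is found.
    static int firstBlankLine(String text) {
        Matcher m = BLANK_LINE.matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String text = "line one\nline two\n\nafter the blank line";
        // The match is zero-width: it sits right after the first "\n" of the pair.
        System.out.println(firstBlankLine(text)); // prints 18
    }
}
```

Because the match is zero-width, m.start() and m.end() are equal; the index is mainly useful for cutting the input with substring.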
As to the root cause of your problem, Reimeus is right about DOTALL.
Upvotes: 3
Reputation: 48444
Why don't you use a specific quantifier for your line break?
For instance:
Pattern p = Pattern.compile("\n{2,}");
String line = "\n\n";
System.out.println(p.matcher(line).find());
Output
true
If you want to use the escaped representation of your system line separator (instead of manually adding the escaped String, whether \n or \r\n), take a look at this SO thread.
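If the input might use either \n or \r\n regardless of the platform, one possible variation on the quantifier idea is to make the \r optional. This is only a sketch under that assumption; the class name DoubleBreakDemo and the helper hasDoubleBreak are invented for the example.

```java
import java.util.regex.Pattern;

public class DoubleBreakDemo {
    // Two or more consecutive line breaks, where each break is "\n" or "\r\n".
    static final Pattern DOUBLE_BREAK = Pattern.compile("(?:\\r?\\n){2,}");

    // Hypothetical helper: true if the text contains at least one blank line.
    static boolean hasDoubleBreak(String text) {
        return DOUBLE_BREAK.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasDoubleBreak("a\r\n\r\nb")); // prints true
        System.out.println(hasDoubleBreak("a\r\nb"));     // prints false
    }
}
```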
Upvotes: 1
Reputation: 159864
You need to use the DOTALL flag. By default . does not match line terminators, so (.+?) can never span more than one line:
Pattern pattern = Pattern.compile(payload, Pattern.DOTALL);
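Putting that together with the pattern from the question, a self-contained sketch might look like the following. The class PayloadExtractor and the method extractPayload are hypothetical names, the line separator is passed in rather than read from the system so the example behaves the same everywhere, and Pattern.quote is used on the assumption that the header and separator should be matched literally. It also reads group(1) rather than group(), so only the payload is returned, without the header and trailing line breaks.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PayloadExtractor {
    // Hypothetical helper: text between the header line and the first
    // double line break, or null if the header is not found.
    static String extractPayload(String contents, String lineSeparator) {
        String sep = Pattern.quote(lineSeparator);
        Pattern pattern = Pattern.compile(
                Pattern.quote("THIS PART NEVER CHANGES") + sep
                        + "(.+?)"      // lazy, so it stops at the first double break
                        + sep + sep,
                Pattern.DOTALL);       // lets . match line terminators too
        Matcher matcher = pattern.matcher(contents);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String contents = "THIS PART NEVER CHANGES\n"
                + "Payload\n"
                + "UppErAndLowerCaseLetters\n"
                + "andDigitsNotPredictable\n"
                + "ButDoesIncludeLineBreaks\n"
                + "\n"
                + "OtherStuffThatIDon'tWant\n";
        // Prints the four payload lines with their line breaks preserved.
        System.out.println(extractPayload(contents, "\n"));
    }
}
```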
Upvotes: 4