Match text between empty lines

I have text in the following blocks:

AAAAAAA
BBBBBBB
CCCCCCC

DDDDDD.    YYYYYYYYYYYYYYYYYYYYYY                
EEEEE 1234567890                              

Some random text
Some text random
Random text
Text 
Some random text

ZZZZZZZZZZZZZZZZ
UUUUUUUUUUUUUUUU

How can I select the following block with a regexp?

Some random text
Some text random
Random text
Text 
Some random text

From the original text I know that this block comes after the line DDDDDD. YYYYYYYYYYYYYYYYYYYYYY, which is optionally followed by the line EEEEE 1234567890, and that the block sits between lines that contain only \s (whitespace) characters.

I have tried the pattern DDDDDD.*\\s+(.*)\\s+ but it doesn't work.

Upvotes: 1

Views: 1507

Answers (1)

Mena

Reputation: 48404

You can use the following Pattern to match your expected text:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "AAAAAAA\nBBBBBBB\nCCCCCCC\n\nDDDDDD.    YYYYYYYYYYYYYYYYYYYYYY                "
    + "\nEEEEE 1234567890                              "
    + "\n\nSome random text\nSome text random\nRandom text\nText \nSome random text\n\n"
    + "ZZZZZZZZZZZZZZZZ\nUUUUUUUUUUUUUUUU";
Pattern p = Pattern.compile(
 // | 6 "D"s
 // |    | actual dot
 // |    |  | some whitespace
 // |    |  |   | 22 "Y"s
 // |    |  |   |    | more whitespace
 // |    |  |   |    |   | optional: 
 // |    |  |   |    |   || 5 "E"s
 // |    |  |   |    |   ||   | whitespace
 // |    |  |   |    |   ||   |  | 10 digits
 // |    |  |   |    |   ||   |  |      | more whitespace including line breaks
 // |    |  |   |    |   ||   |  |      |      | your text
 // |    |  |   |    |   ||   |  |      |      |    | followed by any "Z" sequence
    "D{6}\\.\\s+Y{22}\\s+(E{5}\\s\\d{10}\\s+)?(.+?)(?=Z+)", 
    Pattern.DOTALL
);
Matcher m = p.matcher(text);
if (m.find()) {
    System.out.println(m.group(2));
}

Output

Some random text
Some text random
Random text
Text 
Some random text

Note

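I wasn't sure how to delimit the end of the block, so I used a lookahead for a sequence of one or more capital "Z"s. Because the lazy group runs right up to that sequence, group 2 also includes the line breaks that precede it.

It's up to you to refine this.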
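
One possible refinement, as a rough sketch: since the question says the block sits between whitespace-only lines, the end of the block can be delimited by the next blank line instead of the "Z" sequence. The class name BlockBetweenBlankLines and the method extractBlock are made up for illustration, and the sketch assumes the header line starts with "DDDDDD." and the optional line starts with "EEEEE".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlockBetweenBlankLines {

    // Hypothetical pattern: header line, optional "EEEEE" line, blank line(s),
    // then every consecutive line that contains at least one non-whitespace char.
    private static final Pattern BLOCK = Pattern.compile(
        "^D{6}\\..*\\R"                 // line starting with "DDDDDD."
        + "(?:E{5}.*\\R)?"              // optional line starting with "EEEEE"
        + "(?:[ \\t]*\\R)+"             // one or more whitespace-only lines
        + "((?:.*\\S.*(?:\\R|\\z))+)",  // the block: consecutive non-blank lines
        Pattern.MULTILINE               // lets ^ match at the start of each line
    );

    // Returns the block without its final line break, or empty if nothing matches.
    static Optional<String> extractBlock(String text) {
        Matcher m = BLOCK.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(m.group(1).replaceFirst("\\R\\z", ""));
    }

    public static void main(String[] args) {
        String text = "AAAAAAA\nBBBBBBB\nCCCCCCC\n\n"
            + "DDDDDD.    YYYYYYYYYY      \n"
            + "EEEEE 1234567890      \n"
            + "\n"
            + "Some random text\nSome text random\nRandom text\nText \nSome random text\n"
            + "\n"
            + "ZZZZZZZZZZZZZZZZ\nUUUUUUUUUUUUUUUU";
        System.out.println(extractBlock(text).orElse("(no match)"));
    }
}
```

DOTALL is deliberately left off here, so "." stops at line breaks and each repetition of the capture group consumes exactly one line. The capture ends at the first empty or whitespace-only line, or at the end of the input.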

Upvotes: 2
