nano_nano

Reputation: 12523

Java regex to get all matches between a string and the next pipe

I have a String like this:

 ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
 ECLONG_TEXT_INSIDE_THIS2|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
 ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
 ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS

I need all the words between EC and the first |.

This does not work for me:

lAllText = lData.split("EC(.*?)|");

What do I have to change to make this work?

Thanks for your help

Stefan

Upvotes: 0

Views: 63

Answers (3)

Mena

Reputation: 48404

Use a look-behind and a look-ahead so that the Pattern matches only the text between EC and the next |.

Example

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// assuming multi-line, but not relevant
String input =
    "ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS\n"
    + "ECLONG_TEXT_INSIDE_THIS2|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS\n"
    + "ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS\n"
    + "ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS";
//                            | look behind for "EC"
//                            |     | match one or more characters reluctantly
//                            |     |  | look ahead for "|" (escaped)
Pattern p = Pattern.compile("(?<=EC).+?(?=\\|)");
Matcher m = p.matcher(input);
// each find() advances to the next "EC...|" occurrence
while (m.find()) {
    System.out.println(m.group());
}

Output

LONG_TEXT_INSIDE_THIS
LONG_TEXT_INSIDE_THIS2
LONG_TEXT_INSIDE_THIS
LONG_TEXT_INSIDE_THIS
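
As a side note on why the attempt in the question fails: an unescaped | is the regex alternation operator, and split returns the text between matches rather than the matches themselves. The sketch below is an alternative to the lookaround pattern above, not part of it: it escapes the pipe and reads the value from a capturing group. The class name EcCaptureGroup and the method name firstFields are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class EcCaptureGroup {
    // "EC", then a reluctant capture, then a literal (escaped) pipe.
    private static final Pattern EC_TO_PIPE = Pattern.compile("EC(.*?)\\|");

    static List<String> firstFields(String data) {
        List<String> result = new ArrayList<>();
        Matcher m = EC_TO_PIPE.matcher(data);
        while (m.find()) {
            // group(1) is the text between "EC" and the first following pipe
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String data = "ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS\n"
                + "ECLONG_TEXT_INSIDE_THIS2|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS";
        firstFields(data).forEach(System.out::println);
    }
}
```

Because . does not match line breaks by default, the capture stays on one line. Unlike the lookaround version, the overall match here includes the EC prefix and the pipe, so the value has to be read from group(1).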

Upvotes: 1

zx81

Reputation: 41838

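To match all the underscore-separated tokens before the first |, you could do this:

// subjectString is the input; \G anchors each match where the previous one ended
List<String> matchList = new ArrayList<>();
Pattern regex = Pattern.compile("\\G_?([^|_]+)");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group(1));
}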

matchList:

ECLONG
TEXT
INSIDE
THIS

Upvotes: 1

Avinash Raj

Reputation: 174696

You could try the regex below to get the value between EC and the first | symbol.

(?<=EC)[^\|]*


As a Java string literal, this would be:

"(?<=EC)[^\\|]*"

Explanation:

  • (?<=EC) A lookbehind that places the start of the match just after the string EC.
  • [^\|]* Matches any character except |, zero or more times. The backslash is optional here, since | is literal inside a character class.
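
A minimal sketch of how this pattern could be applied with Matcher.find(), assuming the whole input is held in one String. The class name EcFieldExtractor and the method name extract are hypothetical, and the pipe is written without the optional backslash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class EcFieldExtractor {
    // Lookbehind for "EC", then every character up to (not including) the next pipe.
    private static final Pattern EC_FIELD = Pattern.compile("(?<=EC)[^|]*");

    static List<String> extract(String data) {
        List<String> fields = new ArrayList<>();
        Matcher m = EC_FIELD.matcher(data);
        while (m.find()) {
            // group() is the whole match; the lookbehind keeps "EC" out of it
            fields.add(m.group());
        }
        return fields;
    }

    public static void main(String[] args) {
        String data = "ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS\n"
                + "ECLONG_TEXT_INSIDE_THIS2|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS";
        extract(data).forEach(System.out::println);
    }
}
```

Note that [^|] also matches line breaks, so this relies on every EC being followed by a pipe on the same line, as in the sample data.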

Upvotes: 2
