dres

Reputation: 499

Conditional regex to return empty string

I need help with a conditional regex expression. What I need is: if the input contains a certain word, return an empty string; otherwise, extract the year.

I have the regex

((?=AVOID))|(\d{4})(.*)

Examples:

  • For input TEST1 AVOID TEST2 2016 TEST3 TEST4, an empty string is extracted, which is correct.
  • For input TEST1 TEST2 2016 TEST3 TEST4, 2016 is extracted, which is correct.
  • For input TEST1 TEST2 2016 TEST3 TEST4 AVOID, 2016 is extracted, which is wrong: AVOID appears at the end of the input, so the result should be an empty string.

Any help?

Upvotes: 4

Views: 428

Answers (1)

Wiktor Stribiżew

Reputation: 626738

You can use

/^(?!.*\bAVOID\b).*?\b(\d{4})\b/

See the regex demo

This extracts the first 4-digit chunk that is a whole word, provided the string does not contain AVOID as a whole word.

If the 4 digits are always enclosed with spaces, use

/^(?!.*\bAVOID\b).*? (\d{4}) /

See an updated regex demo (when testing standalone strings, the space can be replaced with \s).
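To make the difference between the two patterns concrete, here is a minimal sketch that runs both against the same input. The class name `SpaceDelimitedYear`, the helper `firstYear`, and the sample strings are assumptions made up for illustration; only the two regexes come from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpaceDelimitedYear {
    // Word-boundary variant: the 4 digits only need non-word characters (or string edges) around them
    static final Pattern WORD_BOUNDED =
            Pattern.compile("^(?!.*\\bAVOID\\b).*?\\b(\\d{4})\\b");
    // Space-enclosed variant: the 4 digits must have a literal space on both sides
    static final Pattern SPACE_ENCLOSED =
            Pattern.compile("^(?!.*\\bAVOID\\b).*? (\\d{4}) ");

    // Hypothetical helper: returns capture group 1, or "" when the pattern does not match
    static String firstYear(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String s = "TEST1 ID-2016-X 1999 TEST2";
        System.out.println(firstYear(WORD_BOUNDED, s));   // 2016 (hyphens count as word boundaries)
        System.out.println(firstYear(SPACE_ENCLOSED, s)); // 1999 (first chunk with spaces on both sides)
        // true: a year at the very end has no trailing space, so the space variant finds nothing
        System.out.println(firstYear(SPACE_ENCLOSED, "TEST1 TEST2 2016").isEmpty());
    }
}
```

Note that the space-enclosed pattern misses a year at the very start or end of the string, which is one reason the `\b` version is the safer default.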

Pattern details:

  • ^ - start of string
  • (?!.*\bAVOID\b) - a negative lookahead that fails the match if the whole word AVOID appears anywhere on the line (use the DOTALL modifier to check across lines)
  • .*? - 0+ characters other than a newline, as few as possible (use the DOTALL modifier to match across lines)
  • \b(\d{4})\b - any 4 digits that are a whole word.
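The DOTALL remark in the list above matters as soon as the input spans several lines. The following is a small sketch of that effect in Java; the class name `DotallLookahead`, the helper `firstYear`, and the multi-line sample strings are hypothetical and not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotallLookahead {
    static final String REGEX = "^(?!.*\\bAVOID\\b).*?\\b(\\d{4})\\b";

    // Hypothetical helper: compiles the answer's regex with the given flags
    // and returns capture group 1, or "" when there is no match
    static String firstYear(String input, int flags) {
        Matcher m = Pattern.compile(REGEX, flags).matcher(input);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String avoidOnSecondLine = "TEST1 2016 TEST2\nTEST3 AVOID";
        // Without DOTALL the lookahead stops at the newline and never sees AVOID
        System.out.println(firstYear(avoidOnSecondLine, 0));              // 2016
        // With DOTALL the lookahead scans the whole input and rejects it
        System.out.println(firstYear(avoidOnSecondLine, Pattern.DOTALL)); // (empty line)

        String yearOnSecondLine = "TEST1\nTEST2 2016";
        // Without DOTALL .*? cannot cross the newline, so the year is out of reach
        System.out.println(firstYear(yearOnSecondLine, 0).isEmpty());     // true
        System.out.println(firstYear(yearOnSecondLine, Pattern.DOTALL));  // 2016
    }
}
```

The same effect can be obtained with the inline `(?s)` flag at the start of the pattern instead of passing `Pattern.DOTALL`.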

NOTE: .*? will match as few characters as possible before the first 4-digit whole word chunk (because of the reluctant quantifier *?). If you need to get the last 4-digit whole word chunk, use a greedy counterpart .*. You may further experiment with specifying the context around \d{4}.
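The reluctant versus greedy distinction in the note can be checked with an input that contains two years. This is only an illustrative sketch: the class name `FirstOrLastYear`, the helper `year`, and the sample string are assumptions, while the two patterns differ only in `.*?` versus `.*`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstOrLastYear {
    // Reluctant .*? stops at the first whole-word 4-digit chunk
    static final Pattern FIRST = Pattern.compile("^(?!.*\\bAVOID\\b).*?\\b(\\d{4})\\b");
    // Greedy .* consumes as much as possible, then backtracks to the last such chunk
    static final Pattern LAST = Pattern.compile("^(?!.*\\bAVOID\\b).*\\b(\\d{4})\\b");

    // Hypothetical helper: returns capture group 1, or "" when there is no match
    static String year(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String s = "TEST1 2015 TEST2 2016 TEST3";
        System.out.println(year(FIRST, s)); // 2015
        System.out.println(year(LAST, s));  // 2016
    }
}
```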

See Java demo

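import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> strs = Arrays.asList("TEST1 AVOID TEST2 2016 TEST3 TEST4", "TEST1 TEST2 2016 TEST3 TEST4", "TEST1 TEST2 2016 TEST3 TEST4 AVOID");
// Compile once, outside the loop; the lookahead rejects any string containing the whole word AVOID
Pattern pat = Pattern.compile("^(?!.*\\bAVOID\\b).*?\\b(\\d{4})\\b");
for (String str : strs) {
    Matcher m = pat.matcher(str);
    if (m.find()) {
        System.out.println(m.group(1));   // return m.group(1)
    } else {
        System.out.println("No match for " + str + " :("); // return "" here
    }
}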

Results:

No match for TEST1 AVOID TEST2 2016 TEST3 TEST4 :(
2016
No match for TEST1 TEST2 2016 TEST3 TEST4 AVOID :(
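Since the question asks for an empty string rather than a "no match" message, the same logic can be wrapped in a method that returns the year or "". This is a minimal sketch; the class name `YearExtractor` and the method name `extractYear` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class YearExtractor {
    // Same pattern as in the answer, compiled once and reused
    private static final Pattern YEAR =
            Pattern.compile("^(?!.*\\bAVOID\\b).*?\\b(\\d{4})\\b");

    // Returns the first whole-word 4-digit chunk, or "" if the input
    // contains the whole word AVOID or has no such chunk
    static String extractYear(String input) {
        Matcher m = YEAR.matcher(input);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println("[" + extractYear("TEST1 AVOID TEST2 2016 TEST3 TEST4") + "]"); // []
        System.out.println("[" + extractYear("TEST1 TEST2 2016 TEST3 TEST4") + "]");       // [2016]
        System.out.println("[" + extractYear("TEST1 TEST2 2016 TEST3 TEST4 AVOID") + "]"); // []
    }
}
```

With this shape, an input that has no 4-digit chunk at all also yields "", so callers cannot tell that case apart from an AVOID rejection; return an `Optional<String>` or a separate flag if that distinction matters.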

Upvotes: 3
