Cratylus

Reputation: 54074

Java support for conditional lookahead

In the following list of zip codes, I am trying to exclude 33333- from the result.
This is what I do:

String zip = "11111 22222 33333- 44444-4444";
String regex = "\\d{5}(?(?=-)-\\d{4})";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(zip);
while (matcher.find()) { 
   System.out.println(" Found: " + matcher.group());     
}

I expect to get:

Found:  11111  
Found:  22222  
Found:  44444-4444

I am trying to enforce this format:
5 digits, optionally followed by a - and 4 digits. 5 digits followed by just a - (hyphen) should not match.

I get this exception:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unknown inline modifier near index 7
\d{5}(?(?=-)(-\d{4}))
       ^
    at java.util.regex.Pattern.error(Unknown Source)
    at java.util.regex.Pattern.group0(Unknown Source)
    at java.util.regex.Pattern.sequence(Unknown Source)
    at java.util.regex.Pattern.expr(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.util.regex.Pattern.<init>(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)

Am I not using the conditional lookahead correctly?

Upvotes: 4

Views: 5981

Answers (4)

Kleenestar

Reputation: 799

(\d{5}(?!-\s)(?:-\d{4})?)

hence:

String regex = "(\\d{5}(?!-\\s)(?:-\\d{4})?)";
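A minimal, self-contained sketch of how this pattern could be run against the input from the question. The class name NegativeLookaheadZips and the helper findZips are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegativeLookaheadZips {
    // Five digits, not followed by "-" plus whitespace, then an optional "-dddd" suffix.
    private static final Pattern ZIP = Pattern.compile("(\\d{5}(?!-\\s)(?:-\\d{4})?)");

    // Hypothetical helper: collects every match of ZIP in the input.
    static List<String> findZips(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = ZIP.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        for (String zip : findZips("11111 22222 33333- 44444-4444")) {
            System.out.println("Found: " + zip);
        }
    }
}
```

One caveat: the negative lookahead only rejects a hyphen that is followed by whitespace, so a dangling "33333-" at the very end of the input would still match as "33333".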

Upvotes: 0

anubhava

Reputation: 784958

To capture all numbers except 33333 use this code:

String zip = "11111 22222 33333- 44444-4444";
String regex = "\\d{5}(?=(-\\d{4}|\\s|$))(-\\d{4})?";
Matcher m = Pattern.compile(regex).matcher(zip);
while (m.find())
    // group() is the whole match; group(1) would be the text captured inside the lookahead
    System.out.printf("Matched: [%s]%n", m.group());

OUTPUT:

Matched: [11111]
Matched: [22222]
Matched: [44444-4444]

Explanation: The lookahead itself acts as the condition. The RegEx matches a 5-digit number only if it is followed by a - and 4 digits, by a whitespace character, or by the end of the string. It then optionally consumes the - and 4 digits.

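The reason your original RegEx throws an exception is the (?(?=-) part. Java's regex engine does not support the conditional construct (?(condition)yes|no), so the second ( is read as an inline modifier, which is what the error message reports.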
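The failure can be reproduced without any input string. A small sketch that only checks whether java.util.regex accepts each pattern; the class name ConditionalCheck and the helper compiles are made up for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ConditionalCheck {
    // Hypothetical helper: true if java.util.regex accepts the pattern string.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Conditional construct from the question.
        System.out.println(compiles("\\d{5}(?(?=-)-\\d{4})"));
        // Plain lookahead with alternation, as used in this answer.
        System.out.println(compiles("\\d{5}(?=(-\\d{4}|\\s|$))(-\\d{4})?"));
    }
}
```

The first pattern should be rejected with a PatternSyntaxException and the second accepted, so a pattern written for an engine with conditionals (PCRE, for example) generally has to be rewritten with lookaheads and alternation before it can be used in Java.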

Upvotes: 6

RanRag

Reputation: 49547

Your question is a little unclear to me. I suppose you are looking for:

String st = "11111 22222 33333- 44444-4444";
String pattern = "\\d+(- )";
String res  = st.replaceAll(pattern,"");
System.out.println(res);

Output = 11111 22222 44444-4444

Upvotes: 0

Thomas

Reputation: 88707

You're missing a colon after (?, i.e. use this regex (not escaped as a Java string): \d{5}(?:(?=-)-\d{4}).

However, this might still not produce the result you want. Please post some example input and required output.
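To illustrate the caveat, here is a rough sketch that runs the corrected pattern against the sample string from the question. The class name ColonFixDemo and the helper matches are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColonFixDemo {
    // With the colon added the pattern compiles, but the "-dddd" suffix is required.
    private static final Pattern REQUIRED_SUFFIX =
            Pattern.compile("\\d{5}(?:(?=-)-\\d{4})");

    // Hypothetical helper: collects every match of REQUIRED_SUFFIX in the input.
    static List<String> matches(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = REQUIRED_SUFFIX.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(matches("11111 22222 33333- 44444-4444"));
    }
}
```

Only 44444-4444 is found here, because the group is mandatory. Simply making the group optional with a trailing ? would bring back 11111 and 22222, but it would also let 33333 through, so on its own it does not satisfy the requirement in the question.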

Upvotes: 0
