timJIN520

Reputation: 309

Regex: matching a ZIP code in a String sentence in Java

I'm having trouble understanding regular expressions in Java. I'm trying to extract ZIP codes from a List of strings. My regex is below.

\b([0-9]{5})(?:-[0-9]{4})?\b

List example:

    22193 
    22192-2222
    .22193
    hello this is .221938
    hello this is .22012
    hello this is 22193
    22193 hello
    221931
    22193.2222

When I called .matcher(string) while looping over the list above, I got the results below.

22193 -----MATCH
22192-2222------MATCH
.22193-----MATCH
hello this is .221938
hello this is .22012 ----MATCH
hello this is 22193---- MATCH
22193 hello  -----MATCH
221931
22193.2222 ---- MATCH
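
For reference, a minimal sketch that reproduces these results. It assumes `Matcher.find()` was used (the question only says `.matcher(string)`), and the class and method names are made up for illustration. `\b` only requires a word character on one side and a non-word character (or the edge of the string) on the other, so a `.` next to a digit counts as a boundary.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // The pattern from the question, unchanged (backslashes doubled for the Java string literal).
    static final Pattern ZIP = Pattern.compile("\\b([0-9]{5})(?:-[0-9]{4})?\\b");

    // Hypothetical helper: true if the pattern is found anywhere in the line.
    static boolean containsZip(String line) {
        return ZIP.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] lines = {"22193", ".22193", "221931", "22193.2222"};
        for (String line : lines) {
            // ".22193" and "22193.2222" are reported as matches because '.' is a non-word character.
            System.out.println(line + (containsZip(line) ? " -----MATCH" : ""));
        }
    }
}
```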

The issue is that it matches numbers that have a decimal point directly before or after them (the same happens with special characters like \, $, %, etc.). I want these results:

22193 ---------MATCH
22192-2222 ------MATCH
.22193
hello this is .221938
hello this is .22012
hello this is 22193 -----MATCH
22193 hello ---- MATCH
221931
22193.2222

How can I match a ZIP code only when there are no special characters inside it or directly before/after it? Can you please assist me? I've been playing around with it on regextester.com, but no luck. Any suggestions?

Upvotes: 1

Views: 243

Answers (2)

JvdV

Reputation: 75840

Maybe what you could do is the following:

((?<=\s|^)\d{5}(?=\s|$|-\d{4}(?=\s|$)))(?:-\d{4}(?=\s|$))?


  • ( - Start of capture group 1.
    • (?<=\s|^) - Positive lookbehind for either a whitespace character or the start-of-string anchor.
    • \d{5} - Match exactly five digits (0-9).
    • (?=\s|$|-\d{4}(?=\s|$)) - Positive lookahead for either a whitespace character, the end-of-string anchor, or a hyphen followed by exactly four digits, with a nested positive lookahead that checks for a whitespace character or the end-of-string anchor.
  • ) - Close capture group 1.
  • (?: - Start of a non-capturing group.
    • -\d{4} - Match a hyphen followed by exactly four digits (0-9).
    • (?=\s|$) - Positive lookahead for either a whitespace character or the end-of-string anchor.
  • )? - Close the non-capturing group and make it optional.
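
A minimal sketch of how this pattern might be used from Java, assuming `Matcher.find()` is called on each list entry. The class name `ZipLookaroundDemo` and the helper `findZip` are illustrative and not part of the answer. Every backslash is doubled because the pattern sits inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZipLookaroundDemo {
    // The lookaround-based pattern from this answer.
    static final Pattern ZIP = Pattern.compile(
        "((?<=\\s|^)\\d{5}(?=\\s|$|-\\d{4}(?=\\s|$)))(?:-\\d{4}(?=\\s|$))?");

    // Hypothetical helper: returns the matched ZIP (including the +4 part if present), or null.
    static String findZip(String line) {
        Matcher m = ZIP.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "22193 ", "22192-2222", ".22193", "hello this is .221938",
            "hello this is .22012", "hello this is 22193", "22193 hello",
            "221931", "22193.2222"
        };
        for (String line : lines) {
            String zip = findZip(line);
            System.out.println(line + (zip != null ? " -----MATCH (" + zip + ")" : ""));
        }
    }
}
```

Because the ZIP code must be surrounded by whitespace or the string edges, entries such as `.22193` and `22193.2222` are rejected. If only the five-digit part is needed, `m.group(1)` holds it.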


Upvotes: 1

CoderX

Reputation: 302

You can use this regex. It excludes the special characters you mentioned.

(?<!(?:\.|\$|\%|\\))\b(?:[0-9]{5})(?:-[0-9]{4})?\b(?!(?:\.|\$|\%|\\))
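
A rough sketch of this pattern in Java, again assuming `Matcher.find()`. The names `ZipExcludeCharsDemo` and `findZip` are hypothetical. The negative lookbehind and lookahead reject only the four listed characters (`.`, `$`, `%`, `\`), so any other special character next to the ZIP code, such as `#`, would still be accepted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZipExcludeCharsDemo {
    // The pattern from this answer; "\\\\" in the Java literal is a single escaped backslash in the regex.
    static final Pattern ZIP = Pattern.compile(
        "(?<!(?:\\.|\\$|\\%|\\\\))\\b(?:[0-9]{5})(?:-[0-9]{4})?\\b(?!(?:\\.|\\$|\\%|\\\\))");

    // Hypothetical helper: returns the matched text, or null if nothing matched.
    static String findZip(String line) {
        Matcher m = ZIP.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] lines = {"22193", "22192-2222", ".22193", "22193.2222", "$22193", "#22193"};
        for (String line : lines) {
            String zip = findZip(line);
            // "#22193" still matches because '#' is not in the excluded set.
            System.out.println(line + (zip != null ? " -----MATCH (" + zip + ")" : ""));
        }
    }
}
```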


Upvotes: 0
