Reputation: 6744
I am trying to match the two address lines below (mostly fictional addresses):
2320 ZINER CIR East 43123
1111 ZINER CIR East Bernstadt 43123
My regular expression is built from city names, and East Bernstadt is a city name. However, street names can also end in "East". My predicament is that if I match "East" greedily, as in:
\d+ [^ ]+ CIR( East)?( East Bernstadt)?(?: \d+)?
...then only the first line is matched completely (the second is only a partial match). If I use a reluctant match, as in:
\d+ [^ ]+ CIR( East)??( East Bernstadt)?(?: \d+)?
...the second line matches but not the first.
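The behaviour of the two patterns above can be reproduced with a short sketch. It assumes Python's `re` module, since the regex flavor in use is not stated; the helper name `matched_text` is made up for illustration. `re.match` anchors at the start of the line but not at the end, so a partial match shows up as a shorter matched string.

```python
import re

LINES = [
    "2320 ZINER CIR East 43123",
    "1111 ZINER CIR East Bernstadt 43123",
]

# The two patterns from the question, unchanged.
GREEDY = r"\d+ [^ ]+ CIR( East)?( East Bernstadt)?(?: \d+)?"
RELUCTANT = r"\d+ [^ ]+ CIR( East)??( East Bernstadt)?(?: \d+)?"


def matched_text(pattern, line):
    """Return the text matched from the start of the line, or None if nothing matches."""
    m = re.match(pattern, line)
    return m.group(0) if m else None


for name, pattern in [("greedy", GREEDY), ("reluctant", RELUCTANT)]:
    for line in LINES:
        got = matched_text(pattern, line)
        status = "full" if got == line else "partial"
        print(f"{name:9} {status:7} {got!r}")
```

In this sketch the greedy pattern stops at "1111 ZINER CIR East" on the second line, and the reluctant pattern stops at "2320 ZINER CIR" on the first. Every remaining group is optional, so the overall match already succeeds and the engine has no reason to backtrack.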
How can I change the regular expression so that both lines are matched completely? "East" and "East Bernstadt" must remain in separate parts of the expression.
EDIT: I cannot treat "East" and "East Bernstadt" with one parenthesis group; both expressions above must match, but also "1234 Ziner CIR East East Bernstadt" must match as well (some streets have cardinal directions on them).
Upvotes: 1
Views: 45
Reputation: 2557
Try this:
\d+\s+\S+\s+CIR(?:(?!\sEast Bernstadt)\s+East)?(?:\s+East Bernstadt)?(?: +\d+)?
Explanation:
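\s: a whitespace character (space, tab, newline, carriage return, vertical tab).
\S: one character that is not a whitespace character as defined by \s.
(?!…): negative lookahead. The match continues only if the enclosed pattern does not match at the current position. Here it stops the optional "East" group from consuming the "East" that begins "East Bernstadt".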
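As a quick check, the pattern can be run against the three address lines from the question. This is a minimal sketch assuming Python's `re` module (the regex flavor in use is not stated); `is_full_match` is a hypothetical helper name, and `fullmatch` is used so that a partial match counts as a failure.

```python
import re

# The suggested pattern, split across lines only for readability.
PATTERN = re.compile(
    r"\d+\s+\S+\s+CIR"
    r"(?:(?!\sEast Bernstadt)\s+East)?"  # street suffix "East", unless it starts the city name
    r"(?:\s+East Bernstadt)?"            # optional city
    r"(?: +\d+)?"                        # optional trailing number (ZIP code)
)

ADDRESSES = [
    "2320 ZINER CIR East 43123",
    "1111 ZINER CIR East Bernstadt 43123",
    "1234 Ziner CIR East East Bernstadt",
]


def is_full_match(line):
    """True if the whole line is consumed by PATTERN."""
    return PATTERN.fullmatch(line) is not None


for address in ADDRESSES:
    print(is_full_match(address), address)
```

All three lines match completely in this sketch. If the input may contain runs of whitespace, writing the lookahead as `(?!\s+East Bernstadt)` would keep it consistent with the `\s+` used in the groups around it.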
Upvotes: 1