Reputation: 355
Regex:
/(\d+).*?((?:[a-z][a-z\s?]+)).*?((?:court|ct|street|st)).*?(UNT\s?[\d\w].*|#\s?[\d\w].*)/ig
Matching
119 testing str test court #123
119 testing stret test court # 123
119 testing strt ct UNT 123
119 testing st UNT dsff
123 testing blah ct
My current regex is capturing correctly on the first 4 entries. How can I make everything involving # and UNT optional so my final "123 testing blah ct" can have capturing groups too?
Upvotes: 1
Views: 318
Reputation:
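You can't just make the ending optional: with a lazy .*? in front of it, the engine will skip an optional group whenever it doesn't have to match it.
You have to force the match to continue to the end of the line.
That can be done with the EOL anchor $.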
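To illustrate the point about the anchor, here is a minimal sketch in Python's `re` module. The two patterns are simplified, hypothetical stand-ins (not the full regex from this answer), and the variable names are mine:

```python
import re

line = "119 testing st UNT dsff"

# Simplified, hypothetical patterns: same idea, fewer groups.
without_anchor = re.compile(r"(\d+).*?(st).*?(UNT.*)?", re.I)
with_anchor = re.compile(r"(\d+).*?(st).*?(UNT.*)?$", re.I)

# Without $, the lazy .*? matches nothing and the optional group is skipped.
loose = without_anchor.search(line).groups()
# With $, the engine has to reach the end of the line, so it picks up the unit.
anchored = with_anchor.search(line).groups()
# A line with no unit still matches; the optional group is just unset.
no_unit = with_anchor.search("123 testing blah ct").groups()

print(loose)     # ('119', 'st', None)
print(anchored)  # ('119', 'st', 'UNT dsff')
print(no_unit)   # ('123', 'st', None)
```

Note that the simplified `(st)` group happens to match inside "testing" here; that is fine for showing the anchor's effect, but it is not how the real street-type group should behave.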
Note that this part, [a-z\s?], is a character class that matches a-z, whitespace, or a literal question mark; inside a class the ? is not a quantifier.
Not sure if that's what you meant.
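A quick way to check what that class accepts, sketched with Python's `re` (the variable names are mine). It also shows the `[^\S\r\n]` class, which matches whitespace except CR and LF; presumably that is why it can stand in for `\s` when a match should stay on one line:

```python
import re

# The class from the question: letters, any whitespace, or a literal '?'.
question_class = re.compile(r"[a-z\s?]+", re.I)
# Whitespace except CR and LF.
horizontal_ws = re.compile(r"[^\S\r\n]+")

accepts_question_mark = question_class.fullmatch("what? ok") is not None
accepts_newline = question_class.fullmatch("ct\nabc") is not None
tab_and_space = horizontal_ws.fullmatch(" \t") is not None
space_and_newline = horizontal_ws.fullmatch(" \n") is not None

print(accepts_question_mark)  # True: '?' is just another member of the class
print(accepts_newline)        # True: \s lets the class run across lines
print(tab_and_space)          # True
print(space_and_newline)      # False: the newline is excluded
```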
(?im)(\d+).*?((?:[a-z](?:[a-z]|[^\S\r\n])+)).*?((?:court|ct|street|st)).*?((?:UNT|\#)[^\S\r\n]?\w.*)?$
Explained:
(?im)                            # Modifiers: ignore case, multi-line
( \d+ )                          # (1)
.*?
(                                # (2 start)
  (?:
    [a-z]
    (?: [a-z] | [^\S\r\n] )+
  )
)                                # (2 end)
.*?
(                                # (3 start)
  (?: court | ct | street | st )
)                                # (3 end)
.*?
(                                # (4 start)
  (?: UNT | \# )
  [^\S\r\n]? \w .*
)?                               # (4 end)
$                                # End of line (or string)
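To try the regex against the five sample lines from the question, here is a small sketch using Python's `re` module. The question's `/ig` flags suggest JavaScript, so Python is my assumption, chosen because it accepts the leading `(?im)` as written. The names `ADDRESS_RE`, `SAMPLES` and `rows` are hypothetical:

```python
import re

# Pattern copied from the answer; (?im) = ignore case + multi-line.
ADDRESS_RE = re.compile(
    r"(?im)(\d+).*?((?:[a-z](?:[a-z]|[^\S\r\n])+)).*?"
    r"((?:court|ct|street|st)).*?((?:UNT|\#)[^\S\r\n]?\w.*)?$"
)

SAMPLES = """\
119 testing str test court #123
119 testing stret test court # 123
119 testing strt ct UNT 123
119 testing st UNT dsff
123 testing blah ct"""

# One tuple of four groups per matched line; an unmatched group is None.
rows = [m.groups() for m in ADDRESS_RE.finditer(SAMPLES)]

for number, name, street_type, unit in rows:
    # Group 2 can keep a trailing space, so strip it for display.
    print(number, repr(name.strip()), street_type, unit)
```

Under Python's `re`, the first four lines should fill all four groups, and "123 testing blah ct" should match with group 4 left as None.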
Upvotes: 1