badera

Reputation: 1545

regex capture includes too much

I have a string from which I would like to capture everything from a colon (inclusive) up to, but excluding, the next whitespace or parenthesis.

Why does the following regex include the parenthesis in the match? Neither :(.*?)[\(\)\s] nor :(.+?)[\)\s] (both non-greedy) works.

Example input: WHERE t.operator_id = :operatorID AND (t.merchant_id = :merchantID) AND t.readerApplication_id = :readerApplicationID AND t.accountType in :accountTypes

It should extract :operatorID, :merchantID, :readerApplicationID, :accountTypes. But for the second match my regexes extract :merchantID) instead. What is wrong, and why?

Even if I use a stricter character class in the capture group, it does not work:
:([a-zA-Z0-9_]+?)[\)\(\s]

Upvotes: 1

Views: 490

Answers (1)

Scott Weaver

Reputation: 7361

Express the condition "followed by whitespace or a parenthesis" as a lookahead, so that the regex checks for the delimiter without consuming it. Right now you are explicitly matching the delimiter with [\(\)\s], so it becomes part of the overall match:

:(.+?)(?=[\s\(\)])

https://regex101.com/r/im8KWF/1/
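
To make the difference concrete, here is a minimal Python sketch that runs the original consuming pattern and the lookahead pattern on the example input. Python's re module is an assumption (the question does not name a regex engine), and the variable names are illustrative. In this exact input the last parameter sits at the end of the string with nothing after it, so neither pattern finds it unless an end-of-string alternative ($) is added to the lookahead; that third pattern is an addition of this sketch, not part of the pattern above.

```python
import re

# Example input from the question.
sql = (
    "WHERE t.operator_id = :operatorID AND (t.merchant_id = :merchantID) "
    "AND t.readerApplication_id = :readerApplicationID "
    "AND t.accountType in :accountTypes"
)

# Original pattern: the trailing character class consumes the delimiter,
# so the delimiter ends up in the overall match.
consuming = re.compile(r":(.+?)[\(\)\s]")

# Lookahead pattern: the delimiter is checked but not consumed.
lookahead = re.compile(r":(.+?)(?=[\s\(\)])")

# Assumed variant: also treat end of string as a valid delimiter.
lookahead_or_end = re.compile(r":(.+?)(?=[\s\(\)]|$)")

consuming_matches = [m.group(0) for m in consuming.finditer(sql)]
lookahead_matches = [m.group(0) for m in lookahead.finditer(sql)]
all_params = [m.group(0) for m in lookahead_or_end.finditer(sql)]

print(consuming_matches)  # the second entry ends with ")"
print(lookahead_matches)  # clean names, but the final parameter is missing
print(all_params)         # all four parameters
```

Reading m.group(1) instead of m.group(0) also drops the delimiter (along with the colon) even with the original pattern, because only the overall match includes the consumed character.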

Or, use the built-in \b "word boundary", which is also a "zero-width" assertion and gives the same result for this input*:

:(.+?)\b

https://regex101.com/r/FnnzGM/3/

*Definition of word boundary from regular-expressions.info:

There are three different positions that qualify as word boundaries:

Before the first character in the string, if the first character is a word character. After the last character in the string, if the last character is a word character. Between two characters in the string, where one is a word character and the other is not a word character.
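
As a rough illustration of the definition above, the following Python sketch (again assuming Python's re module; the second input string and the delimiter-based comparison pattern are made up for this example) shows that \b matches all four parameters in the question's input, including the one at the end of the string, but stops early when a name contains a non-word character such as a dot.

```python
import re

# Word-boundary pattern from the answer.
word_boundary = re.compile(r":(.+?)\b")

# Hypothetical comparison pattern: stop at whitespace, a parenthesis,
# or the end of the string.
delimiter_based = re.compile(r":(.+?)(?=[\s\(\)]|$)")

# Example input from the question.
sql = (
    "WHERE t.operator_id = :operatorID AND (t.merchant_id = :merchantID) "
    "AND t.readerApplication_id = :readerApplicationID "
    "AND t.accountType in :accountTypes"
)

# Made-up input with a dotted parameter name, to show where the two differ.
dotted = "WHERE t.id = :account.id AND t.x = :x"

boundary_matches = [m.group(0) for m in word_boundary.finditer(sql)]
boundary_dotted = [m.group(0) for m in word_boundary.finditer(dotted)]
delimiter_dotted = [m.group(0) for m in delimiter_based.finditer(dotted)]

print(boundary_matches)  # all four parameters
print(boundary_dotted)   # the dotted name is cut at the first "."
print(delimiter_dotted)  # the dotted name is kept whole
```

So \b is a convenient shortcut when parameter names consist only of word characters (letters, digits, underscore); otherwise the explicit delimiter lookahead is the safer choice.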

Upvotes: 2
