Reputation: 782
I want to write a regular expression that finds occurrences of a string which begins with a specified substring and ends with another specified substring, but whose total length is minimal. For example, if my beginning string is bar and my ending string is foo, then when searching through the string barbazbarbazfoobazfoo I want it to return barbazfoo.
I know how to do this when one end is just a single character. For example, replacing the words above with characters, I could search using a[^a]*?b to find the string axb within the string axaxbxb. But since I am looking for words rather than characters, I can't simply exclude a particular letter, because that letter is allowed to appear in between.
For context, I am reading through server logs and would like to find, for example, which users encountered a specific error, but there is additional information between where the username appears and where the exception details occur. As such, I am not looking for a solution that relies on the fact that foo in the above example contains the only occurrences of the letters f and o.
Additional example, taken from the first paragraph of this regex tutorial about lookahead and lookbehind. The text reads:
Lookahead and lookbehind, collectively called "lookaround", are zero-length assertions just like the start and end of line, and start and end of word anchors explained earlier in this tutorial. The difference is that lookaround actually matches characters, but then gives up the match, returning only the result: match or no match. That is why they are called "assertions". They do not consume characters in the string, but only assert whether a match is possible or not. Lookaround allows you to create regular expressions that are impossible to create without them, or that would get very longwinded without them.
If my start word is lookaround and my end word is match, then I expect to find the substring lookaround actually match. Note that there may be multiple occurrences of the target words, and an unknown number of words and characters in between, possibly sharing characters with the target words. In the above example, for instance, lookaround[^lookaround]*?match finds no match, because [^lookaround] is a character class that excludes each of the letters l, o, k, ... individually. I want to know how to exclude substrings rather than individual letters.
Upvotes: 2
Views: 915
Reputation: 91488
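You can use a tempered greedy token. The negative lookahead is checked before each character is consumed, so the match can never run past another occurrence of the start or end word.
With word boundaries, so that only whole words count:
\blookaround\b(?:(?!\b(?:match|lookaround)\b).)*\bmatch\b
matches: lookaround actually matches characters, but then gives up the match
Without word boundaries, so that substrings count as well:
lookaround(?:(?!(?:match|lookaround)).)*match
matches: lookaround actually match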
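To try the pattern on the examples from the question, here is a minimal Python sketch. The helper name shortest_span is hypothetical, and the use of re.DOTALL is an assumption made so that a span may cross line breaks in multi-line log text; drop it if each match should stay on one line. The delimiters are passed through re.escape so they are treated as literal text.

```python
import re


def shortest_span(start, end, text):
    """Return every minimal substring of text that begins with start and ends with end.

    Hypothetical helper: builds a tempered greedy token from two literal strings.
    """
    s, e = re.escape(start), re.escape(end)
    # Before consuming each character, assert that neither delimiter begins here.
    # re.DOTALL is an assumption: it lets '.' match newlines as well.
    pattern = re.compile(rf"{s}(?:(?!{s}|{e}).)*{e}", re.DOTALL)
    return [m.group(0) for m in pattern.finditer(text)]


print(shortest_span("bar", "foo", "barbazbarbazfoobazfoo"))  # ['barbazfoo']
print(shortest_span("a", "b", "axaxbxb"))                    # ['axb']
```

For whole-word matching, as in the first pattern above, the \b anchors would need to be added around each escaped delimiter.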
Upvotes: 1