SagiZiv

Reputation: 1040

Finding a substring using regex

Disclaimer: This question comes mostly from curiosity and a wish to learn a bit more about regex. I know the task can be achieved with other methods.

I have a string that represents a list, like so: "egg,eggplant,orange,egg", and I want to search for all the instances of the item egg in this list.

I can't search for the substring egg, because it would also return eggplant.

So I tried to write a regex to solve this and arrived at the expression ((?:^|\w+,)egg(?:$|,\w+))+ (I used this website to build the regex).

Basically, it searches for the word egg at the beginning of the string, the end of the string and in-between commas (while making sure those aren't trailing commas).

It works fine, except for this edge case: "egg,eggplant,egg"

Based on this site, I can see that the first egg is matched, but the regex engine then continues up to the last comma. For the last egg, only the remaining string ,egg is left, which doesn't match…
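The behavior can be reproduced with a small Python sketch. It assumes Python's built-in re module (the question does not name a regex flavor), and the variable names are only illustrative. The first match consumes ",eggplant" together with the first egg, so the search resumes at the last comma and the final egg has nothing valid to its left.

```python
import re

# The expression from the question, unchanged.
original = re.compile(r"((?:^|\w+,)egg(?:$|,\w+))+")

text = "egg,eggplant,egg"

# Collect every non-overlapping match as (matched text, start, end).
matches = [(m.group(0), m.start(), m.end()) for m in original.finditer(text)]

# Only one match is found. It spans the first egg AND ",eggplant",
# because (?:$|,\w+) consumes the neighbouring item instead of just checking it.
print(matches)  # [('egg,eggplant', 0, 12)]

# What is left for the next search attempt starts at the last comma.
remaining = text[matches[0][2]:]
print(remaining)  # ,egg
```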

So, what can I do to fix the expression and find all the instances of a word in a string that represents a list?

Upvotes: 1

Views: 244

Answers (1)

Wiktor Stribiżew

Reputation: 626738

You can use

(?<![^,])egg(?![^,])

Or its less efficient equivalent:

(?<=,|^)egg(?=,|$)

See the regex demo. Details:

  • (?<![^,]) - a negative lookbehind that requires start of string or comma to appear immediately to the left of the current location
  • egg - a word
  • (?![^,]) - a negative lookahead that requires end of string or comma to appear immediately to the right of the current location.
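As a minimal sketch of how the first pattern might be used, here is a Python example. Python's re module and the helper name item_positions are assumptions made for illustration and are not part of the answer. Because lookarounds do not consume characters, neighbouring items stay available for later matches.

```python
import re

def item_positions(text, item):
    """Return the start index of every exact occurrence of `item`
    in a comma-separated string (hypothetical helper)."""
    # re.escape keeps items containing regex metacharacters literal.
    # (?<![^,]) : no non-comma character directly to the left
    # (?![^,])  : no non-comma character directly to the right
    pattern = re.compile(r"(?<![^,])" + re.escape(item) + r"(?![^,])")
    return [m.start() for m in pattern.finditer(text)]

# The edge case from the question: both egg items are found, eggplant is skipped.
print(item_positions("egg,eggplant,egg", "egg"))         # [0, 13]
print(item_positions("egg,eggplant,orange,egg", "egg"))  # [0, 20]
```

Only the first pattern is used here because, as far as I know, Python's re module requires fixed-width lookbehinds and so would likely reject the (?<=,|^) variant, while other flavors accept it.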


Upvotes: 2
