Reputation: 5626
According to the comments on this answer, (?:(?!ab).)*
is more efficient than (?!.*ab).*
. Why? Isn't a lookahead/lookbehind already non-capturing?
Basically, I'm trying to figure out if I need to do (\^(?:(?=(?:\d+=|\|$))))
or if I can just do (\^(?=\d+=|\|$))
. Both work to capture all ^
followed either by ###=
or |
.
Example:
1=5^2=A^3=6^|
I want to get three ^
matches (which I do). So, the question is: would I want to add the non-capturing groups if I'm already not capturing the contents of the lookahead?
Upvotes: 2
Views: 80
Reputation: 20909
Using non-capturing groups is useful for handling repeating patterns that you don't necessarily want to keep individually.
For example, let's say you're parsing out people's full names. A person can have any number of first and middle names, but only one last name. You want to capture their full name as well as their last name.
You know you can match the name segments with repeated \w+\s+
but because you don't know how many first/middle names the person has, this presents a problem.
You consider something like ^(\w+\s+)*(\w+)$
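. This will capture the last name... but it lands in group 2, behind a group 1 that you never wanted. Groups are numbered by their opening parentheses, so repeating group 1 does not shift the numbering; in most engines it simply ends up holding the last first/middle name it matched.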
That's where non-capturing groups come in. You need to repeat the \w+\s+
pattern, but you don't necessarily care about the specific values it grabs.
Now your expression looks like ^(?:\w+\s+)*(\w+)$
.
The full match is the person's whole name and capture group one is their last name, with no extra group in front of it.
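To see what each version actually stores, here is a small sketch in Python (picked only for illustration; the name is made up). In Python's re module a repeated capturing group keeps only its last repetition, so the non-capturing form mainly saves you from carrying around a group you have no use for.

```python
import re

# Hypothetical sample name, used only for this illustration.
name = "Mary Jane Ann Watson"

capturing = re.compile(r"^(\w+\s+)*(\w+)$")
non_capturing = re.compile(r"^(?:\w+\s+)*(\w+)$")

with_group = capturing.match(name)
without_group = non_capturing.match(name)

# group(0) is the whole match in both cases.
print(with_group.group(0))      # Mary Jane Ann Watson

# The repeated capturing group keeps only its last repetition.
print(with_group.groups())      # ('Ann ', 'Watson')

# With the non-capturing group, the last name is the only capture.
print(without_group.groups())   # ('Watson',)
```

Other engines may expose repeated captures differently, so treat the group contents shown here as Python behaviour rather than a universal rule.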
In your case, a look-ahead should suffice, but that doesn't mean non-capturing groups don't have their uses.
Upvotes: 2
Reputation: 336468
In your case, you don't need the non-capturing groups since the lookahead already limits the scope of the alternation:
(\^(?:(?=(?:\d+=|\|$))))
can be rewritten without change in functionality as
(\^(?=\d+=|\|$))
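If you want to check this yourself, a quick sketch in Python (my choice of language, the question doesn't name one) runs both patterns over the sample string from the question and compares where they match:

```python
import re

# Sample input taken from the question.
sample = "1=5^2=A^3=6^|"

verbose = re.compile(r"(\^(?:(?=(?:\d+=|\|$))))")
concise = re.compile(r"(\^(?=\d+=|\|$))")

# Start offsets of every match for each pattern.
verbose_hits = [m.start() for m in verbose.finditer(sample)]
concise_hits = [m.start() for m in concise.finditer(sample)]

print(verbose_hits)  # [3, 7, 11]
print(concise_hits)  # [3, 7, 11]

# Both patterns define exactly one capturing group.
print(verbose.groups, concise.groups)  # 1 1
```

Both find the same three ^ characters, and neither adds a capturing group beyond the outer one.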
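The example at the start of your question is a different matter, because the two patterns place the repetition differently: one repeats the lookahead before every character, the other checks it once before the repetition. Here there is a difference, not only in efficiency but also in the possible matches: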
(?:(?!ab).)*
matches xxx
in "xxxab"
whereas
(?!.*ab).*
matches b
.
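The difference is easy to reproduce. A minimal sketch in Python, searching the same "xxxab" input for the first match of each pattern:

```python
import re

text = "xxxab"

# Lookahead is re-checked before every character, so matching stops before "ab".
per_char = re.search(r"(?:(?!ab).)*", text)

# Lookahead is checked once at the start position; it fails wherever "ab"
# still lies ahead, so the first successful start is the final "b".
once = re.search(r"(?!.*ab).*", text)

print(repr(per_char.group()), per_char.start())  # 'xxx' 0
print(repr(once.group()), once.start())          # 'b' 4
```

The second pattern has to fail at offsets 0 through 3 before it succeeds at offset 4, which is also why it does more work on this input.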
Upvotes: 2