thomas
thomas

Reputation: 1174

Regex Non capturing group - Useless?

I'm trying to get my head around this concept, but I really can't see how it's useful, so I'm assuming I'm missing the point.

For example -

This regex /([0-9]+)(?:st|nd|rd|th)?/ will match numbers with or without a 'st', 'rd' etc. suffix.

So "1st".match(/([0-9]+)(?:st|nd|rd|th)?/g) returns ["1st"]

"1".match(/([0-9]+)(?:st|nd|rd|th)?/g) returns ["1"]

However, this still works exactly the same without the (?:) par!

"1st".match(/([0-9]+)(st|nd|rd|th)?/g) returns ["1st"]

Thanks...

Upvotes: 2

Views: 2822

Answers (4)

Bergi
Bergi

Reputation: 664297

You don't see a difference because your regex has the global flag, which when using match will only yield an array with all results from the whole string. Try this (without g) instead:

> "1st".match(/([0-9]+)(?:st|nd|rd|th)?/)
["1st", "1"]
> "1".match(/([0-9]+)(?:st|nd|rd|th)?/)
["1", "1"]
> "1st".match(/([0-9]+)(st|nd|rd|th)?/)
["1st", "1", "st"]

Using the exec method would have the same behaviour.

Upvotes: 2

Explosion Pills
Explosion Pills

Reputation: 191729

Non-capture grouping is faster because the regex engine doesn't have to keep track of the match. It can be a good idea not to capture what you don't need to capture for the sake of clarity. For example:

(foo|bar)((z|q)s?)?

This is somewhat contrived, but you could easily apply it to a real regex. You can match fooz or foozs. We are interested in the foo and bar part as well as z or q, but we don't care about the optional s. So which part is z or q? Is it capture group 2 or 3? Imagine if we had (?:(z|q) instead. Now, we know there are only two capture groups so we don't have to make this mental jump.


Sometimes non-capturing is necessary such as for JavaScript's .split.

If separator contains capturing parentheses, matched results are returned in the array.

If you want to use grouping for splits but you don't want to include the split regex in the array, you must use non-capturing groups.

Upvotes: 6

Baruch Oxman
Baruch Oxman

Reputation: 1636

From python 're' documentation: https://docs.python.org/2/library/re.html

A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.

Upvotes: 0

pdw
pdw

Reputation: 8866

It's useful when you have very complex regexes with many capturing groups, and you don't want to clutter your code with groups that solely exist for structuring the regexp. You want to avoid the "Group 1 is data A, group 2 is data B, groups 3 and 4 are junk, group 5 is data C" scenario.

Non-capturing groups are also slightly faster.

Upvotes: 1

Related Questions