Jefferson Pugliese

Reputation: 51

Non-Capturing and Capturing Groups - The right way

I'm trying to match an array of elements preceded by a specific string in a line of text. For example, match all pets in the text below:

fruits:apple,banana;pets:cat,dog,bird;colors:green,blue

/(?:pets:)(\w+[,|;])+/g

Using this regex, I could only capture the last word, "bird".

Can anybody help me to understand the right way of using Non-Capturing and Capturing Groups?

Thanks!

Upvotes: 3

Views: 22826

Answers (2)

wp78de

Reputation: 18950

Since you want each pet in a separate match and you are using PCRE, \G is, as suggested by Wiktor, a decent option:

(?:pets:)|\G(?!^)(\w+)(?:[,;]|$)

Explanation:

  • 1st Alternative (?:pets:) to find the start of the pattern
  • 2nd Alternative \G(?!^)(\w+)(?:[,;]|$)
    • \G asserts position at the end of the previous match or the start of the string for the first match
    • Negative lookahead (?!^) asserts that this alternative does not match at the start of the string
    • (\w+) matches and captures each pet
    • Non-capturing group (?:[,;]|$) serves as the delimiter: it matches a single , or ; character, or $ asserts position at the end of the string

Perl Code Sample:

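use strict;
use warnings;
use Data::Dumper;

my $str = 'fruits:apple,banana;pets:cat,dog,bird;colors:green,blue';
my $regex = qr/(?:pets:)|\G(?!^)(\w+)(?:[,;]|$)/mp;
my @result;

# Each /g iteration resumes where the previous match ended, which is the position \G anchors to.
while ( $str =~ /$regex/g ) {
    # $1 is undefined when the first alternative (pets:) matched, so skip that match.
    if (defined $1) {
        #print "$1\n";
        push @result, $1;
    }
}
print Dumper(\@result);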

Upvotes: 3

Kevin DS.

Reputation: 131

First, let's talk about capturing and non-capturing group:

  • (?:...) is the non-capturing version: the pattern must match this text, but you don't need it in the result
  • (...) is the capturing version: the pattern must match this text, and you want it back in the result

So:

(?:pets:) means you are searching for "pets:" but don't want to capture it. After that point, you WANT to capture (if I've understood correctly).

So try (?:pets:)([a-zA-Z,]+); ... You're searching for "pets:" (but don't want it!), capturing the list that follows, and stopping at the first ";" (which you don't want either).

Result: Match 1, group 1: cat,dog,bird
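
To make the difference concrete, here is a minimal sketch in Python (used only for illustration; the variable names are my own). It runs the regex from the question next to the one suggested above. In the question's regex the capturing group sits under the + quantifier, so each repetition overwrites it and only the last iteration survives. In the suggested regex the repetition is inside the group, so the whole list is captured at once.

```python
import re

text = "fruits:apple,banana;pets:cat,dog,bird;colors:green,blue"

# Regex from the question: group 1 is repeated by `+`, so every
# iteration overwrites it and only the last iteration is kept.
repeated = re.search(r"(?:pets:)(\w+[,|;])+", text)
print(repeated.group(0))  # the whole match, including "pets:"
print(repeated.group(1))  # only the last repetition of group 1

# Regex from this answer: the repetition happens inside the group,
# so group 1 holds the complete list.
whole_list = re.search(r"(?:pets:)([a-zA-Z,]+);", text)
print(whole_list.group(1))
print(whole_list.groups())  # (?:pets:) does not create a group
```

Note that in both cases "pets:" is still part of the overall match (group 0). A non-capturing group only means that no numbered group is created for it.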

A better solution exists with 1 match == 1 pet.
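
If your regex engine has no \G anchor (Python's built-in re module, for example, does not support it), one way to get one result per pet is a two-step approach: capture the whole list first, then split it. This is a sketch under that assumption, not the \G technique itself, and extract_items is a hypothetical helper name.

```python
import re


def extract_items(text, key):
    """Return the comma-separated items listed after `key:` (hypothetical helper)."""
    # Anchor the key at the start of the string or right after a ';'
    # and capture everything up to the next ';' (or the end of the string).
    match = re.search(r"(?:^|;)" + re.escape(key) + r":([^;]*)", text)
    if match is None:
        return []
    # Split the captured list and drop empty entries.
    return [item for item in match.group(1).split(",") if item]


text = "fruits:apple,banana;pets:cat,dog,bird;colors:green,blue"
pets = extract_items(text, "pets")
print(pets)
```

The same helper works for the other keys in the sample line, and returns an empty list when the key is absent.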

Upvotes: 7
