Joel Berger
Joel Berger

Reputation: 20280

Count the capture groups in a qr regex?

I am working on a project which at one point gets a list of files from an ftp server. At that point it either returns an arrayref of files OR if an optional regex reference (i.e. qr), is passed it filters the list down using grep. Further if that qr has a capture group, it treats the captured section as a version number and returns instead a hashref where the keys are the versions and the values are the file names (which would have been returned as the array if no capture groups). The code looks like (simplified slightly)

sub filter_files {
  my ($files, $pattern) = @_;
  my @files = @$files;
  unless ($pattern) {
    return \@files;
  }

  @files = grep { $_ =~ $pattern } @files;
  carp "Could not find any matching files" unless @files;

  my %versions = 
    map { 
      if ($_ =~ $pattern and defined $1) { 
        ( $1 => $_ )
      } else {
        ()
      }
    } 
    @files;

  if (scalar keys %versions) {
    return \%versions;
  } else {
    return \@files;
  }
}

This implementation tries to create the hash and returns it if it succeeds. My question, is can I detect that the qr has a capture group and only attempt to create the hash if it does?

Upvotes: 10

Views: 1594

Answers (3)

daxim
daxim

Reputation: 39158

See nparen in Regexp::Parser.

use strictures;
use Carp qw(carp);
use Regexp::Parser qw();
my $parser = Regexp::Parser->new;

sub filter_files {
    my ($files, $pattern) = @_;
    my @files = @$files;
    return \@files unless $pattern;

    carp sprintf('Could not inspect regex "%s": %s (%d)',
        $pattern, $parser->errmsg, $parser->errnum)
        unless $parser->regex($pattern);

    my %versions;
    @files = map {
        if (my ($capture) = $_ =~ $pattern) {
            $parser->nparen
                ? push @{ $versions{$capture} }, $_
                : $_
        } else {
            ()
        }
    } @files;
    carp 'Could not find any matching files' unless @files;

    return (scalar keys %versions)
        ? \%versions
        : \@files;
}

Another possibility to avoid inspecting the pattern is to simply rely on the value of $capture. It will be 1 (Perl true value) in the case of a successful match without capture. You can distinguish it from the occasional capture returning 1 because that one lack the IV flag.

Upvotes: 4

toolic
toolic

Reputation: 62121

You could use YAPE::Regex to parse the regular expression to see if there is a capture present:

use warnings;
use strict;
use YAPE::Regex;

filter_files(qr/foo.*/);
filter_files(qr/(foo).*/);

sub filter_files {
    my ($pattern) = @_;
    print "$pattern ";
    if (has_capture($pattern)) {
        print "yes capture\n";
    }
    else {
        print "no capture\n";
    }
}

sub has_capture {
    my ($pattern) = @_;
    my $cap = 0;
    my $p = YAPE::Regex->new($pattern);
    while ($p->next()) {
        if (scalar @{ $p->{CAPTURE} }) {
            $cap = 1;
            last;
        }
    }
    return $cap;
}

__END__

(?-xism:foo.*) no capture
(?-xism:(foo).*) yes capture

Upvotes: 3

Qtax
Qtax

Reputation: 33918

You could use something like:

sub capturing_groups{
    my $re = shift;
    "" =~ /|$re/;
    return $#+;
}

say capturing_groups qr/fo(.)b(..)/;

Output:

2

Upvotes: 20

Related Questions