Reputation:
So I'm making a program that stats git repositories but I'm having trouble getting a certain regular expression to work. Basically, I have a string that looks like this:
my $string = "5 2 gitc";
and a regular expression that looks like this:
my ($added, $removed) = $string =~ /([0-9]*) *([0-9]*) *[a-z]*/;
My goal is to store the first number as $added and the second number as $removed, but for some reason no value is being stored in $removed. So if I use the print statement:
print "-$added $removed-\n";
the output looks like:
-5 -
When I test that regular expression on regex101, my capture groups appear to work fine, so I'm stumped as to why it doesn't work here. Can anyone see a problem with my regular expression?
Upvotes: 0
Views: 80
Reputation: 2808
As Kyle pointed out in the comments: if the digits have to be there, then use + instead of * to reduce the number of possible matches the RE engine has to search through. Also, since \s matches "whitespace" (defined here as [\ \t\r\n\f]), you can cover the possibility of tab characters throwing the match off by using it instead of a literal space character.
Using \s to match whitespace also frees up the literal space character to assist with formatting in the regex itself. To do that, use 'extended mode' regexes by adding the /x modifier to the end of the pattern.
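To see why the quantifiers matter, here is a minimal sketch. It assumes the real input is tab-separated (as git's numstat output is) rather than space-separated as in the question's sample; that is a guess about the data, not something the question confirms. The names $added2 and $removed2 are only there for illustration.

```perl
use strict;
use warnings;

# Assumed input: the same fields as in the question, but separated by tabs.
my $string = "5\t2\tgitc";

# Original pattern: every piece may match the empty string, so after "5"
# the rest of the pattern happily matches nothing at the tab.
my ($added, $removed) = $string =~ /([0-9]*) *([0-9]*) *[a-z]*/;
print "-$added $removed-\n";      # prints "-5 -"

# Requiring at least one digit (+) and allowing any whitespace (\s)
# forces the engine to find both numbers.
my ($added2, $removed2) = $string =~ /(\d+)\s+(\d+)\s+[a-z]+/;
print "-$added2 $removed2-\n";    # prints "-5 2-"
```

With the original pattern the match still succeeds, which is why nothing warns about an undefined value: $removed is simply an empty string.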
Finally, as a general rule, test for a successful match before assigning to variables, like so:
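my $string = "5 2 gitc";
# Under /x, literal spaces in the pattern are ignored, so the whitespace
# between the second number and the name must be matched with \s+ as well.
if ($string =~ /(\d+) \s+ (\d+) \s+ [a-z]+/x) {
    my ($added, $removed) = ($1, $2);
    print "-$added $removed-\n";
}
else {
    print "Failed match\n";
}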
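The same check can be folded into the assignment itself, because a match in list context returns its captures on success and an empty list on failure. The following is a rough sketch over a few made-up input lines; @lines, $total_added and $total_removed are hypothetical names, and the leading ^ anchor is an extra assumption that each line starts with the first number.

```perl
use strict;
use warnings;

# Hypothetical sample lines; the second one is missing its second number.
my @lines = ("5\t2\tgitc", "7 gitc", "10   3   main");

my ($total_added, $total_removed) = (0, 0);
for my $line (@lines) {
    # The list assignment is false in the if-test when the match fails,
    # so $added and $removed are only used after a successful match.
    if (my ($added, $removed) = $line =~ /^ (\d+) \s+ (\d+) \s+ [a-z]+/x) {
        $total_added   += $added;
        $total_removed += $removed;
    }
    else {
        print "Failed match: $line\n";
    }
}
print "added=$total_added removed=$total_removed\n";
```

This keeps the captures scoped to the block that uses them and avoids reading $1 and $2 left over from an earlier match.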
Upvotes: 5