Reputation: 483
I want to understand how a regular expression behaves in Perl.
$str = "123-abc 23-rr";
I need to extract both words on either side of the minus sign. The regular expression is:
@mas=$str=~/(?:([\d\w]+)\-([\d\w]+))/gx;
And it shows the right output: 123, abc, 23, rr.
But if I change the string a little and put one word at the start:
$str = "word 123-abc 23-rr";
And I want to account for this first word, so I change my regexp:
@mas=$str=~/\w+\s(?:\s*([\d\w]+)\-([\d\w]+))*/gx;
The output should be the same, but instead I get: 23, rr. If I remove \s* or *, the output is 123, abc. But that's still not right. Does anyone know why?
Upvotes: 0
Views: 118
Reputation: 164769
Rather than making an ever more specific regex for an ever more specific string, consider taking advantage of the overall pattern.
First split the pieces on whitespace.
my @pieces = split /\s+/, $str;
Then remove the first piece; it doesn't need to be split.
my $word = shift @pieces;
Then split each piece on - into pairs.
my %pairs = map { split /-/, $_ } @pieces;
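Put together, the three steps above could look like the sketch below. It assumes the string from the question, and that every piece after the first contains exactly one hyphen. The printing loop is only there for illustration.

```perl
use strict;
use warnings;

my $str = "word 123-abc 23-rr";

# Split the string into whitespace-separated pieces.
my @pieces = split /\s+/, $str;

# The first piece is the leading word; take it off the list.
my $word = shift @pieces;

# Split each remaining piece on the hyphen; map flattens the
# results into a key/value list that initializes the hash.
my %pairs = map { split /-/, $_ } @pieces;

print "word: $word\n";
for my $key (sort keys %pairs) {
    print "$key => $pairs{$key}\n";
}
```

Note that a hash does not keep the original order and collapses duplicate left-hand words. If either matters, collecting the pairs into an array would likely be a better fit.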
Upvotes: 1
Reputation: 385657
For each match, each capture is returned.
In the first snippet, the pattern matches twice.
123-abc 23-rr
\_____/ \___/
There are two captures, so four (2*2=4) values are returned.
In the second snippet, the pattern matches once.
word 123-abc 23-rr
\________________/
There are two captures, so two (2*1=2) values are returned.
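To see the counting in action, here is a small sketch that runs both patterns from the question and prints how many values come back. In the second pattern the capture groups sit inside a repeated group, and a repeated capture group keeps only what its last iteration matched, which is why 23 and rr are returned. The last part is one possible workaround and is my own assumption, not something from the question: strip the leading word first, then apply a simple global match to the rest (`$rest` and `@all` are illustrative names).

```perl
use strict;
use warnings;

# First snippet: the pattern matches twice, two captures per match.
my @first = "123-abc 23-rr" =~ /(?:([\d\w]+)\-([\d\w]+))/gx;
print scalar(@first), " values: @first\n";

# Second snippet: the pattern matches once. The group repeats twice
# inside that single match, but $1 and $2 hold only the last repetition.
my $str = "word 123-abc 23-rr";
my @second = $str =~ /\w+\s(?:\s*([\d\w]+)\-([\d\w]+))*/gx;
print scalar(@second), " values: @second\n";

# Possible workaround (assumption): capture everything after the
# leading word, then let /g match each pair separately.
# \w already includes digits, so [\d\w] can be written as \w.
my @all;
if (my ($rest) = $str =~ /^\w+\s+(.*)/) {
    @all = $rest =~ /(\w+)-(\w+)/g;
}
print scalar(@all), " values: @all\n";
```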
Upvotes: 1