Reputation: 16067
I'm currently reading XML tags from a file, but I have reduced the problem to this simple example:
#!/usr/bin/perl
use strict;
use warnings;
my $str = '<tag x="20" y="7" x="15" z="14"/>';
if($str =~ /<tag.*(x|y|z)=\"(\d+)\".*(x|y|z)=\"(\d+)\".*(x|y|z)=\"(\d+)\".*\/>/){
print "$1-$2\n";
print "$3-$4\n";
print "$5-$6\n";
}
As I understand my regex, the first x should match the first group, the first y the third group, and the second x the fifth group.
So I expect as output:
x-20
y-7
x-15
But I get:
y-7
x-15
z-14
Could someone explain what's happening here?
Upvotes: 0
Views: 44
Reputation: 57650
Instead of .* use \s+ between the attributes, because you actually want to match whitespace there, not arbitrary characters.
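A minimal sketch of that change, assuming the attributes are separated only by whitespace. The helper name first_three_attrs is made up for illustration, and the trailing .* is deliberately kept so that any attributes after the third one are still skipped:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the question): returns the first three
# name/value pairs of a <tag .../> element, or an empty list on no match.
sub first_three_attrs {
    my ($str) = @_;
    # \s+ can only consume whitespace, so each group is pinned to the
    # next attribute in order. The final .* skips any remaining attributes.
    if ($str =~ /<tag\s+(x|y|z)="(\d+)"\s+(x|y|z)="(\d+)"\s+(x|y|z)="(\d+)".*\/>/) {
        return ([$1, $2], [$3, $4], [$5, $6]);
    }
    return;
}

my $str = '<tag x="20" y="7" x="15" z="14"/>';
print "$_->[0]-$_->[1]\n" for first_three_attrs($str);
```

With the string from the question this should print x-20, y-7 and x-15, one per line.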
If this is really an assignment, you should do it properly, and a regular expression is not the proper tool for XML. Since it is an assignment, just write a parser. It is easier than you think.
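As a rough idea of what a tiny hand-written parser could look like, here is a sketch that only handles the name="digits" attributes from the question. The helper name parse_attrs is hypothetical, and this is not a real XML parser; for real XML a dedicated module such as XML::LibXML is usually the better choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collects every name="digits" pair, in document
# order, from the first self-closing <tag .../> element in the string.
sub parse_attrs {
    my ($str) = @_;
    my @pairs;
    # Step 1: isolate the attribute text between "<tag" and "/>".
    if (my ($attrs) = $str =~ /<tag\b([^>]*?)\/>/) {
        # Step 2: walk the attributes one at a time; with /g in scalar
        # context each iteration resumes where the previous match ended.
        while ($attrs =~ /(\w+)="(\d+)"/g) {
            push @pairs, [$1, $2];
        }
    }
    return @pairs;
}

my $str = '<tag x="20" y="7" x="15" z="14"/>';
print "$_->[0]-$_->[1]\n" for parse_attrs($str);
```

Because it loops instead of hard-coding three groups, this version reports all four attributes of the sample tag, in order.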
Upvotes: 1
Reputation: 50637
Use ? to make the * and + quantifiers non-greedy; they are greedy by default (i.e. .* matches as many characters as possible):
$str =~ /<tag.*?(x|y|z)=\"(\d+)\".*?(x|y|z)=\"(\d+)\".*?(x|y|z)=\"(\d+)\".*\/>/
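To see the difference side by side, here is a small sketch that runs both variants on the string from the question. The helper pairs_for and the variables $attr, $greedy and $lazy are names invented for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $str = '<tag x="20" y="7" x="15" z="14"/>';

# One attribute: a name (x, y or z) and a numeric value, two capture groups.
my $attr = qr/(x|y|z)="(\d+)"/;

# Greedy: each .* grabs as much as it can, then backtracks only as far
# as needed, so the groups land on the LAST three attributes.
my $greedy = qr/<tag.*$attr.*$attr.*$attr.*\/>/;

# Non-greedy: each .*? expands only as far as needed, so the groups
# land on the FIRST three attributes.
my $lazy = qr/<tag.*?$attr.*?$attr.*?$attr.*\/>/;

# Hypothetical helper: returns the three captured pairs as "name-value".
sub pairs_for {
    my ($re, $s) = @_;
    my @m = $s =~ $re;
    return unless @m == 6;
    return ("$m[0]-$m[1]", "$m[2]-$m[3]", "$m[4]-$m[5]");
}

print "greedy:     ", join(" ", pairs_for($greedy, $str)), "\n";
print "non-greedy: ", join(" ", pairs_for($lazy, $str)), "\n";
```

The greedy line should reproduce the output from the question (y-7 x-15 z-14), while the non-greedy line should give the expected x-20 y-7 x-15.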
Upvotes: 1