Reputation: 1989
I have the following string in $str:
assign (rregbus_z_partially_resident | regbus_s_partially_resident | reg_two | )regbus_;
I want to parse this line and capture into an array every substring matching reg_\w+ or regbus_\w+ that is preceded by a non-word character.
So in the above example, I want to capture only regbus_s_partially_resident and reg_two.
I tried this, and it did not work:
my (@all_matches) = ($str =~ m/\W(reg_\w+)|\W(regbus_\w+)/g);
Since I am using \W, it is also copying the non-word character into the array, which I do not want.
Upvotes: 1
Views: 5519
Reputation: 66873
Your regex needs a small tweak:
my @all_matches = $str =~ m/\W(reg_\w+|regbus_\w+)/g;
or
my @all_matches = $str =~ m/\W( (?:reg|regbus)_\w+ )/gx;
or even something along the lines of
my @all_matches = $str =~ m/\W( reg(?:bus)?_\w+ )/gx;
The most suitable form depends on what patterns you may need and how this is used.
Or reduce the regex use to the heart of the problem by splitting first and filtering the pieces:
my @matches = grep { /^(?:reg_\w+|regbus_\w+)/ } split /\W/, $str;
which may be helpful if your strings and/or requirements grow more complex.
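To see what the split-then-filter approach produces on the string from the question, here is a minimal runnable sketch. The intermediate @tokens array is only there for illustration and is not part of the one-liner above; the filter pattern is the same idea written as reg(?:bus)?_\w+.

```perl
use strict;
use warnings;

# Sample input copied from the question.
my $str = 'assign (rregbus_z_partially_resident | regbus_s_partially_resident | reg_two | )regbus_;';

# Split on every single non-word character. Adjacent separators
# produce empty strings, which the filter below simply rejects.
my @tokens = split /\W/, $str;

# Keep only tokens that begin with reg_ or regbus_ followed by
# at least one more word character.
my @matches = grep { /^reg(?:bus)?_\w+/ } @tokens;

print "$_\n" for @matches;
# prints:
# regbus_s_partially_resident
# reg_two
```

Because each token is tested on its own, rregbus_z_partially_resident is rejected by the ^ anchor, and the trailing regbus_ is rejected because nothing follows the underscore.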
Upvotes: 2
Reputation: 385506
"its copying the non-word character also into the array list"
No, it doesn't.
$ perl -le'
my $str = "assign (rregbus_z_partially_resident | regbus_s_partially_resident | reg_two | )regbus_;";
my (@all_matches) = ($str =~ m/\W(reg_\w+)|\W(regbus_\w+)/g);
print $_ // "[undef]" for @all_matches;
'
[undef]
regbus_s_partially_resident
reg_two
[undef]
But you do have a problem: You have two capture groups, so you will get two values per match, and the one belonging to the alternative that did not match is undef.
Fix:
my @all_matches;
push @all_matches, $1 // $2 while $str =~ m/\W(reg_\w+)|\W(regbus_\w+)/g;
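If you would rather keep the two alternatives as written, another option (a sketch, not something the original code uses) is Perl's branch-reset group (?|...), available from Perl 5.10. Inside it, capture numbering restarts in each alternative, so both branches fill $1 and a list-context match returns one value per match.

```perl
use strict;
use warnings;

# Sample input copied from the question.
my $str = 'assign (rregbus_z_partially_resident | regbus_s_partially_resident | reg_two | )regbus_;';

# (?|...) makes both alternatives share capture group 1,
# so there is no undef placeholder for the branch that did not match.
my @all_matches = $str =~ m/(?|\W(reg_\w+)|\W(regbus_\w+))/g;

print "$_\n" for @all_matches;
# prints:
# regbus_s_partially_resident
# reg_two
```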
Far better:
my @all_matches = $str =~ m/\W(reg(?:bus)?_\w+)/g;
Even better:
my @all_matches = $str =~ m/\b(reg(?:bus)?_\w+)/g;
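The difference between the \W and \b versions shows up when a name sits at the very start of the string: \W must consume an actual character, while \b is a zero-width assertion that also holds at the start of the string. The short input below is a made-up example for illustration, not the string from the question.

```perl
use strict;
use warnings;

# Hypothetical input where the first name has no character before it.
my $str = 'reg_one | regbus_two';

# \W needs a real non-word character in front of the name,
# so the leading reg_one is skipped.
my @with_nonword = $str =~ m/\W(reg(?:bus)?_\w+)/g;

# \b only asserts a word boundary, which also exists at position 0.
my @with_boundary = $str =~ m/\b(reg(?:bus)?_\w+)/g;

print "with \\W: @with_nonword\n";    # with \W: regbus_two
print "with \\b: @with_boundary\n";   # with \b: reg_one regbus_two
```

On the original string both patterns return the same two names, since rregbus_z_partially_resident has no word boundary between its first two letters.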
Upvotes: 2