Reputation: 4016
Consider the following string:
blah, foo(a,b), bar(c,d), yo
I want to extract a list of strings:
blah
foo(a,b)
bar(c,d)
yo
It seems to me that I should be able to use quote words here, but I'm struggling with the regex. Can someone help me out?
Upvotes: 2
Views: 78
Reputation: 4709
Borodin gave a solution to one of your earlier questions, which is similar to this one. A small change to the regex will give you the desired output (note that this will not work for nested parentheses):
use strict;
use warnings;
use 5.010;
my $line = q<blah, foo(a,b), bar(c,d), yo>;
my @words = $line =~ / (?: \([^)]*\) | [^,] )+ /xg;
s/^\s+|\s+$//g for @words;   # [^,] also matches the space after each comma, so trim it
say for @words;
Output:
blah
foo(a,b)
bar(c,d)
yo
Upvotes: 1
Reputation: 10784
Perl has a little thing called regex recursion, so you might be able to look for either:
a bare word like blah, containing no parentheses (\w+), or
a "call", like \w+\((?R)(, *(?R))*\)
The total regex is (\w+(\((?R)(, ?(?R))*\))?), which seems to work.
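A minimal sketch of how that pattern might be applied to the string from the question. The variable names are illustrative. The loop collects $& (the whole match) rather than using the match in list context, because the pattern contains three capture groups and list context would return all of them.

```perl
use strict;
use warnings;

my $line = 'blah, foo(a,b), bar(c,d), yo';

my @items;
# Each /g iteration finds the next bare word or "call";
# (?R) recurses into the whole pattern for the arguments.
while ($line =~ /(\w+(\((?R)(, ?(?R))*\))?)/g) {
    push @items, $&;   # $& holds the full text of the current match
}

print "$_\n" for @items;
```

Since \w+ cannot match whitespace or commas, the items come out without leading spaces and need no trimming.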
Upvotes: 3
Reputation: 626738
You can use the following regex in split:
\([^()]*\)(*SKIP)(*F)|\s*,\s*
With \([^()]*\), we match a ( followed by 0 or more characters other than ( or ), and then a ). We fail the match with (*SKIP)(*F) if that parenthetical construction is found, and then we only match the comma surrounded by optional whitespace.
See demo
#!/usr/bin/perl
my $string= "blah, foo(a,b), bar(c,d), yo";
my @string = split /\([^()]*\)(*SKIP)(*F)|\s*,\s*/, $string;
foreach(@string) {
print "$_\n";
}
To account for commas inside nested balanced parentheses, you can use
my @string = split /\((?>[^()]|(?R))*\)(*SKIP)(*F)|\s*,\s*/, $string;
Here is an IDEONE demo
With \((?>[^()]|(?R))*\) we match all balanced ()s and fail the match if found with the verbs (*SKIP)(*F), and then we match a comma with optional whitespace around it (so as not to manually trim the strings later).
For the string blah, foo(b, (a,b)), bar(c,d), yo the result is:
blah
foo(b, (a,b))
bar(c,d)
yo
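As a cross-check for the nested case, here is a minimal sketch that avoids regex recursion and simply tracks parenthesis depth while scanning the string. This is a different technique from the split-based approach above, and split_top_level is a hypothetical helper name, not something from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas that sit at parenthesis depth 0.
sub split_top_level {
    my ($text) = @_;
    my ($depth, $current, @parts) = (0, '');
    for my $char (split //, $text) {
        if ($char eq ',' && $depth == 0) {
            push @parts, $current;   # a top-level comma ends the current item
            $current = '';
            next;
        }
        $depth++ if $char eq '(';
        $depth-- if $char eq ')' && $depth > 0;
        $current .= $char;
    }
    push @parts, $current;
    s/^\s+|\s+$//g for @parts;       # trim whitespace around each item
    return @parts;
}

my @items = split_top_level('blah, foo(b, (a,b)), bar(c,d), yo');
print "$_\n" for @items;
```

It is more verbose than a single regex, but the depth counter makes it easy to see why commas inside parentheses are left alone.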
Upvotes: 1