criz

Reputation: 273

Can Perl regex capture and split strings in order of their position?

I would like to split the sample strings below into pieces, where each piece is either a parenthesized group or the plain text between groups. The pieces must stay in the order in which they appear in the string, so that joining them reproduces the original string.

my (@strArr) = $str =~ /^(.*?)|(\((.*?)\))$/;

  1. abc(def)ghi
    Result: abc, (def), ghi

  2. abc(def)ghi(jkl)
    Result: abc, (def), ghi, (jkl)

  3. abcdef(ghi)
    Result: abcdef, (ghi)

  4. (abc)
    Result: (abc)

  5. (abcd)efg
    Result: (abcd), efg

Is this possible using only one line of regex code? The pieces need to be stored, in order, in @strArr.

Upvotes: 1

Views: 1180

Answers (3)

Casimir et Hippolyte

Reputation: 89547

Use negated character classes [^...]:

my (@strArr) = $str =~ /[^\s(]+|\([^)]*\)/g;

pattern details:

/
[^\s(]+    # one or more characters that aren't opening round brackets or white-spaces
|        # OR
\(         # a literal opening round bracket
[^)]*      # zero or more characters that aren't closing round brackets
\)         # a literal closing round bracket
/g         # perform a global search (return every match)
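For reference, the annotated pattern above can be written as runnable code by adding the /x modifier, which allows whitespace and comments inside the regex itself. This is only a sketch: the sub name tokenize is made up for illustration and is not part of the answer. Note that because the first alternative excludes whitespace, any spaces in the input are skipped rather than returned as pieces.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for illustration): returns the pieces in order.
sub tokenize {
    my ($str) = @_;
    return $str =~ /
        [^\s(]+      # one or more characters that are neither an opening bracket nor whitespace
      |              # OR
        \( [^)]* \)  # an opening bracket, any non-closing-bracket characters, a closing bracket
    /gx;
}

# The second sample contains a space to show that whitespace is skipped.
print join(", ", tokenize($_)), "\n" for 'abc(def)ghi', '(abcd) efg';
```

In list context, a /g match without capture groups returns every matched substring, which is why no explicit loop is needed.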

Upvotes: 1

Wiktor Stribiżew

Reputation: 626699

You can split the string with the pattern (\([^()]*\)), which matches a literal (, then zero or more characters other than ( and ), and then a literal ). Because the whole match is wrapped in capturing group 1, split also places the matched separators into the resulting array.

The only drawback is that you need to drop the empty fields (with grep {/\S/}), but the overall solution remains quite readable:

my $str = "abc(def)ghi";
my $regexp = qr/( \( [^()]* \) )/x;
my @strArr = grep {/\S/} split /$regexp/, $str;
print join(", ", @strArr);

Output of the demo code above: abc, (def), ghi.
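To see why the grep is needed, it may help to look at the raw output of split. The following is a sketch using the same pattern; the sample string is illustrative and not taken from the question. A separator at the very start of the string, or two adjacent separators, leaves an empty field in the list. Filtering with grep { length } is an alternative (an assumption here, not part of the answer) that drops only zero-length fields and keeps whitespace-only pieces, which matters if the joined pieces must reproduce the original string exactly.

```perl
use strict;
use warnings;

my $regexp = qr/( \( [^()]* \) )/x;

# Raw split output: the capture group keeps each "(...)" separator,
# but a leading separator or two adjacent separators yield an empty field.
my @raw = split /$regexp/, '(abc)(def) ghi';
print join('|', map { "[$_]" } @raw), "\n";

# Alternative filter: drop only zero-length fields, so the leading
# space in " ghi" (and any whitespace-only field) is preserved.
my @kept = grep { length } @raw;
print join('|', map { "[$_]" } @kept), "\n";
```

Each field is printed in square brackets so that empty fields are visible as [].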

Upvotes: 4

criz

Reputation: 273

I tried both Wiktor's and Casimir's examples. Both worked fine.

#!/usr/bin/perl
use strict;
use warnings;

my %testHash = (
    '0' => '',
    '1' => 'abc(def)ghi',
    '2' => 'abc(def)ghi(jkl)',
    '3' => 'abcdef(ghi)',
    '4' => '(abc)',
    '5' => '(abcd)efg'
);

# Solution 1
print "By Wiktor:\n";
foreach my $key ( sort keys %testHash ) {
    my $str = $testHash{$key};
    my $regexp = qr/( \( [^()]* \) )/x;
    my @strArr = grep {/\S/} split /$regexp/, $str;

    print "$str - ".join(", ", @strArr)."\n";
}

# Solution 2
print "\nBy Casimir:\n";
foreach my $key ( sort keys %testHash ) {
    my $str = $testHash{$key};
    my (@strArr) = $str =~ /[^\s(]+|\([^)]*\)/g;

    print "$str - ".join(", ", @strArr)."\n";
}

Output:

By Wiktor:
 -
abc(def)ghi - abc, (def), ghi
abc(def)ghi(jkl) - abc, (def), ghi, (jkl)
abcdef(ghi) - abcdef, (ghi)
(abc) - (abc)
(abcd)efg - (abcd), efg

By Casimir:
 -
abc(def)ghi - abc, (def), ghi
abc(def)ghi(jkl) - abc, (def), ghi, (jkl)
abcdef(ghi) - abcdef, (ghi)
(abc) - (abc)
(abcd)efg - (abcd), efg
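Since the question requires that the pieces join back into the original string, a round-trip check could be added alongside the comparison. This is a minimal sketch: the wrapper subs by_split and by_match are hypothetical names introduced here, and only the five test strings from the script are checked.

```perl
use strict;
use warnings;

my $regexp = qr/( \( [^()]* \) )/x;

# Hypothetical wrappers around the two posted solutions.
sub by_split { my ($s) = @_; return grep {/\S/} split /$regexp/, $s }
sub by_match { my ($s) = @_; return $s =~ /[^\s(]+|\([^)]*\)/g }

my @tests = ('abc(def)ghi', 'abc(def)ghi(jkl)', 'abcdef(ghi)', '(abc)', '(abcd)efg');

for my $str (@tests) {
    for my $solver (\&by_split, \&by_match) {
        # Joining the pieces with an empty separator should give the input back.
        my $joined = join '', $solver->($str);
        my $status = $joined eq $str ? 'ok' : 'MISMATCH';
        print "$status - $str\n";
    }
}
```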

Upvotes: 1
