Apad

Reputation: 67

Recursively extract strings between quotes in a given string

Given a string that contains substrings in double quotes, I want to extract all of those substrings.

I have written the following piece of code, but something tells me that it is ugly (although it does seem to do the trick):

my $str = 'printf ("hellp;world", and "this is ; also" and )';

loop:
if ($str =~ /"(.*?)"/) {
    my $substr = $1;
    $str =~ s/"$substr"//;
    print "$substr\n";
}
if ($str =~ /"/) {
    goto loop;
}

Running it prints both substrings:

$ perl quotes.pl
hellp;world
this is ; also

So it does work as expected.

Upvotes: 0

Views: 174

Answers (1)

melpomene

Reputation: 85767

You can do that directly by using the /g regex flag in either scalar context:

while ($str =~ /"([^"]*)"/g) {
    print "$1\n";
}

... or list context:

for my $match ($str =~ /"([^"]*)"/g) {
    print "$match\n";
}

I've also changed .*? to [^"]* because it's better to be specific about what you want to match.
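
To illustrate why the more specific pattern is safer, here is a small sketch. The sample string and the helper names first_lazy and first_negated are made up for this example and are not part of the question. As soon as the pattern requires something after the closing quote (here a semicolon), .*? is free to run across quote characters to make the overall match succeed, whereas [^"]* can never cross one.

```perl
use strict;
use warnings;

# Hypothetical helpers: return the first quoted string that is
# immediately followed by a semicolon, or undef if there is none.
sub first_lazy    { my ($s) = @_; return $s =~ /"(.*?)";/   ? $1 : undef }
sub first_negated { my ($s) = @_; return $s =~ /"([^"]*)";/ ? $1 : undef }

my $sample = '"a" "b";';
print first_lazy($sample),    "\n";   # prints: a" "b
print first_negated($sample), "\n";   # prints: b
```

For the simple pattern in the question both forms find the same substrings; the difference only shows up once the regex grows.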

/g is documented in perldoc perlop:

The /g modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.

In scalar context, each execution of m//g finds the next match, returning true if it matches, and false if there is no further match. The position after the last match can be read or set using the pos() function; see "pos" in perlfunc. A failed match normally resets the search position to the beginning of the string, but you can avoid that by adding the /c modifier (for example, m//gc). Modifying the target string also resets the search position.

(Emphasis mine.)
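
As a rough sketch of the scalar-context behaviour described in that quote, the loop below records pos() after each successful match. The helper name quoted_with_pos is an assumption made for illustration; only m//g and pos() come from the documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of [substring, offset] pairs, where
# offset is pos() right after the match, i.e. just past the closing quote.
sub quoted_with_pos {
    my ($s) = @_;
    my @found;
    # In scalar context each iteration resumes where the last match ended.
    while ($s =~ /"([^"]*)"/g) {
        push @found, [$1, pos($s)];
    }
    return @found;
}

my $str = 'printf ("hellp;world", and "this is ; also" and )';
printf "%-15s ends at offset %d\n", @$_ for quoted_with_pos($str);
```

Because the string itself is never modified inside the loop, the search position keeps advancing and the loop ends at the first failed match.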

Upvotes: 3
