Al Berger

Reputation: 1068

Using multiline regexps in Perl 5

I have read the documentation on multiline regexes in Perl 5, but I still cannot figure out why the following two don't work as intended:

#!/usr/bin/perl

use v5.20;

my $s = <<'ENDSTR';
aaa       : AAA
bbb       : BBB
ccc       : CCC
ENDSTR

my $m = 'bbb';

my $a = $s =~ s/.*^$m *: (.*?)$.*/$1/rsm;
my $b = $s =~ s/[.\n]*?^$m *: (.*)$[.\n]*/$1/rm;

print "a: $a\n";
print "b: $b\n";

The intended output of the program is

a: BBB
b: BBB

But these regexes produce:

a: BBB
ccc       : CCC

b: aaa       : AAA
bbb       : BBB
ccc       : CCC  

How can I correct these regexes to get the intended matches?

Upvotes: 1

Views: 67

Answers (3)

Gaurav

Reputation: 1918

I think it might be easier to split the string on both \s*:\s* and \n. You can then build a hash directly from the resulting list. Note that this approach breaks if any of your values contains a :, whereas your regular expression would handle that case. The following code works for me:

#!/usr/bin/perl

use v5.20;

my $s = <<'ENDSTR';
aaa       : AAA
bbb       : BBB
ccc       : CCC
ENDSTR

my %hash = split(/\s*:\s*|\n/, $s);
say $hash{'bbb'};

If you're trying to parse data in that format, you should try using Config::General, which can parse a simple configuration file format that's pretty similar to what you have, but also supports comments, blocks, and other cool things.
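
On the caveat about values that contain a :, here is a minimal sketch of one way around it, assuming every line has the form "key : value". It splits each line at the first colon only, by passing a limit of 2 to split. The sample value BBB:with:colons is made up for illustration and is not from the question.

```perl
#!/usr/bin/perl
use v5.20;
use warnings;

# Hypothetical sample data: the second value contains colons on purpose.
my $s = <<'ENDSTR';
aaa       : AAA
bbb       : BBB:with:colons
ccc       : CCC
ENDSTR

my %hash;
for my $line (split /\n/, $s) {
    # Limit of 2: split only at the first colon, keep the rest as the value.
    my ($key, $value) = split /\s*:\s*/, $line, 2;
    next unless defined $value;    # skip lines that have no colon
    $hash{$key} = $value;
}

say $hash{'bbb'};    # prints "BBB:with:colons"
```

Going line by line costs a few more lines than the one-line split, but it keeps keys and values paired even when a value is empty or contains the separator.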

Upvotes: 0

JGNI

Reputation: 4013

With the s flag, you are allowing the . metacharacter to match line endings. Either remove the flag or change the .* at the end of the regex to .*?
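
To see what the s flag changes, here is a small sketch that runs the first pattern from the question with and without it. One assumption worth flagging: the end-of-line anchor is written as (?:$) rather than a bare $, because Perl interpolates $. inside a pattern as the input-line-number variable (and $[ likewise as a variable), which appears to be what produces the output shown in the question. The \Q...\E around $m is also an addition, so that the key is matched literally.

```perl
#!/usr/bin/perl
use v5.20;
use warnings;

my $s = <<'ENDSTR';
aaa       : AAA
bbb       : BBB
ccc       : CCC
ENDSTR

my $m = 'bbb';

# /m: ^ and $ match at line boundaries. /s: . also matches "\n".
# (?:$) keeps the anchor from being parsed as the variable $.
my $with_s = $s =~ s/.*^\Q$m\E *: (.*?)(?:$).*/$1/rsm;

# Without /s, . stops at newlines, so only the matching line is rewritten
# and the other lines are left in place.
my $without_s = $s =~ s/.*^\Q$m\E *: (.*?)(?:$).*/$1/rm;

say "with /s:    [$with_s]";
say "without /s: [$without_s]";
```

With the anchor written this way, the s variant replaces the whole string with the captured value, while the variant without s rewrites only the bbb line.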

Upvotes: 0

Al Berger

Reputation: 1068

On perlmonks.org I was given a working variant:

my $a = $1 if  $s =~ s/^$m *: (.*?)$/$1/rsm;
my $b = $1 if  $s =~ s/^$m *: (.*)$/$1/rm;
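
Since only the captured value is needed, the same result can probably be had with a plain match instead of a substitution. The sketch below is an alternative and not what was suggested on perlmonks.org. It assumes the same $s and $m as in the question, uses a list-context match to get the capture, and avoids the "my ... if ..." form, whose behaviour perlsyn describes as undefined. The variable name $value is arbitrary.

```perl
#!/usr/bin/perl
use v5.20;
use warnings;

my $s = <<'ENDSTR';
aaa       : AAA
bbb       : BBB
ccc       : CCC
ENDSTR

my $m = 'bbb';

# In list context a match returns its captures.
# /m makes ^ and $ match at the start and end of each line;
# without /s, (.*) cannot run past the end of the line.
# \Q...\E quotes any regex metacharacters in the key.
my ($value) = $s =~ /^\Q$m\E *: (.*)$/m;

say defined $value ? "value: $value" : "no match";    # prints "value: BBB"
```

Here the $ anchor is the last character of the pattern, so it is not followed by anything Perl could take for a variable name.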

Upvotes: 1
