KingBob

Reputation: 37

Parenthesis text capturing (Perl RegEx)

I'm back with a follow-up to this question. Let's assume I have the text

====Example 1====
Some text that I want to get that
may include line breaks
or special ~!@#$%^&*() characters

====Example 2====
Some more text that I don't want to get.

and use $output = ($text =~ /====Example 1====\s*(.*?)\s*====/); to try and get everything from "====Example 1====" to the four equal signs right before "Example 2".

Based on what I've seen on this site, on regexpal.com, and from running it myself, Perl finds and matches the text, but $output either stays undefined or is assigned "1". I'm pretty sure that I'm doing something wrong with the capturing parentheses, but I can't figure out what. Any help would be appreciated. My full code is:

$text = "====Example 1====\n
Some text that I want to get this text\n
may include line breaks\n
or special ~!@#$%^&*() characters\n
\n
====Example 2====]\n
Some more filler text that I don't want to get.";
my ($output) = $text =~ /====Example 1====\s*(.*?)\s*====/;
die "un-defined" unless defined $output;
print $output;

Upvotes: 1

Views: 247

Answers (2)

Tim Kennedy

Reputation: 6120

Two things.

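  1. Apply the /s flag to the regex so that . also matches newlines; without it, (.*?) cannot span multiple lines of input.
  2. Move your parentheses so they are around $output instead of around the whole match, ($text =~ regex). That puts the match in list context, so it returns the captured text rather than a true/false value.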

Example:

($output) = $text =~ /====Example\s1====\s*(.*?)\s*====/s;
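
To see why the position of the parentheses matters, here is a minimal sketch comparing the two forms. The shortened sample text and the variable names $scalar and $list are made up for illustration and are not from the question:

```perl
use strict;
use warnings;

# Shortened, hypothetical sample text with the same section markers.
my $text = "====Example 1====\nwanted text\n====Example 2====\n";

# Scalar context: the match returns true (1) or false, not the capture.
my $scalar = ($text =~ /====Example 1====\s*(.*?)\s*====/s);

# List context: parentheses around the variable make the match
# return its captures, and the first one lands in $list.
my ($list) = $text =~ /====Example 1====\s*(.*?)\s*====/s;

print "scalar context: $scalar\n";   # prints "scalar context: 1"
print "list context:   $list\n";     # prints "list context:   wanted text"
```

This is the "1" the question mentions: with the parentheses around the match, the assignment is a scalar one and only reports whether the regex matched.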

Putting it into a script like:

#!/usr/bin/env perl

$text="
====Example 1====
Some text that I want to get that
may include line breaks
or special ~!@#$%^&*() characters

====Example 2====
Some more text that I don't want to get.
";

print "full text:","\n";
&hr;
print "$text","\n";
&hr;

($output) = $text =~ /====Example\s1====\s*(.*?)\s*====/s;
print "desired output of regex:","\n";
&hr;
print "$output","\n";
&hr;

sub hr {
        print "-" x 80, "\n";
}

produces output like:

bash$ perl test.pl
--------------------------------------------------------------------------------
full text:
--------------------------------------------------------------------------------

====Example 1====
Some text that I want to get that
may include line breaks
or special ~!@#0^&*() characters

====Example 2====
Some more text that I don't want to get.

--------------------------------------------------------------------------------
desired output of regex:
--------------------------------------------------------------------------------
Some text that I want to get that
may include line breaks
or special ~!@#0^&*() characters
--------------------------------------------------------------------------------
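
Note that the output shows ~!@#0^&*() rather than ~!@#$%^&*(). Inside a double-quoted string, Perl interpolates $% (the special format page-number variable, which is 0 by default). If the literal characters matter, one option is a single-quoted heredoc, which does no interpolation. This is only a sketch of that variation; the END_TEXT delimiter is an arbitrary choice:

```perl
use strict;
use warnings;

# Single-quoted heredoc: nothing is interpolated, so "$%" stays literal.
my $text = <<'END_TEXT';
====Example 1====
Some text that I want to get that
may include line breaks
or special ~!@#$%^&*() characters

====Example 2====
Some more text that I don't want to get.
END_TEXT

# Same regex as above: list context plus /s.
my ($output) = $text =~ /====Example\s1====\s*(.*?)\s*====/s;
print "$output\n";
```

With this version the last captured line should read "or special ~!@#$%^&*() characters".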

Upvotes: 1

mpapec

Reputation: 50677

Try with parentheses around the variable to force list context, and use /s when matching so . can also match newlines:

my ($output) = $text =~ / /s;
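
As a rough illustration of what /s changes, the sketch below runs the regex from the question with and without the flag. The two-line sample text and the names $without_s and $with_s are invented for the demonstration:

```perl
use strict;
use warnings;

# Hypothetical sample: the wanted section spans two lines.
my $text = "====Example 1====\nline one\nline two\n====Example 2====\n";

# Without /s, "." stops at newlines, so the two-line capture cannot match
# and the list assignment leaves the variable undefined.
my ($without_s) = $text =~ /====Example 1====\s*(.*?)\s*====/;

# With /s, "." also matches "\n" and the capture spans both lines.
my ($with_s) = $text =~ /====Example 1====\s*(.*?)\s*====/s;

print defined $without_s ? "without /s: $without_s\n" : "without /s: no match\n";
print "with /s: $with_s\n";
```

The first print reports "no match", which is the undefined $output from the question; the second prints both captured lines.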

Upvotes: 3
