Sophistifunk

Reputation: 5042

How do I match across newlines in a perl regex?

I'm trying to work out how to match across newlines with perl (from the shell). Given the following:

(echo a b c d e; echo f g h i j; echo l m n o p) | perl -pe 's/(c.*)/[$1]/'

I get this:

a b [c d e]
f g h i j
l m n o p

Which is what I expect. But when I place an /s at the end of my regex, I get this:

a b [c d e
]f g h i j
l m n o p

What I expect and want it to print is this:

a b [c d e
f g h i j
l m n o p
]

Is the problem with my regex somehow, or my perl invocation flags?

Upvotes: 5

Views: 2660

Answers (5)

Ben Deutsch

Reputation: 708

There's More Than One Way To Do It: since you want to read the entire file at once anyway, I'd personally drop the -p switch, slurp the entire input explicitly, and go from there:

echo -e "a b c d e\nf g h i j\nl m n o p" | perl -e '$/ = undef; $_ = <>; s/(c.*)/[$1]/s; print;'

This solution is longer, but it may be a bit more understandable for other readers (which will include you in three months' time ;-D )

Upvotes: 2

Josh Y.

Reputation: 876

-p loops over input line-by-line, where "lines" are separated by $/, the input record separator, which is a newline by default. If you want to slurp all of STDIN into $_ for matching, use -0777.

$ printf 'a b c d e\nf g h i j\nl m n o p\n' | perl -pe 's/(c.*)/[$1]/s'
a b [c d e
]f g h i j
l m n o p
$ printf 'a b c d e\nf g h i j\nl m n o p\n' | perl -0777pe 's/(c.*)/[$1]/s'
a b [c d e
f g h i j
l m n o p
]

See Command Switches in perlrun for information on both those flags. -l (dash-ell) will also be useful.

Upvotes: 12

ikegami

Reputation: 386561

You're reading a line at a time, so how do you think it can possibly match something that spans more than one line?

Add -0777 to redefine "line" to "file" (and don't forget to add /s to make . match newlines).

$ (echo a b c d e; echo f g h i j; echo l m n o p) | perl -0777pe's/(c.*)/[$1]/s'
a b [c d e
f g h i j
l m n o p
]

Upvotes: 1

Tudor Constantin

Reputation: 26871

The problem is that your one-liner works one line at a time; your regex is fine when it is applied to the whole string:

use strict;
use warnings;
use 5.014;

my $s = qq|a b c d e
f g h i j
l m n o p|;

$s =~ s/(c.*)/[$1]/s;

say $s;

Upvotes: 2

Pavel Vlasov

Reputation: 3465

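Your one-liner is actually equivalent to this loop:

while (<>) {
    $_ =~ s/(c.*)/[$1]/s;
    print;  # -p prints $_ after every line
}

This means the regex is applied to one line at a time, so a match can never extend past the end of the current line.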

Upvotes: 1
