Dirk

Reputation: 6884

How can I grab multiple lines after a matching line in Perl?

I'm parsing a large file in Perl line-by-line (terminated by \n), but when I reach a certain keyword, say "TARGET", I need to grab all the lines between TARGET and the next completely empty line.

So, given a segment of a file:

Line 1
Line 2
Line 3
Line 4 Target
Line 5 Grab this line
Line 6 Grab this line
\n

It should become:
Line 4 Target
Line 5 Grab this line
Line 6 Grab this line

The reason I'm having trouble is I'm already going through the file line-by-line; how do I change what I delimit by midway through the parsing process?

Upvotes: 11

Views: 31328

Answers (9)

Sumathi Gokul

Reputation: 111

while (<IN>) {
    print OUT if /Target/ .. /^$/;
}

Upvotes: 0

brian d foy

Reputation: 132812

From perlfaq6's answer to How can I pull out lines between two patterns that are themselves on different lines?


You can use Perl's somewhat exotic .. operator (documented in perlop):

perl -ne 'print if /START/ .. /END/' file1 file2 ...

If you wanted text and not lines, you would use

perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...

But if you want nested occurrences of START through END, you'll run up against the problem described in the question in this section on matching balanced text.

Here's another example of using ..:

while (<>) {
    $in_header =   1  .. /^$/;
    $in_body   = /^$/ .. eof;
# now choose between them
} continue {
    $. = 0 if eof;  # fix $.
}
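
As a side note to the range examples above: in scalar context the .. operator returns a sequence number, and the number for the last line of the range has "E0" appended. The following is only a minimal sketch of how that can be used to drop the terminating empty line; the sample text and the in-memory filehandle are assumptions made for illustration, not part of the FAQ.

```perl
use strict;
use warnings;

# Hypothetical sample input; an in-memory filehandle stands in for a real file.
my $text = "Line 3\nLine 4 Target\nLine 5 Grab this line\nLine 6 Grab this line\n\nNope\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @grabbed;
while (my $line = <$fh>) {
    # False outside the range; 1, 2, 3, ... inside it; "<n>E0" on its last line.
    my $seq = ($line =~ /Target/) .. ($line =~ /^$/);
    next unless $seq;          # not inside a Target block
    next if $seq =~ /E0$/;     # skip the empty line that ends the block
    push @grabbed, $line;
}
close $fh;

print @grabbed;
```

If the empty line should be kept, drop the /E0$/ check and the loop behaves like the plain print if /START/ .. /END/ form.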

Upvotes: 4

Greg Bacon

Reputation: 139491

The range operator is ideal for this sort of task:

$ cat try
#! /usr/bin/perl

while (<DATA>) {
  print if /\btarget\b/i .. /^\s*$/
}

__DATA__
Line 1
Line 2
Line 3
Line 4 Target
Line 5 Grab this line
Line 6 Grab this line

Nope
Line 7 Target
Line 8 Yep

Nope again

$ ./try
Line 4 Target
Line 5 Grab this line
Line 6 Grab this line

Line 7 Target
Line 8 Yep

Upvotes: 14

telesphore4

Reputation: 887

use strict;
use warnings;

my $inside = 0;
my $data = '';
while (<DATA>) {
    $inside = 1 if /Target/;
    last if /^$/ and $inside;
    $data .= $_ if $inside;
}

print '[' . $data . ']';

__DATA__
Line 1
Line 2
Line 3
Line 4 Target
Line 5 Grab this line
Line 6 Grab this line

Next Line

Edit to fix the exit condition as per the note below.

Upvotes: 1

Graeme Perrow

Reputation: 57248

If you only want one loop (modifying Dave Hinton's code):

my @grabbed;
my $grabbing = 0;
while (<FILE>) {
    if (/TARGET/ ) {
       $grabbing = 1;
    } elsif( /^$/ ) {
       $grabbing = 0;
    }
    if ($grabbing) {
        push @grabbed, $_;  # $_ holds the current line; @_ is the sub argument list
    }
}

Upvotes: 0

mirod

Reputation: 16161

The short answer: the line delimiter in Perl is $/, so when you hit TARGET, you can set $/ to "\n\n", read the next "line", then set it back to "\n"... et voilà!

Now for the longer one: if you use the English module (which gives sensible names to all of Perl's magic variables), then $/ is called $RS or $INPUT_RECORD_SEPARATOR. If you use IO::Handle, then IO::Handle->input_record_separator("\n\n") will work.

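And if you're doing this as part of a bigger piece of code, don't forget to either localize the change (using local $/ = "\n\n"; in the appropriate scope) or to set $/ back to its original value of "\n". Note that a bare local $/; leaves $/ undefined, which makes the next read slurp the rest of the file.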
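
To make the idea concrete, here is a minimal sketch of switching $/ in the middle of a line-by-line loop. The sample text, the in-memory filehandle, and the @blocks array are assumptions made for illustration; a real script would open its own file.

```perl
use strict;
use warnings;

# Hypothetical sample input; an in-memory filehandle stands in for the real file.
my $text = "Line 1\nLine 4 Target\nLine 5 Grab this line\nLine 6 Grab this line\n\nLine 7 after\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @blocks;
while (my $line = <$fh>) {
    next unless $line =~ /Target/;

    # Change the record separator for one read only; "local" restores
    # the previous value ("\n") when this loop iteration ends.
    local $/ = "\n\n";
    my $rest = <$fh>;
    $rest = '' unless defined $rest;   # Target was the last line of the file

    push @blocks, $line . $rest;       # Target line plus everything up to the empty line
}
close $fh;

print @blocks;
```

One caveat with this approach: if the line directly after the Target line is itself empty, the "\n\n" separator is not seen at that position and the read continues to the next empty line, so a flag-based or range-operator loop may be safer for such input.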

Upvotes: 10

C. K. Young

Reputation: 223023

If you don't mind ugly auto-generated code, and assuming you just want lines between TARGET and the next empty line, and want all the other lines to be dropped, you can use the output of this command:

s2p -ne '/TARGET/,/^$/p'

(Yes, this is a hint that this problem is usually much more easily solved in sed. :-P)

Upvotes: 0

user105033

Reputation: 19568

my $buffer = '';  # collects the target line and everything up to the empty line
while(<FILE>)
{
    if (/target/i)
    {
        $buffer .= $_;
        while(<FILE>)
        {
            $buffer .= $_;
            last if /^\n$/;
        }
    }
}

Upvotes: 2

dave4420

Reputation: 47052

You want something like this:

my @grabbed;
while (<FILE>) {
    if (/TARGET/) {
        push @grabbed, $_;
        while (<FILE>) {
            last if /^$/;
            push @grabbed, $_;
        }
    }
}

Upvotes: 23
