rahuL

Reputation: 3420

Modify next line based on pattern match on previous line

I have a small script that comments out lines in a file by inserting a # in front of any line that matches a pattern. What I can't work out is how to add a # to the line after each match as well. Here's the code I've written so far:

#!/usr/bin/perl

use warnings;

open(FILE, "<extensions.txt") || die "File not found";
my @lines = <FILE>;
close(FILE);

my @newlines;
foreach(@lines) {
   $_ =~ s/\[google\.com/\#\[google\.com/g;
   push(@newlines,$_);
}

open(FILE, ">ext.txt") || die "File not found";
print FILE @newlines;
close(FILE);

So this finds every occurrence of [google.com (in my file, always at the start of a line) and replaces it with #[google.com. I want to comment out the line that follows each match as well.

Here's a sample file:

[google.com]
Once upon a time...

[google.com-out-outnew]
Meanwhile, in the land of ...

[yahoo.com]
Centuries ago, the scion of ....

Once I run the above script, I get:

#[google.com]
Once upon a time...

#[google.com-out-outnew]
Meanwhile, in the land of ...

[yahoo.com]
Centuries ago, the scion of ....

Here's a sample of the output I'm looking for:

#[google.com]
#Once upon a time...

#[google.com-out-outnew]
#Meanwhile, in the land of ...

[yahoo.com]
Centuries ago, the scion of ....

I know it should go after this line $_ =~ s/\[google\.com/\#\[google\.com/g; but what I am confused about is how to modify the next line and then skip it in the loop. Could someone explain how that's done, please?

Upvotes: 0

Views: 385

Answers (3)

TLP

Reputation: 67900

This is a one-liner, which can be done in (at least) two ways:

> perl -pwe'$_ .= "#" . <> if s/(?=\Q[google.com\E)/#/g;' google.txt
#[google.com]
#Once upon a time...

#[google.com-out-outnew]
#Meanwhile, in the land of ...

[yahoo.com]
Centuries ago, the scion of ....

If the substitution succeeds, the next line is read with <>, prefixed with #, and appended to the current line $_. The substitution itself is just a lookahead assertion combined with the quotemeta escape \Q ... \E, so it inserts a # in front of the matched text without replacing any of it.

A minor caveat is that if the string is found at the last line of the file, you will get an uninitialized warning, since the file handle will return undef at eof. Another unhandled edge case is if you get two google lines in a row, but I assumed your format does not allow that.
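
To cover the end-of-file caveat, here is a minimal sketch (a script-style variant, not the one-liner itself) that reads the following line explicitly and only appends it when it is defined. The sub name comment_with_next and the in-memory filehandles are assumptions made for illustration. It does not address the case of two google lines in a row.

```perl
use strict;
use warnings;

# Hypothetical helper: copies lines from $in to $out, commenting out every
# line that contains "[google.com" together with the line that follows it.
sub comment_with_next {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        if ($line =~ s/(?=\Q[google.com\E)/#/g) {
            my $next = <$in>;                     # undef at end of file
            $line .= "#$next" if defined $next;   # avoids the uninitialized warning
        }
        print {$out} $line;
    }
    return;
}

# Usage with in-memory filehandles; the last line matches at end of file
my $input = "[google.com]\nOnce upon a time...\n\n[yahoo.com]\nCenturies ago\n[google.com-out]\n";
open my $in,  '<', \$input     or die $!;
open my $out, '>', \my $output or die $!;
comment_with_next($in, $out);
close $out;
print $output;
```

With real files, you would pass filehandles opened on the input and output files instead of the in-memory ones.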

Another way to handle this would be to use paragraph mode, since it seems that your records are separated by double newlines (an empty line).

perl -00 -lpwe's/^/#/gm if /^\Q[google.com\E/' google.txt

Note that this requires both the /m and /g modifiers: /m lets ^ match at the start of every line inside the record rather than only at the start of the string, and /g applies the substitution at every such position. -00 changes the input record separator to "" (paragraph mode, which splits on one or more empty lines), so a whole record is read into $_. The -l switch removes the trailing newlines of each record before the substitution, which avoids an extra #, and adds the separator \n\n back when the record is printed.
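
For anyone who prefers a script over switches, the paragraph-mode approach can be sketched roughly as follows. This is only an illustration under stated assumptions: the sub name comment_paragraphs is made up, and the text is read through an in-memory filehandle rather than a file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text with every line of a matching
# paragraph commented out. Paragraphs are separated by empty lines.
sub comment_paragraphs {
    my ($text) = @_;
    open my $in, '<', \$text or die $!;
    local $/ = "";                 # paragraph mode, the equivalent of -00
    my $result = '';
    while (my $para = <$in>) {
        chomp $para;               # in paragraph mode, strips all trailing newlines
        $para =~ s/^/#/gm if $para =~ /^\Q[google.com\E/;
        $result .= "$para\n\n";    # put the separator back, as -l does on output
    }
    return $result;
}

print comment_paragraphs("[google.com]\nOnce upon a time...\n\n[yahoo.com]\nCenturies ago...\n");
```

Because chomp runs before the substitution, there is no trailing empty line inside the record for ^ to match, which is why no stray # appears.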

You can run the one-liner as an in-place edit, or redirect the output to a new file:

perl -pi.bak -we ' ...' yourfile.txt       # in-place edit with backup
perl -pwe ' ... ' yourfile.txt > new.txt   # redirect to new file

Upvotes: 2

Dan Dascalescu

Reputation: 152018

Just set a flag recording whether the pattern was found on the current line. On the next iteration, if the flag is set, print the line preceded by a '#', reset the flag, and skip to the next loop iteration.

You can look at the result of the s/// operator, which is the number of substitutions made.
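
As a small illustration of that return value, consider the sketch below. The sub name count_commented is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns how many times "[google.com" was commented
# out in the given line (0 when the pattern does not occur).
sub count_commented {
    my ($line) = @_;
    my $count = ($line =~ s/\[google\.com/#[google.com/g);
    return $count || 0;   # s/// returns an empty string, not 0, when nothing matched
}

print count_commented("[google.com] and [google.com-out]\n"), "\n";   # prints 2
print count_commented("[yahoo.com]\n"), "\n";                         # prints 0
```

Any non-zero count is true, so the result can be stored directly as the flag.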

Here's the code, rewritten according to modern Perl practices, and optimized so you don't need an array.

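#!/usr/bin/perl
use strict;
use warnings;

my $pattern_found;

# Three-argument open with lexical filehandles
open my $file_in,  '<', 'extensions.txt' or die "Cannot read extensions.txt: $!";
open my $file_out, '>', 'ext.txt'        or die "Cannot write ext.txt: $!";

while (<$file_in>) {
   if ($pattern_found) {
       # The previous line matched: comment out this one and clear the flag
       $pattern_found = 0;
       print $file_out "#$_";
       next;
   }
   # s///g returns the number of substitutions made (false if none)
   $pattern_found = $_ =~ s/\[google\.com/\#\[google\.com/g;
   print $file_out $_;
}

close $file_out or die "Cannot close ext.txt: $!";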

Upvotes: 1

Miller

Reputation: 35198

#!/usr/bin/perl

use strict;
use warnings;
use autodie;

my $srcfile = 'extensions.txt';
my $outfile = 'ext.txt';

open my $infh, '<', $srcfile;
open my $outfh, '>', $outfile;

my $comment_next_line = 0;

while (<$infh>) {
    if ($comment_next_line) {
        $comment_next_line = 0;
        s/^/#/;
    } elsif (s/(?=\[google\.com)/#/g) {
        $comment_next_line = 1;
    }

    $outfh->print($_);
}

Upvotes: 0
