juliushibert

Reputation: 414

Delete lines before and after a match in bash (with sed or awk)?

I'm trying to delete two lines on either side of a pattern match from a file full of transactions, i.e. find the match, delete the two lines before it, delete the two lines after it, and delete the match itself. Then write the result back to the original file.

So the input data is

D28/10/2011
T-3.48
PINITIAL BALANCE
M
^

and my command so far is

sed -i '/PINITIAL BALANCE/,+2d' test.txt

However, this only deletes the pattern match and the two lines after it. I can't work out any logical way to delete all 5 lines of data from the original file using sed.

Upvotes: 10

Views: 23146

Answers (6)

user15606443

A simpler, easier-to-understand solution might be:

awk '/PINITIAL BALANCE/ {print NR-2 "," NR+2 "d"}' input_filename \
    | sed -f - input_filename > output_filename

awk generates a sed script that deletes the lines in question, and sed applies it to the same input and writes the result to output_filename.

This uses two processes, which might be less efficient than the other answers.
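To see what the pipeline does, it can help to look at the generated sed script by itself. A minimal sketch, assuming GNU sed and a throwaway sample file (the make_script helper name and the sample contents are made up for illustration). It also clamps the start address to 1, because a match on line 1 or 2 would otherwise yield an address like 0,3d or -1,3d, which sed rejects:

```shell
# Hypothetical sample file: one record from the question between two lines to keep.
sample=$(mktemp)
printf '%s\n' 'keep 1' 'D28/10/2011' 'T-3.48' 'PINITIAL BALANCE' 'M' '^' 'keep 2' > "$sample"

# Step 1: awk emits one sed "delete range" command per matching line.
# The start of the range is clamped to 1 so that a match near the top
# of the file does not produce an invalid address.
make_script() {
    awk '/PINITIAL BALANCE/ {
        start = NR - 2
        if (start < 1) start = 1
        print start "," NR + 2 "d"
    }' "$1"
}

script=$(make_script "$sample")
echo "generated script: $script"

# Step 2: sed runs the generated script against the same file.
result=$(sed -e "$script" "$sample")
echo "$result"

rm -f "$sample"
```

For this sample the generated script is the single command 2,6d, so only the two "keep" lines survive.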

Upvotes: 6

choroba

Reputation: 241868

For such a task, I would probably reach for a more advanced tool like Perl:

perl -ne 'push @x, $_;
          if (@x > 4) {
              if ($x[2] =~ /PINITIAL BALANCE/) { undef @x }
                  else { print shift @x }
          }
          END { print @x }' input-file > output-file

This removes 5 lines from the input file: the 2 lines before the match, the matched line, and the 2 lines after it. To change the total number of lines removed, modify @x > 4 (which removes 5 lines). To change which line of the window is tested, modify $x[2] (which tests the third line of the window, so the two lines before the match are removed as well).
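The same sliding-window idea can be sketched in awk with the window shape passed as parameters instead of hard-coded numbers. This is only an illustration, not something the answer itself provides: drop_around is a hypothetical helper and the sample file is made up. Like the Perl version, it only tests a line once the window is full, so a match with fewer than "before" lines above it or fewer than "after" lines below it is left in place.

```shell
# drop_around BEFORE AFTER PATTERN FILE   (hypothetical helper)
# Removes each matching line plus BEFORE lines above and AFTER lines below it.
drop_around() {
    awk -v before="$1" -v after="$2" -v pat="$3" '
    {
        buf[++n] = $0                          # append the line to the window
        if (n == before + after + 1) {         # the window is full
            if (buf[before + 1] ~ pat) {
                n = 0                          # match in the target slot: discard the window
            } else {
                print buf[1]                   # otherwise emit the oldest line ...
                for (i = 1; i < n; i++)        # ... and shift the rest down by one
                    buf[i] = buf[i + 1]
                n--
            }
        }
    }
    END { for (i = 1; i <= n; i++) print buf[i] }   # flush whatever is still buffered
    ' "$4"
}

# Made-up sample data.
sample=$(mktemp)
printf '%s\n' 'a' 'b' 'PINITIAL BALANCE' 'c' 'd' 'e' 'f' > "$sample"

two_two=$(drop_around 2 2 'PINITIAL BALANCE' "$sample")     # 2 before, 2 after
one_three=$(drop_around 1 3 'PINITIAL BALANCE' "$sample")   # 1 before, 3 after
printf '%s\n---\n%s\n' "$two_two" "$one_three"

rm -f "$sample"
```

With 2 and 2 the lines a, b, match, c, d are dropped, leaving e and f; with 1 and 3 the lines b through e are dropped, leaving a and f.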

Upvotes: 2

rush

Reputation: 2564

sed will do it:

sed '/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D'

It works this way:

  • if the pattern space holds only one line, sed appends the next one
  • if it holds only two lines, sed appends a third
  • if the pattern space matches LINE + LINE + LINE with PINITIAL BALANCE in the third line, sed appends the next two lines, deletes everything, and starts over from the beginning
  • if not, it prints the first line of the pattern space, deletes it, and starts over from the beginning without wiping the rest of the pattern space
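Running the one-liner on a small sample shows the effect of those steps. A sketch assuming GNU sed; the sample lines are made up:

```shell
# Made-up sample: one record to remove, surrounded by two lines to keep.
sample=$(mktemp)
printf '%s\n' 'foo' 'D28/10/2011' 'T-3.48' 'PINITIAL BALANCE' 'M' 'x' 'bar' > "$sample"

# The pattern space is topped up to three lines; when the third line
# matches, two more lines are pulled in and all five are deleted.
result=$(sed '/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D' "$sample")
echo "$result"

rm -f "$sample"
```

Only foo and bar remain: the two lines before the match are still sitting in the pattern space when the match arrives, so they are deleted together with it.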

To handle the case where the pattern appears on the first line, modify the script:

sed '1{/PINITIAL BALANCE/{N;N;d}};/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D'

However, it fails if another PINITIAL BALANCE appears among the lines that are about to be deleted. Then again, the other solutions fail too =)

Upvotes: 5

potong

Reputation: 58420

This might work for you (GNU sed):

sed ':a;$q;N;s/\n/&/2;Ta;/\nPINITIAL BALANCE$/!{P;D};$q;N;$q;N;d' file

Upvotes: 1

alinsoar

Reputation: 15793

Save this code into a file named grep.sed:

H
s:.*::
x
s:^\n::
:r
/PINITIAL BALANCE/ {
    N
    N
    d    
}

/.*\n.*\n/ {
    P
    D
}
x
d

and run a command like this:

sed -i -f grep.sed FILE

You can also use it as a one-liner:

sed -i 'H;s:.*::;x;s:^\n::;:r;/PINITIAL BALANCE/{N;N;d;};/.*\n.*\n/{P;D;};x;d' FILE

Upvotes: 0

Kent

Reputation: 195059

An awk one-liner may do the job:

awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}{a[NR]=$0}END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' file

test:

kent$  cat file
######
foo
D28/10/2011
T-3.48
PINITIAL BALANCE
M
x
bar
######
this line will be kept
here
comes
PINITIAL BALANCE
again
blah
this line will be kept too
########

kent$  awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}{a[NR]=$0}END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' file
######
foo
bar
######
this line will be kept
this line will be kept too
########

Some explanation:

  awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}   # on a match, record the line numbers NR-2 to NR+2 in array "d"
      {a[NR]=$0}                                          # save every line in array "a", indexed by line number
      END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}      # finally, print only the lines whose number is not in "d"
      ' file                                              # your input file
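If holding the whole file in memory is a concern, a two-pass variant of the same idea reads the file twice instead: the first pass records the line numbers to drop, the second prints the rest. A sketch only, assuming a regular file (not a pipe) and using a temporary file to write the result back in place, as the question asks; the sample contents are made up:

```shell
# Made-up sample file.
file=$(mktemp)
printf '%s\n' 'foo' 'D28/10/2011' 'T-3.48' 'PINITIAL BALANCE' 'M' 'x' 'bar' > "$file"

tmp=$(mktemp)
# Pass 1 (NR == FNR, i.e. still reading the first copy): mark lines to drop.
# Pass 2: print only the lines whose number was not marked.
awk 'NR == FNR {
         if (/PINITIAL BALANCE/)
             for (i = FNR - 2; i <= FNR + 2; i++) drop[i]
         next
     }
     !(FNR in drop)' "$file" "$file" > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
echo "$result"

rm -f "$file"
```

Because the lines are marked by number, overlapping matches are handled as well: every line within two of any match is dropped.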

Upvotes: 9
