Population Xplosive

Reputation: 617

Delete n1 previous lines and n2 lines following with respect to a line containing a pattern

sed -e '/XXXX/,+4d' fv.out

I have to find a particular pattern in a file and delete the 5 lines above it and the 4 lines below it in one pass. I found that the command above removes the line containing the pattern and the four lines below it.

sed -e '/XXXX/,~5d' fv.out

The sed manual seemed to say that ~ selects the lines that precede the pattern, but when I tried it, the lines following the pattern were deleted.

So, how do I delete 5 lines above and 4 lines below a line containing the pattern simultaneously?
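
For reference, here is a minimal sketch of how GNU sed treats the two address forms used in the commands above. It assumes GNU sed and seq are available; the 12-line sample input with XXXX on line 7 is made up for illustration.

```shell
# Sample input: 12 numbered lines, with the marker XXXX appended to line 7.
sample=$(seq 1 12 | sed '7s/$/ XXXX/')

# addr1,+N selects the matching line plus the N lines after it
# (lines 7 through 11 here).
plus=$(printf '%s\n' "$sample" | sed '/XXXX/,+4d')

# addr1,~N selects from the matching line through the next line whose
# number is a multiple of N (lines 7 through 10 here, since 10 is the
# next multiple of 5). It never reaches back to earlier lines.
tilde=$(printf '%s\n' "$sample" | sed '/XXXX/,~5d')

printf 'with +4:\n%s\n\nwith ~5:\n%s\n' "$plus" "$tilde"
```

Neither form can select lines before the match, which is why the answers below buffer lines instead.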

Upvotes: 8

Views: 2732

Answers (5)

Robbie Clarken

Reputation: 311

If you are happy to output the result to a file instead of stdout, vim can do it quite efficiently:

vim -c 'g/pattern/-5,+4d' -c 'w! outfile|q!' infile

or

vim -c 'g/pattern/-5,+4d' -c 'x' infile

to edit the file in-place.

Upvotes: 1

Birei

Reputation: 36282

One way using sed, assuming that lines matching the pattern are not close to each other:

Content of script.sed:

## If line doesn't match the pattern...
/pattern/ ! { 

    ## Append line to 'hold space'.
    H   

    ## Copy content of 'hold space' to 'pattern space' to work with it.
    g   

    ## If there are more than 5 lines saved, print and remove the first
    ## one. It's like a FIFO.
    /\(\n[^\n]*\)\{6\}/ {

        ## Delete the first '\n' automatically added by previous 'H' command.
        s/^\n//
        ## Print until first '\n'.
        P   
        ## Delete data printed just before.
        s/[^\n]*//
        ## Save updated content to 'hold space'.
        h   
    } 

### Added to fix an error pointed out by potong in comments.
### =======================================================
    ## If last line, print lines left in 'hold space'.
    $ { 
        x   
        s/^\n//
        p   
    } 
### =======================================================


    ## Read next line.
    b   
}

## If line matches the pattern...
/pattern/ {

    ## Remove all content of 'hold space'. It has the five previous
    ## lines, which won't be printed.
    x   
    s/^.*$//
    x   

    ## Read next four lines and append them to 'pattern space'.
    N ; N ; N ; N 

    ## Delete all.
    s/^.*$//
}

Run like:

sed -nf script.sed infile

Upvotes: 5

potong

Reputation: 58478

This might work for you:

sed 'H;$!d;g;s/\([^\n]*\n\)\{5\}[^\n]*PATTERN\([^\n]*\n\)\{5\}//g;s/.//' file

or this:

awk --posix -vORS='' -vRS='([^\n]*\n){5}[^\n]*PATTERN([^\n]*\n){5}' 1 file

A more efficient sed solution:

sed ':a;/PATTERN/,+4d;/\([^\n]*\n\)\{5\}/{P;D};$q;N;ba' file

Upvotes: 1

jfg956

Reputation: 16748

A solution using awk:

awk '$0 ~ "XXXX" { lines2del = 5; nlines = 0; }
     nlines == 5 { print lines[NR%5]; nlines-- }
     lines2del == 0 { lines[NR%5] = $0; nlines++ }
     lines2del > 0 { lines2del-- }
     END { while (nlines-- > 0)  { print lines[(NR - nlines) % 5] } }' fv.out

Update:

This is the script explained:

  • I keep up to the last 5 lines in the array lines, using rotating indexes (NR%5; NR is the record number, which here is the line number).
  • If the current line matches the pattern ($0 ~ "XXXX"; $0 is the current record, here a line, and ~ is the Extended Regular Expression match operator), I reset the number of buffered lines and note that I have 5 lines to delete (including the current line).
  • If 5 lines are already buffered, I print the oldest one, lines[NR%5], which the current line is about to overwrite.
  • If I have no lines to delete (which is also the case right after a buffered line was printed), I put the current line in the buffer and increment the number of buffered lines. Note how the count is decremented and then incremented again when a line is printed.
  • If lines need to be deleted, I do not print anything and decrement the number of lines to delete.
  • At the end of the script, I print all the lines that remain in the array.
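
The same buffering idea can be sketched with the two counts as parameters. This is only an illustration: the helper name delete_around and its arguments are hypothetical, and it uses a head/tail queue instead of the NR%5 indexes so that any number of preceding lines (including 0) works.

```shell
# Hypothetical helper: delete BEFORE lines above and AFTER lines below each match.
# usage: delete_around PATTERN BEFORE AFTER [FILE...]
delete_around() {
    da_pat=$1 da_before=$2 da_after=$3
    shift 3
    awk -v pat="$da_pat" -v before="$da_before" -v after="$da_after" '
        BEGIN { head = 0; tail = 0 }      # held[head..tail-1] are the unprinted lines
        $0 ~ pat {                        # match: forget the held lines ...
            while (head < tail) delete held[head++]
            skip = after + 1              # ... and skip this line plus AFTER more
        }
        skip > 0 { skip--; next }
        {
            held[tail++] = $0             # hold the current line back
            if (tail - head > before) {   # more than BEFORE held: oldest one is safe
                print held[head]
                delete held[head++]
            }
        }
        END { while (head < tail) print held[head++] }
    ' "$@"
}

# Sample run on made-up input: XXXX sits on line 10 of 20.
seq 1 20 | sed '10s/$/ XXXX/' | delete_around XXXX 5 4
```

On this sample input, lines 5 through 14 are dropped. PATTERN is treated as an Extended Regular Expression, and a new match inside the skipped lines restarts the skip count.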

My original version of the script was the following, but I ended up optimizing it to the above version:

awk '$0 ~ "XXXX" { lines2del = 5; nlines = 0; }
     lines2del == 0 && nlines == 5 { print lines[NR%5]; lines[NR%5] = $0 }
     lines2del == 0 && nlines < 5 { lines[NR%5] = $0; nlines++ }
     lines2del > 0 { lines2del-- }
     END { while (nlines-- > 0)  { print lines[(NR - nlines) % 5] } }' fv.out

awk is a great tool! I strongly recommend that you find a tutorial on the net and read it. One important thing: awk works with Extended Regular Expressions (ERE). Their syntax is a little different from the Basic Regular Expressions (BRE) that sed uses by default, but nearly everything that can be done with BRE can be done with ERE.
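
As a small illustration of the syntax difference, the sketch below matches "a, one or more b, then c" in both dialects. The sample input is made up, and the -E switch is assumed to be supported by the installed sed (it is in GNU sed).

```shell
# BRE (sed default): + is an ordinary character, so "one or more" is \{1,\}.
bre=$(printf 'ac\nabbc\n' | sed -n '/^ab\{1,\}c$/p')

# ERE (awk): + means "one or more" without any backslash.
ere=$(printf 'ac\nabbc\n' | awk '/^ab+c$/')

# sed can also be asked to use ERE with -E.
sed_ere=$(printf 'ac\nabbc\n' | sed -E -n '/^ab+c$/p')

printf 'BRE: %s\nERE: %s\nsed -E: %s\n' "$bre" "$ere" "$sed_ere"
```

All three commands select only the line abbc.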

Upvotes: 2

jfg956

Reputation: 16748

The idea is to read 5 lines without printing them. If you find the pattern, delete the unprinted lines and the 4 lines below. If you do not find the pattern, remember the current line and print the first unprinted line. At the end, print whatever is still unprinted.

sed -n -e '/XXXX/,+4{x;s/.*//;x;d}' -e '1,5H' -e '6,${H;g;s/\n//;P;s/[^\n]*//;h}' -e '${g;s/\n//;p;d}' fv.out

Of course, this only works if you have one occurrence of your pattern in the file. If you have many, you need to read 5 new lines after finding your pattern, and it gets complicated if you again have your pattern in those lines. In this case, I think sed is not the right tool.
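
If the input is a regular file rather than a stream, one way around the multiple-occurrence problem is to read the file twice with awk. This is a different technique from the sed command above, sketched under the assumption that reading the file twice is acceptable; the helper name delete_around_all and its arguments are hypothetical.

```shell
# Hypothetical helper: handles several, even overlapping, matches.
# usage: delete_around_all PATTERN BEFORE AFTER FILE
delete_around_all() {
    awk -v pat="$1" -v before="$2" -v after="$3" '
        # Pass 1 (NR == FNR only while reading the first copy):
        # mark every line number that falls inside a deletion window.
        NR == FNR {
            if ($0 ~ pat)
                for (i = FNR - before; i <= FNR + after; i++)
                    drop[i] = 1
            next
        }
        # Pass 2: print only the lines that were not marked.
        !(FNR in drop)
    ' "$4" "$4"
}
```

For example, delete_around_all XXXX 5 4 fv.out writes the filtered file to stdout. Because the windows are marked by line number, a second match inside the window of the first one simply extends the deleted range.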

Upvotes: 1
