Qiang Li

Reputation: 10855

how to replace a block of text after a match in sed

In sed, I'd like to replace a multi-line block of text around a match. For example, after matching "foo", call its line number 0. I want to replace the text block from line -3 to line +5, i.e. the block between the third line preceding the matching line and the fifth line after it, with another text block, bar1\nbar2. I'd like to be able to do this in two scenarios:

1) keep the matching line after the replaced block; 2) remove the matching line together with lines -3 to +5.

Please help me.

Thank you.

Upvotes: 2

Views: 2614

Answers (3)

potong

Reputation: 58381

This might work (GNU sed):

seq 31|sed 's/5/& match/' >/tmp/file
sed ':a;$q;N;s/\n/&/3;Ta;/match/!{P;D};:b;$bc;N;s/\n/&/8;Tb;:c;s/.*/bar1\nbar2/' /tmp/file
1
bar1
bar2
11
bar1
bar2
21
bar1
bar2
31
sed ':a;$q;N;s/\n/&/3;Ta;/match/!{P;D};h;s/\([^\n]*\n\)*\([^\n]*match[^\n]*\).*/\2/;x;:b;$bc;N;s/\n/&/8;Tb;:c;s/.*/bar1\nbar2/;G' /tmp/file
1
bar1
bar2
5 match
11
bar1
bar2
15 match
21
bar1
bar2
25 match
31

Explanation:

The commands fall into two halves:

  1. The first half keeps a moving window holding the current line plus the 3 lines before it.
  2. Once a match is found, 5 further lines are appended.
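
As a rough sketch of how the two halves relate, the counts can be pulled out into parameters. The helper name replace_around and its argument order are my own assumptions, not part of the answer. It assumes GNU sed and counts of at least 1, and reuses the first one-liner above unchanged apart from the substituted numbers:

```shell
# replace_around: hypothetical wrapper around the first one-liner above.
#   $1 = marker regex, $2 = lines before the match, $3 = lines after it.
# Assumes GNU sed (T command, \n in s///) and both counts >= 1.
replace_around() {
  total=$(( $2 + $3 ))   # newline count once the whole block is collected
  sed ':a;$q;N;s/\n/&/'"$2"';Ta;/'"$1"'/!{P;D};:b;$bc;N;s/\n/&/'"$total"';Tb;:c;s/.*/bar1\nbar2/'
}

# Same test data as above: lines 5, 15 and 25 carry the marker.
result=$(seq 31 | sed 's/5/& match/' | replace_around match 3 5)
printf '%s\n' "$result"
```

With 3 and 5 as the counts, this should give the same output as the first run shown above.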

The details are as follows:

  • :a is a label marking the start of the first loop
  • $q at end-of-file, quit, printing whatever is in the pattern space (PS).
  • N append the next line to the PS
  • s/\n/&/3 replace the 3rd newline character with itself. This is a counting device for checking that the 3 preceding lines plus the current line are in the PS.
  • Ta if the previous substitution failed, branch back to the label a
  • /match/!{P;D} look for the match; if it is absent, print up to the first newline, then delete that line and its newline (this starts a new cycle without reading fresh input).
  • :b is a label marking the start of the second loop. N.B. a match has been found at this point.
  • $bc at end-of-file, branch forward to the label c
  • N append the next line to the PS
  • s/\n/&/8 replace the 8th (3 before + 5 after) newline character with itself. This is a counting device for checking that 5 lines have been appended to the PS.
  • Tb if the previous substitution failed, branch back to the label b
  • :c is a label
  • s/.*/bar1\nbar2/ replace the PS with the required string.
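
The counting device can be seen in isolation with a small sketch (GNU sed assumed, as in the answer; the variable name window is only illustrative). It gathers lines until the pattern space contains three newlines, then prints the window and quits:

```shell
# s/\n/&/3 only succeeds once a 3rd newline exists, so T keeps looping
# back to :a (appending lines with N) until 4 lines have been collected.
window=$(seq 10 | sed -n ':a;N;s/\n/&/3;Ta;p;q')
printf '%s\n' "$window"
```

The substitution itself changes nothing. It is used purely for its success or failure, which T then tests.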

The second one-liner saves a copy of the matching line in the hold space (h, then an s command that isolates the line, then x) and appends it to the substituted string with G.
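
A cut-down sketch of that hold-space step, assuming GNU sed and a hypothetical three-line input that is simply collected with N;N rather than with the window loop:

```shell
# h   copy the collected block to the hold space
# s   reduce the pattern space to just the line containing "match"
# x   swap: hold space = match line, pattern space = full block again
# s   replace the whole block; G then appends the saved match line
kept_line=$(printf 'a\nb match\nc\n' |
  sed -n 'N;N;h;s/\([^\n]*\n\)*\([^\n]*match[^\n]*\).*/\2/;x;s/.*/bar1\nbar2/;G;p')
printf '%s\n' "$kept_line"
```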

Alternative solutions:

sed -r ':a;$!N;s/[^\n]*/&/9;$!Ta;/^([^\n]*\n){3}([^\n]*match[^\n]*)\n.*/!{P;D};c\bar1\nbar2' file

sed -r ':a;$!N;s/[^\n]+/&/9;$!Ta;/^([^\n]*\n){3}([^\n]*match[^\n]*)\n.*/!{P;D};s//\bar1\nbar2\n\2/' file

Upvotes: 2

lynxlynxlynx

Reputation: 1433

Use N multiple times to read the eight lines, and then you can match them as if they were concatenated. sed will recognise \n in the pattern, so it is easy to work on individual parts (lines).

Example:

$ echo '1
2 oooh
3
4
match
5
6
7
8
9 oooh
10 ' | sed ': label; N; s/[^\n]*\n[^\n]*\n[^\n]*\nmatch\n[^\n]*\n[^\n]*\n[^\n]*\n[^\n]*\n[^\n]*\n/bar1\nbar2\n/; T label'

The T command branches back to the label until a substitution succeeds, so sed keeps reading until the block has been replaced. Since you probably have more than one block to catch, change the T to b so that it always branches, if that doesn't already happen automatically.

A shorter form, as requested:

echo '1
2 oooh
3
4
match
5
6
7
8
9 oooh
10 ' | sed ': label; N; s/\([^\n]*\n\)\{3\}match\n\([^\n]*\n\)\{5\}/bar1\nbar2\n/; T label'

First we define a self-documenting sed label called "label". It lets us jump to another point in the script; think of it as the target of a "goto" statement. Since it is at the start, jumping there repeats all the sed commands. It is really there for a single purpose: N, which reads the next line and appends it to the pattern space. This is repeated over and over again, so we can collect the context lines you want to check (and delete) and run a single regex over them.

That is the job of the following s command, which first looks for 3 repetitions (\{3\}) of the preceding pattern group (\([^\n]*\n\)), which matches any line. Then it checks the next line for the marker string you're looking for (match in this example), followed by 5 more lines. If this multi-line pattern matches, the substitution is made and the job is almost finished. We need the loop because otherwise the whole expression would be run for each line individually, reading ahead all the time, instead of doing what we want: reading the lines in a batch.
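
As a hedged sketch, the same grouped pattern can cover both scenarios from the question. The sample input and the variable names are assumptions for illustration, and GNU sed is assumed for T and for \n inside brackets. Both replacements end in \n because the pattern consumes the newline after the fifth trailing line:

```shell
# Hypothetical sample input: the marker line "match" is line 5 of 12.
input=$(printf '%s\n' 1 2 3 4 match 6 7 8 9 10 11 12)

# Scenario 2: drop the matching line along with its context.
dropped=$(printf '%s\n' "$input" |
  sed ':label; N; s/\([^\n]*\n\)\{3\}match\n\([^\n]*\n\)\{5\}/bar1\nbar2\n/; T label')

# Scenario 1: capture the matching line as group 2 and emit it after the new block.
kept=$(printf '%s\n' "$input" |
  sed ':label; N; s/\([^\n]*\n\)\{3\}\(match\n\)\([^\n]*\n\)\{5\}/bar1\nbar2\n\2/; T label')

printf '%s\n---\n%s\n' "$dropped" "$kept"
```

In both cases lines 2 to 10 are replaced. The second variant additionally prints the line match after bar2.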

Upvotes: 2

Birei

Reputation: 36262

One way using GNU sed for your second scenario, although it seems a bit complex (it's fully commented):

Assuming infile has the following content:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

And content of script.sed:

## From first line until a line that matches the pattern (number ten in 
## this example), save lines in buffer and print each one when there are
## more than three lines between one of them and the line with the pattern
## to search.
0,/10/ {

        ## Mark 'a'
        :a

        ## If line matches the pattern break this loop.
        /10/ {
                bb
        }

        ## Until the pattern matches, if more than three lines (checking '\n') are
        ## saved, print the oldest one and delete it, because I only want to save last
        ## three.
        /\(\n[^\n]*\)\{3\}/ {
                P
                D
        }

        ## Append next line to pattern space and goto mark 'a' in a loop.
        N
        ba
}

## This branch should never be reached (I think), but it's a sanity check to
## skip the following mark 'b'.
bc

## We get here once the line with the pattern has been found, so read the next
## six lines and delete all of them but the sixth. If end of file is found in
## this process none of them will be printed, so it seems ok.
:b
N;N;N;N;N
N
s/^.*\n//

## We get here after deleting both the '-3' and '+5' lines around the matched
## pattern, so all that is left is to print the remainder of the file in a loop.
:c
p
N
s/^.*\n//
bc

Run it as follows, bearing in mind that 10 is the pattern on both the fifth and eleventh lines of the script; change it to suit your needs. In my example it should delete lines 7, 8, 9, 10, 11, 12, 13, 14 and 15:

sed -nf script.sed infile

With the following output:

1
2
3
4
5
6
16
17
18
19
20
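
To reproduce this run without creating the files by hand, something like the following could be used. The mktemp paths and the variable names are my own assumptions, and the script is the one above with its comments stripped (GNU sed assumed for the 0,/10/ address):

```shell
# Temporary copies of the two files (any writable location works).
infile=$(mktemp)
script=$(mktemp)
seq 20 > "$infile"

# script.sed from above, comments removed.
cat > "$script" <<'EOF'
0,/10/ {
        :a
        /10/ {
                bb
        }
        /\(\n[^\n]*\)\{3\}/ {
                P
                D
        }
        N
        ba
}
bc
:b
N;N;N;N;N
N
s/^.*\n//
:c
p
N
s/^.*\n//
bc
EOF

output=$(sed -nf "$script" "$infile")
printf '%s\n' "$output"
rm -f "$infile" "$script"
```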

Upvotes: 0
