nn4l

Reputation: 965

simple multiline sed command does not quite work

I want to match some text that includes line feeds. The command below almost works, but it does not match the first line:

(echo foo; echo foo; echo bar) | sed '1!N; s/foo.*bar/zap\nbaz/'
foo
zap
baz

Same problem here:

(echo foo; echo bar; echo bar) | sed '1!N; s/foo.*bar/zap\nbaz/'
foo
bar
bar

I have found a much more complex sed command that works correctly in both cases, but I would rather fix the simple one (if possible), or at least understand why it does not work:

(echo foo; echo bar; echo bar) | sed -n '1h;1!H;${g;s/foo.*bar/zap\nbaz/p}'
zap
baz

Upvotes: 0

Views: 118

Answers (4)

potong

Reputation: 58391

This might work for you (GNU sed):

sed '/foo/{:a;N;/foo.*bar/!ba;s//zap\nbaz/}' file

If the current line contains foo, append a newline and the next line, then look for foo followed by bar (any number of characters apart, including newlines). If this pattern is found, replace it with zap\nbaz and print the result. If not, loop back to :a and repeat until it is found or the end of the file is reached (in which case the entire pattern space is printed unchanged).
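
As a rough check of the loop described above, the two inputs from the question can be fed through the command. This is only a sketch: it assumes GNU sed (for \n in the replacement and ; after a label), and the function name join_foo_bar is made up for illustration.

```shell
# Assumes GNU sed. join_foo_bar is a hypothetical wrapper, not part of sed.
join_foo_bar() {
    # /foo/      start only on a line containing foo
    # :a;N       append the next input line to the pattern space
    # /foo.*bar/!ba   keep appending until foo...bar is present
    # s//zap\nbaz/    replace the matched span (empty regexp = last regexp used)
    sed '/foo/{:a;N;/foo.*bar/!ba;s//zap\nbaz/}'
}

printf 'foo\nfoo\nbar\n' | join_foo_bar
printf 'foo\nbar\nbar\n' | join_foo_bar
```

With the first input the whole text is replaced by zap and baz. With the second, the loop stops at the first bar, so the trailing bar is printed as a separate line afterwards, which differs from the hold-space command in the question that matches greedily across the entire input.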

N.B. the N command will not allow you to read past the end-of-file and will bail out if you try. The command s//zap\nbaz/ substitutes the current regexp with zap\nbaz, where the current regexp is the last /.../ used, in this case /foo.*bar/.
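
Both points can be tried out with a small sketch. It assumes GNU sed, which prints the pattern space when N runs out of input (other implementations may discard it instead); try_replace is a hypothetical name used only here.

```shell
# Assumes GNU sed. try_replace is a hypothetical wrapper around the command above.
try_replace() {
    sed '/foo/{:a;N;/foo.*bar/!ba;s//zap\nbaz/}'
}

# No bar ever follows foo: the loop keeps appending lines until N
# runs out of input, and the pattern space is printed unchanged.
printf 'foo\nqux\n' | try_replace

# The empty regexp in s// reuses /foo.*bar/, so only "foo bar" is
# replaced and the appended line "next" is left alone.
printf 'foo bar\nnext\n' | try_replace
```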

An alternative without braces:

sed '/foo/!b;:a;N;/foo.*bar/!ba;s//zap\nbaz/' file

Upvotes: 0

Vytenis Bivainis

Reputation: 2376

Here's a workaround

sed 's/$/\\n/' | tr -d '\n' | sed 's/foo.*bar/zap\\nbaz/g' | sed 's/\\n/\n/g'

Upvotes: 0

Ed Morton

Reputation: 203358

sed is simply not the right tool for anything involving multiple lines because it is line-oriented and so is designed to handle one line at a time. All of sed's language constructs for handling multi-line input became obsolete in the late 1970s when awk was invented, because awk is record-oriented instead of line-oriented and so trivially handles newlines within records just like any other character. For example:

$ (echo foo; echo bar; echo bar) |
    awk -v RS= '{sub(/foo.*bar/,"zap\nbaz"); print}'
zap
baz

Any time you find yourself using more than s, g, and p (with -n) in sed or talking about "spaces" you have the wrong approach.

Upvotes: 3

Beta

Reputation: 99094

Your simple approach can hold at most two lines of the text in the pattern space at once, so it can't match a three-line pattern.

In particular:

(echo foo; echo foo; echo bar) | sed '1!N; s/foo.*bar/zap\nbaz/'
foo
zap
baz

It reads the first line (foo); because of the 1! address, N is skipped on line 1, so the pattern space holds only foo. It finds no match and prints foo. Then it reads the second line (foo), appends the next (bar), finds a match, performs the replacement, and prints zap\nbaz.

In the second run:

(echo foo; echo bar; echo bar) | sed '1!N; s/foo.*bar/zap\nbaz/'
foo
bar
bar

It reads the first line (foo), finds no match, and prints foo. Then it reads the second (bar), appends the next (bar), finds no match and prints bar\nbar.
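
One way to see this for yourself is to print the pattern space on every cycle instead of substituting. The sketch below assumes GNU sed; show_cycles is a hypothetical helper name. The l command prints the pattern space unambiguously, with embedded newlines shown as \n and a $ marking the end.

```shell
# Assumes GNU sed. show_cycles is a hypothetical helper for illustration.
show_cycles() {
    # -n suppresses the automatic print; 1!N appends the next line on
    # every cycle except the one that starts at line 1; l dumps what
    # the s command would have been given to work with.
    sed -n '1!N; l'
}

printf 'foo\nfoo\nbar\n' | show_cycles
printf 'foo\nbar\nbar\n' | show_cycles
```

The first input shows foo$ followed by foo\nbar$, and the second shows foo$ followed by bar\nbar$, so the first line is never in the pattern space together with the lines that follow it.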

Upvotes: 0
