RatDon

Reputation: 3543

sed remove a line matching a pattern if not after another matching pattern

The file contents should be like this:

foo
foobar
bar

foo or foobar can exist alone, but bar must come after a foo and can't occur without one. I want to delete any occurrence of bar that has no foo before it.

In my case (the issue), bar always appeared once before foo and once after foo.

bar
...
foo
foobar
bar

So I used grep to count the occurrences and sed to delete the 1st occurrence of bar if there is more than one bar.

But I was wondering, is it possible with sed or some other tool to actually find the previous occurrence of foo and keep a count. If a bar is not preceded by a foo of its own, delete that bar, as in all the cases below.

1. (delete the 1st bar)

bar
...
foo
foobar
bar

2. (delete the 2nd bar: there is a foo before it, but that foo is already counted for the 1st bar)

foo
foobar
bar
...
bar

3. (delete the 2nd and 4th bar)

foo
foobar
bar
...
bar
...
foo
bar
...
bar

Upvotes: 0

Views: 1063

Answers (2)

John Bollinger

Reputation: 180093

is it possible with sed or some other tool to actually find the previous occurrence of foo and keep a count.

It is possible with sed, in principle, to keep a running count of the number of appearances of foo. For example, performing an H command for each foo line would do this, though the resulting count would be in a form that is a bit tricky to use.
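To illustrate why such a count is awkward, here is a minimal sketch of the H approach. The function name count_foo_unary is made up for this example, and the script assumes a sed that accepts \n in the regular expression of an s command. Each foo line is appended to the hold space, and at the end of input every stored copy is reduced to one x, so the "count" comes out in unary:

```shell
# Hypothetical helper: reads lines on stdin and prints one 'x' per 'foo' line.
count_foo_unary() {
  sed -n '
    /^foo$/ H
    $ {
      g
      s/\nfoo/x/g
      p
    }
  '
}

# Two of these four lines are exactly 'foo', so this prints: xx
printf 'foo\nfoobar\nbar\nfoo\n' | count_foo_unary
```

Turning that string of x characters into a decimal number, or decrementing it per bar, takes further substitutions, which is the tricky part.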

But it sounds like you don't really need a count so much as you need a flag to report on whether any foos have yet been seen. For that, I would just use an h command. An overall sed program to delete all bar lines that precede the first foo line, leaving all other lines unchanged, could look like this:

# When a 'foo' line is encountered, copy it to the hold space
/^foo$/ h
# If this is a 'bar' line then print or delete it, as appropriate
/^bar$/ {
# Append a newline and the contents of the hold space to the pattern space
G
# If the pattern space (now) ends in `foo`, then print up to the first newline of it
/foo$/ P
# delete the contents of the pattern space and start the next cycle
d
}

You can put that in a file and use sed's -f option to read the commands from there, or you can put it directly on the command line by removing the comments and concatenating the lines with semicolon delimiters:

sed '/^foo$/ h; /^bar$/ {G; /foo$/ P; d; }' input > output
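As a quick check of what this program does and does not cover, the sketch below wraps the one-liner in a function (the name drop_bars_before_first_foo is made up here) and feeds it cases 1 and 2 from the question:

```shell
# Hypothetical wrapper around the one-liner above; reads stdin, writes stdout.
drop_bars_before_first_foo() {
  sed '/^foo$/ h; /^bar$/ {G; /foo$/ P; d; }'
}

# Case 1: the leading bar has no foo before it, so it is removed.
printf 'bar\n...\nfoo\nfoobar\nbar\n' | drop_bars_before_first_foo

# Case 2: both bars are kept, because a foo line is still in the hold space.
printf 'foo\nfoobar\nbar\n...\nbar\n' | drop_bars_before_first_foo
```

The first command prints the input without its first line; the second prints its input unchanged. So this handles case 1 only: it removes bar lines that precede the first foo, as stated, and does not pair each bar with its own foo.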

Update

Simpler, though, and clearer would be to express what you intend via an address range:

# For all lines from the start of the input through the first 'foo' line
0,/^foo$/ {
# delete bar lines
/^bar$/ d
}

Or in a single command:

sed '0,/^foo$/ { /^bar$/ d; }' input > output

However, using line number 0 as the start of an address range is a GNU sed extension. It definitely works with GNU sed, but POSIX does not specify it, so other sed implementations may reject it.
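Where the 0,/regexp/ form is unavailable, one possible alternative is to negate a range that runs from the first foo line to the end of input, which needs only standard addresses. This is a sketch rather than a tested-everywhere recipe, and the function name drop_leading_bars is made up for the example:

```shell
# Hypothetical helper: delete 'bar' lines that occur before the first 'foo'.
# The range /^foo$/,$ covers the first 'foo' line through the last line;
# the '!' applies the block to every line outside that range.
drop_leading_bars() {
  sed '/^foo$/,$!{
    /^bar$/d
  }'
}

# Case 1 from the question: only the leading bar is dropped.
printf 'bar\n...\nfoo\nfoobar\nbar\n' | drop_leading_bars
```

If the input contains no foo line at all, the range never starts and every bar line is deleted, which matches the behavior of the 0,/^foo$/ version.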

Upvotes: 2

thanasisp

Reputation: 5965

awk '/^bar/{if (k) k=0; else next} /^foo/{k=1} 1' file

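k is a flag meaning "keep the next bar": each foo line sets it, and the first bar found after that foo clears it, so any further bar is skipped until another foo appears.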
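Note that /^foo/ and /^bar/ are prefix matches, so a foobar line also sets k. If the markers always occupy the whole line, as they do in the question's samples, a variant with exact comparisons might look like the sketch below. The function name pair_bars and the variable name keep are illustrative choices, not part of the one-liner above:

```shell
# Hypothetical helper: keep a 'bar' only if an unused 'foo' precedes it.
pair_bars() {
  awk '
    # An exact foo line arms the flag.
    $0 == "foo" { keep = 1 }
    # A bar line is skipped unless the flag is armed; a kept bar disarms it.
    $0 == "bar" { if (!keep) next; keep = 0 }
    # Everything that reaches this point is printed.
    { print }
  '
}

# Case 3 from the question: the 2nd and 4th bar are removed.
printf 'foo\nfoobar\nbar\n...\nbar\n...\nfoo\nbar\n...\nbar\n' | pair_bars
```

For that input the surviving lines are foo, foobar, bar, ..., ..., foo, bar, and a final "..." line.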

Upvotes: 3
