Reputation: 3543
The file contents should be like this:
foo
foobar
bar
foo or foobar can exist alone. But bar must be after foo and can't occur without foo. So any occurrences of bar without foo, I want to delete.
For my use case (the issue), bar was always coming before foo once and after foo once.
bar
...
foo
foobar
bar
So I used grep to find the number of occurrences and sed to delete the 1st occurrence of bar if there is more than one bar.
But I was wondering: is it possible with sed or some other tool to actually find the previous occurrence of foo and keep a count? If a bar is not preceded by a foo of its own, delete that bar, as in all the cases below.
1.(delete 1st bar)
bar
...
foo
foobar
bar
2.(delete 2nd bar because even though there is a foo before it, it's already counted for 1st bar)
foo
foobar
bar
...
bar
3.(delete 2nd and 4th bar)
foo
foobar
bar
...
bar
...
foo
bar
...
bar
Upvotes: 0
Views: 1063
Reputation: 180093
is it possible with sed or some other tool to actually find the previous occurrence of foo and keep a count.
It is possible with sed, in principle, to keep a running count of the number of appearances of foo. For example, performing an H command for each foo line would do this, though the resulting count would be in a form that is a bit tricky to use.
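As a rough illustration of why such a count is awkward, the sketch below appends every foo line to the hold space with H and dumps the hold space once the last line is reached. The function name and the sample input are made up for the demonstration.

```shell
# Hypothetical helper: accumulate one hold-space entry per 'foo' line.
count_foo_unary() {
  # H appends a newline plus the current line to the hold space, so the
  # hold space grows by one "\nfoo" for each matching line.
  # On the last line, g copies the hold space into the pattern space and
  # l prints it unambiguously (newlines shown as \n, end marked with $).
  sed -n '/^foo$/H; ${g;l;}'
}

# Sample input (an assumption, not from the question): two 'foo' lines.
printf 'foo\nx\nfoo\nbar\n' | count_foo_unary
```

The count ends up encoded as the number of repeated foo entries in the hold space rather than as a number, which is why it is clumsy to test against.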
But it sounds like you don't really need a count so much as you need a flag to report on whether any foo lines have yet been seen. For that, I would just use an h command. An overall sed program to delete all bar lines that precede the first foo line, leaving all other lines unchanged, could look like this:
# When a 'foo' line is encountered, copy it to the hold space
/^foo$/ h
# If the line is a 'bar' then print or delete it, as appropriate
/^bar$/ {
  # Append a newline and the contents of the hold space to the pattern space
  G
  # If the pattern space (now) ends in `foo`, then print up to the first newline of it
  /foo$/ P
  # Delete the contents of the pattern space and start the next cycle
  d
}
You can put that in a file and use sed's -f option to read the commands from there, or you can put it directly on the command line by removing the comments and joining the lines with semicolon delimiters:
sed '/^foo$/ h; /^bar$/ {G; /foo$/ P; d; }' input > output
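To see what that one-liner does, here is a small sketch that runs it on case 1 from the question. The wrapper function name is hypothetical, and the input is fed through printf instead of a file.

```shell
# Hypothetical wrapper around the hold-space program shown above.
drop_bar_before_first_foo() {
  sed '/^foo$/ h; /^bar$/ {G; /foo$/ P; d; }'
}

# Case 1 from the question: one 'bar' before 'foo' and one after it.
# The leading 'bar' should be dropped and the trailing one kept.
printf 'bar\n...\nfoo\nfoobar\nbar\n' | drop_bar_before_first_foo
```

Note that the hold space is never cleared once a foo has been seen, so this only covers bar lines before the first foo. A second bar after a single foo (case 2 in the question) is still printed.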
Simpler, though, and clearer would be to express what you intend via an address range:
# For all lines from the start of the input through the first foo line
0,/^foo$/ {
  # delete bar lines
  /^bar$/ d
}
Or in a single command:
sed '0,/^foo$/ { /^bar$/ d; }' input > output
However, use of line number 0 as the start of an address range is a GNU sed extension. It definitely works with GNU sed, but the POSIX specification for sed does not provide for it, so other sed implementations may reject it.
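If address 0 is not available, one possible workaround is to negate a range that runs from the first foo line to the end of input. This is only a sketch of an alternative, not part of the approach above, and the function name is hypothetical.

```shell
# Hypothetical helper: same intent as the 0,/^foo$/ range, without address 0.
drop_bar_before_first_foo_portable() {
  # /^foo$/,$ selects everything from the first 'foo' line to the last line.
  # The ! inverts that selection, so the block only runs on lines that come
  # before the first 'foo', and there it deletes 'bar' lines.
  sed '/^foo$/,$!{ /^bar$/d; }'
}

# Case 1 from the question: the leading 'bar' is dropped, the trailing one kept.
printf 'bar\n...\nfoo\nfoobar\nbar\n' | drop_bar_before_first_foo_portable
```

As with the other sed variants, if the input contains no foo line at all, every bar line is deleted.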
Upvotes: 2
Reputation: 5965
awk '/^bar/{if (k) k=0; else next} /^foo/{k=1} 1' file
k means "keep the line"; it expires at the first bar found after foo.
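For readers who want to follow the flag step by step, here is the same awk program spread over several lines and run on case 3 from the question. The function name is hypothetical and the input comes from printf instead of a file.

```shell
# Hypothetical wrapper: the awk one-liner above, expanded with comments.
filter_bar() {
  awk '
    /^bar/ {
      # A pending foo is used up by this bar, so the bar is kept.
      # Without a pending foo the bar is skipped and never printed.
      if (k) { k = 0 } else { next }
    }
    # Any line starting with foo (foobar included) arms the flag.
    /^foo/ { k = 1 }
    # Print every line that reached this point.
    1
  '
}

# Case 3 from the question: the 2nd and 4th 'bar' should be deleted.
printf 'foo\nfoobar\nbar\n...\nbar\n...\nfoo\nbar\n...\nbar\n' | filter_bar
```

Because the patterns are anchored only at the start, /^foo/ also matches foobar and /^bar/ matches any line beginning with bar. Adding a trailing $ to each pattern would restrict them to exact lines, if that is what the data calls for.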
Upvotes: 3