Reputation: 283
I have a number of lines that I get from a command's output. They follow this pattern:
payload
constant value(u) constant(u)
payload
constant value(u) constant(u)
payload
In this example, (u) stands for one or more unknown characters.
What I care about is "payload", so I remove the "constant value(u) constant(u)" lines (by keeping every second line) using sed:
sed -n '1~2!p'
Sometimes, however, there is a duplicate "constant value(u) constant(u)" line, which makes sed return all the following "constant value(u) constant(u)" lines instead of the "payload" lines.
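To illustrate the failure mode, here is a minimal sketch. It assumes GNU sed (the `first~step` address is a GNU extension); the sample lines and the helper name `keep_even` are made up for the demonstration.

```shell
# Hypothetical sample data: "after ..." lines play the role of the constant lines.
normal=$(printf '%s\n' 'after 1 hour' 'payload A' 'after 2 hours' 'payload B')
problematic=$(printf '%s\n' 'after 1 hour' 'after 6 hours' 'payload A' 'after 2 hours' 'payload B')

# Keep only the even-numbered lines (1~2 matches lines 1, 3, 5, ...; ! negates it).
keep_even() { sed -n '1~2!p'; }

printf '%s\n' "$normal" | keep_even
# payload A
# payload B

# One extra "after ..." line shifts the parity, so the wrong lines are kept.
printf '%s\n' "$problematic" | keep_even
# after 6 hours
# after 2 hours
```

Because the filter relies only on line numbers, a single extra line flips which half of the input survives.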
I can use a regular expression to remove all "constant value(u) constant(u)" lines:
sed '/^constant.*constant.*$/d'
But the problem is that I need to know that this line was there, even though it's not a "payload" line, so I want to replace the content of the problematic duplicate line with some string. I want to replace only the "problematic" duplicate lines.
So, here is an example input in the normal situation:
after 1 hour
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
after 2 hours
Cras id consequat nisl.
after 2 hours
Etiam non metus eu velit maximus dapibus.
after 1 hour
Etiam a mi quis ante congue posuere.
after 5 hours
Suspendisse et venenatis ipsum, aliquet pharetra tortor.
This is a "problematic" input:
after 1 hour
after 6 hours
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
after 2 hours
Cras id consequat nisl.
after 2 hours
Etiam non metus eu velit maximus dapibus.
after 1 hour
Etiam a mi quis ante congue posuere.
after 5 hours
Suspendisse et venenatis ipsum, aliquet pharetra tortor.
The desired output (in case of the problematic input above) is:
(no information)
after 6 hours
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
after 2 hours
Cras id consequat nisl.
after 2 hours
Etiam non metus eu velit maximus dapibus.
after 1 hour
Etiam a mi quis ante congue posuere.
after 5 hours
Suspendisse et venenatis ipsum, aliquet pharetra tortor.
What is the most efficient way to approach this? I guess I should match the "problematic" lines with a regular expression and replace them with the desired string, but how?
Upvotes: 0
Views: 338
Reputation: 6449
This command will find 2 consecutive lines starting with constant and replace the 2nd one with X:
sed '/^constant.*$/ { N; s/\(^constant.*\n\)constant.*$/\1X/; }'
UPDATE
Based on the additional information you've provided, this should do the trick:
sed '/^after .*$/ { N; s/^after .*\(\nafter .*\)$/(no information)\1/; }'
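As a quick check, the command can be run against a shortened version of the problematic input from the question. This is only a sketch: the function name `mark_duplicate` is hypothetical, the payload lines are abbreviated, and it assumes a sed that accepts `\n` in the pattern (GNU sed does).

```shell
# Wrap the command above so it can be reused in a pipeline.
mark_duplicate() {
  sed '/^after .*$/ { N; s/^after .*\(\nafter .*\)$/(no information)\1/; }'
}

# Two "after ..." lines in a row: the first one is replaced by the marker.
printf '%s\n' 'after 1 hour' 'after 6 hours' 'Lorem ipsum' 'after 2 hours' 'Cras id' |
  mark_duplicate
# (no information)
# after 6 hours
# Lorem ipsum
# after 2 hours
# Cras id
```

The `N` command appends the next line to the pattern space, so the substitution only fires when both lines of the pair start with "after ".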
UPDATE #2
Another solution provided by @potong in the comments:
sed -E '/^after/{N;s/.*(\nafter)/(no information)\1/;P;D}'
This will also work in cases where there are more than 2 "problematic" lines in a row: every line in the run except the last one is replaced with (no information).
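A small sketch of that behaviour on a run of three "after ..." lines, using made-up sample data and a hypothetical wrapper name `mark_runs`:

```shell
# Wrap the P;D variant so it can be reused in a pipeline.
mark_runs() {
  sed -E '/^after/{N;s/.*(\nafter)/(no information)\1/;P;D}'
}

# Three "after ..." lines in a row: the first two are replaced,
# the last one (the line that belongs to the payload) is kept.
printf '%s\n' 'after 1 hour' 'after 6 hours' 'after 3 hours' 'Lorem ipsum' 'after 2 hours' 'Cras id' |
  mark_runs
# (no information)
# (no information)
# after 3 hours
# Lorem ipsum
# after 2 hours
# Cras id
```

`P` prints only the first line of the pattern space and `D` deletes it, then restarts the script on what is left without reading new input. That way the second line of each pair is examined again as the first line of the next pair.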
Upvotes: 2
Reputation: 2794
Are the duplicate lines next to each other? If so, just run the file through uniq first.
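One caveat worth checking before relying on this: uniq only collapses adjacent lines that are identical, and it drops the repeat rather than marking it. A short sketch with abbreviated lines from the question's sample data:

```shell
# Adjacent lines that differ (as in the problematic input) are both kept.
printf '%s\n' 'after 1 hour' 'after 6 hours' 'Lorem ipsum' | uniq
# after 1 hour
# after 6 hours
# Lorem ipsum

# Only identical adjacent lines are collapsed, and the repeat is removed.
printf '%s\n' 'after 2 hours' 'after 2 hours' 'Cras id' | uniq
# after 2 hours
# Cras id
```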
Upvotes: 0