Reputation: 870
I have a file whose lines follow a sequence like this:
172.18.0.7
172.18.0.9
172.18.0.8
172.18.0.7
172.18.0.9
172.18.0.8
172.18.0.7
172.18.0.9
172.18.0.8
172.18.0.7
172.18.0.9
172.18.0.8
So the last octet of each line goes 7->9->8->7->9->8->7->9->8->7->9->8->7->9->8 ...
I want to get the lines where this sequence is different, e.g. 7->8->9:
172.18.0.7
172.18.0.8
172.18.0.9
As the file has about 100000 lines, I'd like to use grep to filter them.
I tried something like this:
grep -Pzl "172.18.0.7*\n 172.18.0.9*\n 172.18.0.8*\n"
which did not work properly. I wanted to find a pattern that matches the sequence shown first.
Upvotes: 0
Views: 83
Reputation: 20002
GNU sed (4.2.2 and later) supports -z:
sed -z 's/172.18.0.7\n172.18.0.9\n172.18.0.8\n//g' file
This solution will fail when the first line of a potential set of three looks like
some_other_chars_before_172.18.0.7
When you add \n at the beginning of the match, you need to remove the last \n so that two sets with nothing in between are still found, but that would allow the last line to end with
172.18.0.8_and_more_characters
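It looks like a deadlock, but you can temporarily convert the input to Windows-style line endings (\r\n), so that \r marks the end of a line and \n the start of the next one, and match on both: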
sed -rz 's/\n/\r\n/g;s/(\n|^)172.18.0.7\r\n172.18.0.9\r\n172.18.0.8\r//g;s/\r//g' file
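To see what the three substitutions do, here is a small sketch that runs the same command on a made-up nine-line sample. The function name strip_triples is only an illustration, the dots are escaped here so that they match literally, and GNU sed is assumed for -r and -z.

```shell
# Hypothetical helper wrapping the command above (assumes GNU sed for -r and -z).
# Step 1: turn every \n into \r\n, so \r marks a line end and \n a line start.
# Step 2: delete each complete 7/9/8 set that starts at a line start.
# Step 3: drop the remaining \r characters again.
strip_triples() {
  sed -rz 's/\n/\r\n/g;s/(\n|^)172\.18\.0\.7\r\n172\.18\.0\.9\r\n172\.18\.0\.8\r//g;s/\r//g' "$@"
}

# Made-up sample: a regular set, an irregular 7/8/9 set, another regular set.
printf '%s\n' 172.18.0.7 172.18.0.9 172.18.0.8 \
              172.18.0.7 172.18.0.8 172.18.0.9 \
              172.18.0.7 172.18.0.9 172.18.0.8 | strip_triples
```

Only the irregular set should survive. Empty lines can remain where sets were removed; piping the result through grep . drops them if they are unwanted.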
Upvotes: 0
Reputation: 13249
Using GNU awk:
awk -v RS='\n*[0-9.]+7\n[0-9.]+9\n[0-9.]+8\n' NF file
The record separator RS is set to a regular expression that matches three consecutive lines consisting of digits and dots and ending in 7, 9, and 8 respectively (in this order).
Since the output record separator ORS keeps its default value \n, the script (just NF) prints every non-empty record, that is, the lines that are not matched by RS.
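As a quick check, the command can be tried on a made-up nine-line sample. The function name filter_irregular is only an illustration; a regular expression as RS is a GNU awk feature, and other awk implementations may or may not accept it.

```shell
# Hypothetical helper wrapping the command above.
# RS swallows every complete 7/9/8 set (plus any newlines directly before it),
# so only the text between such sets is left as records; NF skips empty ones.
filter_irregular() {
  awk -v RS='\n*[0-9.]+7\n[0-9.]+9\n[0-9.]+8\n' NF "$@"
}

# Made-up sample: a regular set, an irregular 7/8/9 set, another regular set.
printf '%s\n' 172.18.0.7 172.18.0.9 172.18.0.8 \
              172.18.0.7 172.18.0.8 172.18.0.9 \
              172.18.0.7 172.18.0.9 172.18.0.8 | filter_irregular
```

With this sample, only the three lines of the irregular set should be printed. Note that the pattern looks only at the last digit of each line, so any address ending in 7, 9, and 8 is treated as part of a regular set.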
Upvotes: 2