Reputation: 362
Given data in a text file:
string1 EP00 37.45 83.83
save
save
save
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
string2
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
gibberish
I would like to use sed or awk to match both string1 and string2, then delete everything after string1 and the 3 lines that follow it. I would also like it to delete string2, but not string1, and to delete one extra line between string2 and the next text. So the expected output would be:
string1 EP00 37.45 83.83
save
save
save
There are always the same number of lines between the two patterns (16), if that helps. I would like to do this with sed or awk, but I have only been able to figure out a script that deletes the entire block of data between the two while holding onto both strings:
sed '/string1/,/string2/{//!d}' file >> tr.txt
Does anyone know how to retain string1 and the three lines after it, and delete the rest of the lines between the two patterns, including string2? I would like to do this with sed or awk, whichever is easier.
Thanks!
Upvotes: 3
Views: 892
Reputation: 58478
This might work for you (GNU sed):
sed -rn '/string1/{h;d};H;/string2/{x;s/(string1([^\n]*\n){4}).*string2.*/\1/p}' file
Upvotes: 0
Reputation: 45293
Using GNU sed
sed -n '/^string1/,+3p' file
If no GNU sed, try this:
sed -n '/string1/{N;N;N;p;}' file
Upvotes: 0
Reputation: 46415
If you want to do this with awk, the script might look something like the one below. (Updated based on your comments: it now "recycles", so it matches correctly however many times the string1-string2 pattern occurs. I realize you have already accepted an answer, but I wanted to offer this alternative; it is much less "professional" than @anubhava's answer, but it might give you some insight into how to make awk do "anything you want", even if you are not a pro.)
BEGIN {
    state = 0;
}
{
    if ($1 == "string1") {
        state = 1;
    }
    if (state == 1) {
        state = 2;
        print;
        next;
    }
    if (state > 1 && state < 5) {
        print;
        state = state + 1;
        next;
    }
    if ($1 == "string2") {
        state = 6;
        next;
    }
    if (state == 6) {
        state = 0;
        next;
    }
    if (state == 0) {
        print;
        next;
    }
}
The state variable basically tells you "where am I in the logic". The states are:
0: "normal state", print the line, go to the next
1: "found string1", start printing this line and the next three
2 - 4: printing "the lines that followed string1"
5: Waiting for string2, not printing anything
6: found string2, need to delete the next line
Having found the next line, we reset the state to 0 again…
You would run it with
awk -f scriptFile.awk inputfile.txt > outputfile.txt
I made this "pedestrian", so you can see exactly what is done, and in what order. Let me know if you have any questions.
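The state logic above can also be sketched with a counter instead of numbered states. This is only a minimal sketch, not the script from this answer: the function name trim_blocks and the variables keep, skipping, and skip_extra are made up for illustration, and it assumes string1 and string2 appear as the first field of their lines, as in the sample.

```shell
# Hypothetical helper (not part of the answer above): keep each string1
# line plus the next three, drop everything up to and including string2,
# then drop one more line after string2.
trim_blocks() {
  awk '
    $1 == "string1" { keep = 4; skipping = 1 }              # start of a block
    keep > 0        { print; keep--; next }                 # string1 + 3 lines
    $1 == "string2" { skip_extra = 1; skipping = 0; next }  # drop string2
    skip_extra      { skip_extra = 0; next }                # drop one extra line
    !skipping       { print }                               # normal lines
  ' "$@"
}
```

It could be run as trim_blocks inputfile.txt > outputfile.txt; with no file argument it reads standard input.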
Upvotes: 2
Reputation: 785631
You can use this awk:
awk '/^string1/{i=0} /^string1/,/^string2/{i++; if (i<5) print; next}1' file
string1 EP00 37.45 83.83
save
save
save
Upvotes: 5
Reputation: 62459
Something like this:
sed -n '/^string1/,$p' file | sed '5,$d' > output
The first command prints from the first line starting with "string1" through the end of the file, which drops everything before it. The second keeps the first four of those lines (the "string1" line and the three after it) and deletes from line 5 to the end.
You could also do this, if your version of grep supports it:
grep -A3 "^string1" file > output
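If grep on your system lacks -A, roughly the same result can be had from awk. This is a sketch that assumes a literal prefix match is enough; keep_header is a hypothetical name, not a standard tool, and unlike GNU grep it does not print -- between separate groups of matches.

```shell
# Hypothetical helper: print every line that starts with the given literal
# prefix, plus the three lines after it (similar in spirit to grep -A3).
keep_header() {
  pattern=$1
  shift
  awk -v pat="$pattern" '
    index($0, pat) == 1 { n = 4 }   # line starts with the prefix: keep 4 lines
    n > 0 { print; n-- }            # print while the counter is positive
  ' "$@"
}
```

For the sample in the question this would be called as keep_header string1 file > output.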
Upvotes: 0