Reputation: 453
I have a text formatted like the following:
2020-05-02
apple
string
string
string
string
string
2020-05-03
pear
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string
Each group has 7 lines: a date, a fruit and then 5 strings.
I would like to delete groups of 7 lines from the file by supplying just the date and the fruit.
So if I choose '2020-05-03' and 'pear', this would remove:
2020-05-03
pear
string
string
string
string
string
from the file, resulting in this:
2020-05-02
apple
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string
The file contains thousands of lines. I need a command, probably using sed or awk, to:
Search for the date 2020-05-03
Check if the line after the date is pear
Delete both lines and the following 5 lines
I know I can delete text with sed, e.g. sed 's/string//g', however I am not sure if I can delete multiple lines.
Note: A given date followed by a given fruit is never repeated, so
2020-05-02
pear
would only occur once in the file.
How can I achieve this?
Upvotes: 1
Views: 897
Reputation: 58420
This might work for you (GNU sed):
sed '/2020-05-03/{:a;N;s/[^\n]*/&/7;Ta;/^[^\n]*\npear/d}' file
If a line contains 2020-05-03, gather up 7 lines in total and, if the 2nd of these lines begins with pear, delete all of them.
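To supply the date and the fruit as parameters instead of hard-coding them, the same command can be wrapped in a small shell function. This is only a sketch: the name delete_group and its argument order are made up here, it still needs GNU sed, and both values are pasted into the regex unescaped, so they are assumed to contain no regex metacharacters or slashes.

```shell
# Hypothetical wrapper around the sed command above (GNU sed only).
# Usage: delete_group DATE FRUIT FILE
# DATE and FRUIT are inserted into the regex as-is (assumed to be plain text).
delete_group() {
  # Double quotes let the shell expand $1 and $2; \n is passed through to sed.
  sed "/$1/{:a;N;s/[^\n]*/&/7;Ta;/^[^\n]*\n$2/d}" "$3"
}

# Prints the filtered text; the file itself is left unchanged:
# delete_group 2020-05-03 pear file
```

With GNU sed, adding the -i option to the sed call would edit the file in place rather than printing the result.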
Upvotes: 0
Reputation: 785146
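Using awk, you may do this:
awk -v dt='2020-05-03' -v ft='pear' 'skip{skip--; next}
held{held=0; if($1==ft){skip=5; next} print line}
$1==dt{held=1; line=$0; next} 1
END{if(held) print line}' file
2020-05-02
apple
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string
Explanation:
-v dt='2020-05-03' -v ft='pear' : Supply 2 values to awk from the command line
skip{skip--; next} : While skip is non-zero we are inside a group that is being deleted, so count down and drop the line
held{held=0; if($1==ft){skip=5; next} print line} : On the line right after a matching date, check the fruit name. If it matches, drop this line and set skip to 5 so the following 5 lines are dropped too, otherwise print the date line that was held back
$1==dt{held=1; line=$0; next} : If we find a line with the matching date, hold it back instead of printing it, because we do not yet know whether its group has to be deleted
1 : Default action to print the line
END{if(held) print line} : Print a held date line if the input ends right after it
Upvotes: 3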
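The awk command only prints the result and leaves the file untouched. To actually remove the group from the file, one common approach is to write the output to a temporary file and copy it back. A minimal sketch, assuming mktemp is available; the function name remove_group is hypothetical, and the awk program is a variant that holds the date line back until the fruit line has been seen:

```shell
# Hypothetical helper: remove_group DATE FRUIT FILE
# Rewrites FILE without the 7-line group that starts with DATE then FRUIT.
remove_group() {
  tmp=$(mktemp) || return 1
  awk -v dt="$1" -v ft="$2" '
    skip { skip--; next }                      # inside a deleted group: drop the line
    held { held = 0                            # this is the line right after a matching date
           if ($1 == ft) { skip = 5; next }    # fruit matches: drop it and the next 5 lines
           print line }                        # otherwise release the held date line
    $1 == dt { held = 1; line = $0; next }     # hold the date until the fruit is known
    { print }
    END { if (held) print line }               # input ended right after a date line
  ' "$3" > "$tmp" && cat "$tmp" > "$3"         # copy back only if awk succeeded
  status=$?
  rm -f "$tmp"
  return $status
}

# remove_group 2020-05-03 pear file
```

Copying the contents back with cat, rather than moving the temporary file over the original, keeps the original file's permissions and ownership.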