Reputation: 119
I'm pretty familiar with sed but I don't know awk very well, and I'm not sure how to solve this problem. I've googled for a while but no luck so far. Here's the situation: I've got a big file with groups and sections, like so:
<A1>
some nr of lines
</A1>
<A2>
some nr
of lines
</A2>
<B1>
some
nr of
lines
</B1>
<B2>
some nr of lines
</B2>
<B3>
bla
</B3>
<C1>
bla
</C1>
<C2>
bla
</C2>
Now the problem is that the number of groups can change, the number of sections can change, and the number of lines in each section can change. For example, group A might go up to 25, group B might go up to 8, and so on. What I need to do is remove all entries of certain groups. In the example above, I'd like to remove everything in <B*>, leaving me with the following:
<A1>
some nr of lines
</A1>
<A2>
some nr
of lines
</A2>
<C1>
bla
</C1>
<C2>
bla
</C2>
Additionally, there would be several groups I would want to remove (although this can be done in separate runs); for example, if the file goes from A1 to R123, I'd want to remove B*, F*, M*, etc.
If something similar has already been asked and answered somewhere, I apologize; I did try to find a solution before posting.
Thanks!
Upvotes: 2
Views: 1191
Reputation: 204721
I think what you're looking for is something like this:
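awk -v rmv="AC" 'BEGIN{
    gsub(/./,"|&",rmv)           # put a | in front of every group letter
    sub(/$/,")[0-9]+>$",rmv)     # append the section number and closing >
    start = end = rmv
    sub(/^\|/,"^<(",start)       # leading | becomes the opening-tag prefix
    sub(/^\|/,"^</(",end)        # leading | becomes the closing-tag prefix
}
$0 ~ start { f=1 }               # opening tag of a removed group: start skipping
!f                               # print every line while not skipping
$0 ~ end { f=0 }                 # closing tag: resume printing on the next line
' file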
Just populate the "rmv" variable with the letters of all the groups you want removed (each character is treated as one group name):
$ awk -v rmv="B" '...'
<A1>
some nr of lines
</A1>
<A2>
some nr
of lines
</A2>
<C1>
bla
</C1>
<C2>
bla
</C2>
$ awk -v rmv="AC" '...'
<B1>
some
nr of
lines
</B1>
<B2>
some nr of lines
</B2>
<B3>
bla
</B3>
$
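If it helps to see what the BEGIN block actually builds, here is a small sketch that only prints the two regexes instead of filtering a file. The function name show_patterns is made up for illustration and is not part of the answer above; it assumes single-letter group names, as in the question.

```shell
# Hypothetical helper (name is an assumption): print the start and end
# regexes that the BEGIN block builds from a list of group letters.
show_patterns() {
    awk -v rmv="$1" 'BEGIN {
        gsub(/./, "|&", rmv)           # "BFM" -> "|B|F|M"
        sub(/$/, ")[0-9]+>$", rmv)     # -> "|B|F|M)[0-9]+>$"
        start = end = rmv
        sub(/^\|/, "^<(", start)       # regex for opening tags
        sub(/^\|/, "^</(", end)        # regex for closing tags
        print start
        print end
    }'
}

show_patterns "BFM"
# prints:
# ^<(B|F|M)[0-9]+>$
# ^</(B|F|M)[0-9]+>$
```

Because both patterns are anchored with ^ and $, only lines consisting of nothing but the tag will switch the flag on or off.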
Upvotes: 1
Reputation: 786339
Using sed:
sed '/<B1>/,/<\/B3>/d' infile
This finds the range of text starting at <B1> and ending at </B3> and deletes it from sed's output; sed prints the rest of the file to stdout.
EDIT: This more general form will also work for your case, however many B sections there are:
sed '/<B[0-9]*>/,/<\/B[0-9]*>/d' infile
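To drop several groups in one run (B*, F*, M*, etc.), the same range idea could be combined with a bracket expression. This is only a sketch: the wrapper name remove_groups is made up here, and it assumes single-letter group names, tags that sit alone on their lines, and sections that are never nested.

```shell
# Hypothetical wrapper (name is an assumption): delete every section whose
# group letter appears in $1, reading the file on stdin.
remove_groups() {
    # With $1 = "BF" the sed script becomes:
    #   /^<[BF][0-9][0-9]*>$/,/^<\/[BF][0-9][0-9]*>$/d
    sed "/^<[$1][0-9][0-9]*>\$/,/^<\/[$1][0-9][0-9]*>\$/d"
}

# Small demonstration on inline sample data
printf '<A1>\nx\n</A1>\n<B1>\ny\n</B1>\n<F12>\nz\n</F12>\n<C1>\nw\n</C1>\n' |
    remove_groups BF
# prints:
# <A1>
# x
# </A1>
# <C1>
# w
# </C1>
```

On a real file that would be something like remove_groups BFM < infile. Writing [0-9][0-9]* instead of [0-9]* requires at least one digit, so a bare <B> tag would not start a range.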
Upvotes: 6