bash (sed or awk preferred) to remove everything between first and last instance

Question

I'm pretty familiar with sed but I don't know awk very well, and I'm not sure how to solve this problem. I've googled for a while but no luck so far. Here's the situation: I've got a big file with groups and sections, like so:


<A1>
  some nr of lines
</A1>
<A2>
  some nr
  of lines
</A2>
<B1>
  some
  nr of
  lines
</B1>
<B2>
  some nr of lines
</B2>
<B3>
  bla
</B3>
<C1>
  bla
</C1>
<C2>
  bla
</C2>

Now the problem is that the number of groups can change, the number of sections can change, and the number of lines in each section can change. For example, group A might go up to 25, group B might go up to 8, and so on. What I need to do is remove all entries of certain groups. In the example above I'd like to remove everything in the B group, leaving me with the following:


<A1>
  some nr of lines
</A1>
<A2>
  some nr
  of lines
</A2>
<C1>
  bla
</C1>
<C2>
  bla
</C2>

Additionally, there would be several groups I would want to remove (although these can be in separate runs). For example, if the file goes from A1 to R123, I'd want to remove B*, F*, M*, etc.

If something similar has already been asked and answered somewhere I apologize, I did try to find a solution before posting.

Thanks!

anubhava · Accepted Answer

Using sed:

sed '/<B1>/,/<\/B3>/d' infile

This selects the range of lines starting at the first line that matches <B1> and ending at the next line that matches </B3>, and deletes it. sed prints the rest of the file to stdout, so the input file itself is not modified.
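To see the range delete in action, here is a minimal sketch that builds a small file and runs the command on it. The file name sample.txt and its contents are assumptions modelled on the question's example, not the asker's real data.

```shell
# Build a small sample file (hypothetical contents modelled on the question).
cat > sample.txt <<'EOF'
<A1>
  keep 1
</A1>
<B1>
  drop 1
</B1>
<B3>
  drop 2
</B3>
<C1>
  keep 2
</C1>
EOF

# Delete every line from the first <B1> through the next </B3>, inclusive.
result=$(sed '/<B1>/,/<\/B3>/d' sample.txt)
printf '%s\n' "$result"
```

Only the A1 and C1 blocks should remain in the output. Redirect stdout to a new file if you want to keep the result.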

EDIT: This will also work for your case:

sed '/<B[0-9]*>/,/<\/B[0-9]*>/d'
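The question also asks about removing several groups (B*, F*, M*) and mentions awk. The following is a sketch of one way to do that in a single run, not part of the original answer. It assumes each opening and closing tag sits on its own line and that blocks are not nested. The file name groups.txt and its contents are made up for illustration.

```shell
# Hypothetical input with several groups; only A and G should survive.
cat > groups.txt <<'EOF'
<A1>
  keep 1
</A1>
<B1>
  drop 1
</B1>
<B2>
  drop 2
</B2>
<F1>
  drop 3
</F1>
<G1>
  keep 2
</G1>
<M7>
  drop 4
</M7>
EOF

# sed: a bracket expression lets one range cover the B, F and M groups.
sed_out=$(sed '/<[BFM][0-9]*>/,/<\/[BFM][0-9]*>/d' groups.txt)

# awk: set a flag at an opening tag, clear it after the closing tag,
# and print only the lines seen while the flag is off.
awk_out=$(awk '
  /^<[BFM][0-9]+>/   { skip = 1 }
  !skip              { print }
  /^<\/[BFM][0-9]+>/ { skip = 0 }
' groups.txt)

printf '%s\n' "$sed_out"
```

Both commands should leave the same lines. The awk version is easier to extend if you later need extra conditions, such as counting the removed blocks.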
