user1948374

Reputation: 51

How do you remove all the lines between two html tags using sed (or similar)?

I have a file that looks like this:

<HTML>
<HEAD>
< ... stuff ... ></HEAD>
< ... stuff ... >
</HTML>

I'm trying to remove everything between, and including, the HEAD tags, but can't seem to get it to work.

I thought

sed -i -e 's/<HEAD>.*<\/HEAD>//g' file.HTML

should work, but it doesn't remove anything.

sed -i -e '/<HEAD>/,/<\/HEAD>/d' file.HTML

doesn't do anything either. No errors, just nothing.

Is there something wrong with my input file, or is there a different way to go about it?

Upvotes: 4

Views: 7444

Answers (1)

David C. Rankin

Reputation: 84541

Delete all lines between the tags, keeping the tag lines themselves:

sed '/<tag>/,/<\/tag>/{//!d}' input.txt

Delete all lines between the tags, including the tag lines:

sed '/<tag>/,/<\/tag>/d' input.txt
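As a rough illustration, here is how the two commands behave on a small sample. The sample contents and the variable names (sample, kept_tags, no_head) are made up for this sketch, and it assumes GNU sed (BSD/macOS sed may want a ; before the closing brace). sed reads its input one line at a time, so an address range like this is the usual way to act on a block that spans several lines.

```shell
# Build a small sample resembling the file in the question (contents are made up).
sample=$(mktemp)
printf '%s\n' '<HTML>' '<HEAD>' '<TITLE>demo</TITLE>' '</HEAD>' '<BODY>text</BODY>' '</HTML>' > "$sample"

# Keep the tag lines, delete only the lines between them.
# The empty regex // reuses the last regex that was matched, so //!d
# spares the two lines that matched the range boundaries.
kept_tags=$(sed '/<HEAD>/,/<\/HEAD>/{//!d}' "$sample")

# Delete the whole range, tag lines included.
no_head=$(sed '/<HEAD>/,/<\/HEAD>/d' "$sample")

printf '%s\n---\n%s\n' "$kept_tags" "$no_head"
rm -f "$sample"
```

Note that both forms work on whole lines: anything else sitting on the same line as <HEAD> or </HEAD> is kept or deleted together with the tag.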

To edit the file in place, use sed -i .... To edit in place while keeping a backup of the original, use sed -i.bak ..., which saves the original as input.txt.bak.
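A minimal sketch of the backup variant, run in a throwaway directory so nothing real is touched. The file contents and the names workdir, edited and backup are only illustrative.

```shell
# Work in a temporary directory with a made-up input.txt.
workdir=$(mktemp -d)
printf '%s\n' '<HTML>' '<HEAD>' '<TITLE>demo</TITLE>' '</HEAD>' '<BODY>text</BODY>' '</HTML>' > "$workdir/input.txt"

# -i.bak rewrites input.txt in place and keeps the original as input.txt.bak.
sed -i.bak '/<HEAD>/,/<\/HEAD>/d' "$workdir/input.txt"

edited=$(cat "$workdir/input.txt")
backup=$(cat "$workdir/input.txt.bak")
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"
rm -rf "$workdir"
```

Writing the suffix directly after -i (no space) is the form accepted by both GNU and BSD sed, as far as I know.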

Upvotes: 14
