Reputation: 316
I am looking for a way to delete (with sed if possible) every HTML tag that contains a specific word. For instance, delete every div tag containing the word foo. The divs can of course span multiple lines. For instance:
<body>
<div>
This div will be <i>deleted</i>.
Why?
Because it contains foo.
</div>
<div>
This div doesn't contain the forbidden word.
<b>So it won't be deleted.</b>
</div>
</body>
I found ways to delete HTML tags, but nothing about deleting tags that contain a specific text. Thanks!
Upvotes: 0
Views: 883
Reputation: 21
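This is not practical with sed alone. sed is line-oriented: by default it processes one line at a time, and although commands such as N and the hold space let it work across lines, it has no understanding of HTML structure such as nested or unclosed tags. A script built from sed/bash/grep would effectively have to implement its own parser that reads each div's contents and prints only the divs that don't contain the text. Seriously, use an HTML parser instead.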
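To illustrate the parser route, here is a minimal sketch using Python's standard-library html.parser module. The names DivFilter and remove_divs_containing are made up for this example. It assumes a plain substring match (so "food" also counts as containing "foo"), and that when divs are nested only the innermost matching div is removed; an enclosing div survives unless its remaining text also contains the word. The output is re-serialized, so end tags come out normalized (for example `</DIV>` becomes `</div>`).

```python
from html.parser import HTMLParser


class DivFilter(HTMLParser):
    """Re-emit HTML, dropping every <div> whose text contains `word`.

    Hypothetical helper for illustration, not part of any library.
    """

    def __init__(self, word):
        # Keep entities as written so the output stays close to the input.
        super().__init__(convert_charrefs=False)
        self.word = word
        # One frame per open <div>; frame 0 collects top-level output.
        # Each frame is a pair (markup_pieces, text_pieces).
        self.frames = [([], [])]

    def _emit(self, markup, text=""):
        self.frames[-1][0].append(markup)
        self.frames[-1][1].append(text)

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.frames.append(([], []))
        self._emit(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        self._emit(f"</{tag}>")
        if tag == "div" and len(self.frames) > 1:
            markup, text = self.frames.pop()
            text = "".join(text)
            # Keep the div only if the word is absent from its text.
            if self.word not in text:
                self._emit("".join(markup), text)

    def handle_data(self, data):
        self._emit(data, data)

    def handle_entityref(self, name):
        # Entities are copied to the output but not searched for the word.
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")


def remove_divs_containing(html, word):
    parser = DivFilter(word)
    parser.feed(html)
    parser.close()
    # Divs that were never closed are kept as they are.
    return "".join("".join(markup) for markup, _ in parser.frames)
```

Calling remove_divs_containing(text, "foo") on the sample from the question should drop the first div and keep the second. The line break that followed the removed div stays in place, so an empty line is left behind. For real-world pages a dedicated HTML library is likely a more robust choice than this sketch.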
Upvotes: 2