Diego

Reputation: 659

Remove portion of html (tag) keeping style - python

I would like to remove a portion of an HTML document that contains a specific string before saving it. The tag contains a person's name, and I would like to remove the entire tag to make the page anonymous.

The HTML is:

<div id="top-card" data-li-template="top_card">...</div>

and all its children.

I explored using BeautifulSoup but could not find a solution.

Is there a way that I can just remove the entire portion of the HTML while keeping the style intact?

Thanks!

Upvotes: 0

Views: 123

Answers (1)

Sudipta

Reputation: 4971

You can use .extract() to remove elements from the document with BeautifulSoup.

Assuming you want to remove the div whose id is "top-card":

>>> from bs4 import BeautifulSoup
>>> html = """
... <div id="top-card" data-li-template="top_card"><div>test</div></div>
... <div>test</div> <div id="foo">blah</div>"""
>>> soup = BeautifulSoup(html)
>>> [div.extract() for div in soup("div",id="top-card")]
[<div data-li-template="top_card" id="top-card"><div>test</div></div>]
>>> soup
<html><body>
<div>test</div> <div id="foo">blah</div></body></html>
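
If you do not need the removed element afterwards, .decompose() is an alternative that deletes the tag and its children in place. The following is only a sketch: the sample page, the name "Jane Doe", and the variable names are made up for illustration, and the "html.parser" backend is my choice here, not something taken from the question. It also shows that a style block elsewhere in the document is left untouched.

```python
from bs4 import BeautifulSoup

# Hypothetical page: a <style> block plus the div that should be anonymized.
html = """<html><head><style>.card { color: red; }</style></head>
<body>
<div id="top-card" data-li-template="top_card"><div>Jane Doe</div></div>
<div id="foo">blah</div>
</body></html>"""

# Assumption: Python's built-in parser; any parser supported by bs4 would do.
soup = BeautifulSoup(html, "html.parser")

# find() returns the first match or None, so guard before removing.
card = soup.find("div", id="top-card")
if card is not None:
    card.decompose()  # removes the tag and all of its children in place

# Serialize the remaining tree; this string is what you would save.
cleaned = str(soup)
```

The cleaned string can then be written to disk with an ordinary open(..., "w") call. Only the matched div is removed; the rest of the markup, including the style element, is serialized as it was parsed.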

Upvotes: 1
