Reputation: 659
I would like to remove a portion of an HTML document that contains a specific string before saving it. The tag contains a person's name, and I would like to remove the entire tag to make the page anonymous.
The HTML is:
<div id="top-card" data-li-template="top_card">...</div>
and all its children.
I explored using BeautifulSoup but could not find a solution.
Is there a way to remove that entire portion of the HTML while keeping the style intact?
Thanks!
Upvotes: 0
Views: 123
Reputation: 4971
You can use .extract() to remove elements from the document using BeautifulSoup.
Assuming you want to remove the div whose id is "top-card":
>>> from bs4 import BeautifulSoup
>>> html = """
... <div id="top-card" data-li-template="top_card"><div>test</div></div>
... <div>test</div> <div id="foo">blah</div>"""
>>> soup = BeautifulSoup(html)
>>> [div.extract() for div in soup("div",id="top-card")]
[<div data-li-template="top_card" id="top-card"><div>test</div></div>]
>>> soup
<html><body>
<div>test</div> <div id="foo">blah</div></body></html>
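If you do not need the removed element afterwards, .decompose() is an alternative to .extract(): it removes the tag and its children and discards them instead of returning them. Below is a minimal sketch of that approach, assuming the bs4 package is installed and using the built-in "html.parser" backend. The helper name remove_by_id and the sample page are hypothetical, made up for illustration.

```python
from bs4 import BeautifulSoup


def remove_by_id(html: str, element_id: str) -> str:
    """Return the HTML with every tag carrying the given id removed."""
    # "html.parser" ships with Python; another installed parser would also work.
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all(id=element_id):
        # decompose() drops the tag together with all of its children.
        tag.decompose()
    return str(soup)


# Hypothetical sample page, not the asker's real document.
page = (
    "<html><head><style>p {color: red;}</style></head><body>"
    '<div id="top-card" data-li-template="top_card"><div>Jane Doe</div></div>'
    "<p>kept</p></body></html>"
)

cleaned = remove_by_id(page, "top-card")
print(cleaned)
```

The rest of the document, including the style block, is left untouched, so the returned string can be written to a file as the anonymized copy.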
Upvotes: 1