Reputation: 2870
I have some HTML from which I want to remove the inner <div> that contains a <div style="...">. I tried the code below, but to no avail:
for d in soup.select('div > div.style'):
    d.extract()
Could you please elaborate on how to do so?
from bs4 import BeautifulSoup
texte = """
<div class="content mw-parser-output" id="bodyContent">
<div>
<div style="clear:both; background-image:linear-gradient(180deg, #E8E8E8, white); border-top: dashed 2px #AAAAAA; padding: 0.5em 0.5em 0.5em 0.5em; margin-top: 1em; direction: ltr;">
This article is issued from <a class="external text" href="https://en.wiktionary.org/wiki/?title=Love&oldid=60218267" title="Last edited on 2020-09-02">Wiktionary</a>. The text is licensed under <a class="external text" href="https://creativecommons.org/licenses/by-sa/4.0/">Creative Commons - Attribution - Sharealike</a>. Additional terms may apply for the media files.
</div>
</div>
</div>
"""
soup = BeautifulSoup(texte, 'html.parser')
for d in soup.select('div > div.style'):
    d.extract()
print(soup.prettify())
My expected result is:
<div class="content mw-parser-output" id="bodyContent">
</div>
Upvotes: 0
Views: 311
Reputation: 71
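Here is one way. Select the <div> that has a style attribute, then remove its parent:
soup = BeautifulSoup(texte, 'html.parser')
# [style] is an attribute selector; .parent is the wrapping <div>
soup.select_one('div > div[style]').parent.extract()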
print(soup.prettify())
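As for why the selector in the question finds nothing: in CSS, `.style` is a class selector, so `div.style` looks for `class="style"`, not for a `style` attribute. The sketch below compares the two selectors on a shortened stand-in for the question's HTML (the sample markup and the names `by_class` and `by_attr` are illustrative assumptions, not part of the original post).

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the question's HTML (hypothetical sample).
html = """
<div class="content mw-parser-output" id="bodyContent">
<div>
<div style="clear:both;">This article is issued from Wiktionary.</div>
</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# ".style" is a class selector: it looks for class="style", which is absent.
by_class = soup.select("div > div.style")

# "[style]" is an attribute selector: it matches divs that carry a style attribute.
by_attr = soup.select("div > div[style]")

print(len(by_class), len(by_attr))

# Remove the wrapper <div> around each styled div.
for d in by_attr:
    d.parent.extract()

print(soup.prettify())
```

After the loop, only the outer `bodyContent` div should remain.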
Upvotes: 2
Reputation: 2868
You can try the decompose() method (see "Deleting a div with a particular class using BeautifulSoup"). Call it on the parent so that the wrapping <div> is removed as well:
for div in soup.find_all("div", {'style':'clear:both; background-image:linear-gradient(180deg, #E8E8E8, white); border-top: dashed 2px #AAAAAA; padding: 0.5em 0.5em 0.5em 0.5em; margin-top: 1em; direction: ltr;'}):
    # remove the wrapper <div> together with the styled <div> inside it
    div.parent.decompose()
print(soup)
<div class="content mw-parser-output" id="bodyContent">
</div>
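On decompose() versus extract(): both detach the tag from the tree, but extract() returns the removed tag so it can still be used, while decompose() destroys it. A minimal sketch of the difference, assuming a one-line stand-in for the question's markup (the sample HTML and variable names are illustrative only):

```python
from bs4 import BeautifulSoup

# One-line stand-in for the question's markup (hypothetical sample).
html = '<div id="outer"><div><div style="clear:both;">footer</div></div></div>'

# extract(): the wrapper leaves the tree but is returned and still usable.
soup = BeautifulSoup(html, "html.parser")
wrapper = soup.find("div", style=True).parent
removed = wrapper.extract()
print(removed.get_text())
print(soup)

# decompose(): the wrapper is removed and destroyed; nothing useful is returned.
soup2 = BeautifulSoup(html, "html.parser")
soup2.find("div", style=True).parent.decompose()
print(soup2)
```

If the removed markup is not needed afterwards, either method leaves the same remaining tree.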
Upvotes: 1
Reputation: 313
from bs4 import BeautifulSoup
texte = """
<div class="content mw-parser-output" id="bodyContent">
<div>
<div style="clear:both; background-image:linear-gradient(180deg, #E8E8E8, white); border-top: dashed 2px #AAAAAA; padding: 0.5em 0.5em 0.5em 0.5em; margin-top: 1em; direction: ltr;">
This article is issued from <a class="external text" href="https://en.wiktionary.org/wiki/?title=Love&oldid=60218267" title="Last edited on 2020-09-02">Wiktionary</a>. The text is licensed under <a class="external text" href="https://creativecommons.org/licenses/by-sa/4.0/">Creative Commons - Attribution - Sharealike</a>. Additional terms may apply for the media files.
</div>
</div>
</div>
"""
soup = BeautifulSoup(texte, 'html.parser')
for d in soup.find_all('div', {"style": True}):
    d.find_parent("div").extract()
print(soup.prettify())
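One caveat with this kind of loop: find_parent("div") returns None when a styled div has no enclosing div, and calling extract() on None raises an AttributeError. A guarded variant might look like the sketch below; the extra top-level div in the sample HTML is a made-up case added only to exercise the guard.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the first styled div has no parent <div>.
html = """
<div style="color:red;">top-level styled div</div>
<div id="bodyContent"><div><div style="clear:both;">footer</div></div></div>
"""

soup = BeautifulSoup(html, "html.parser")

for d in soup.find_all("div", style=True):
    parent = d.find_parent("div")
    if parent is None:
        # No enclosing <div> to remove; leave this element alone.
        continue
    parent.extract()

print(soup.prettify())
```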
Upvotes: 1