Akira

Reputation: 2870

How to remove element "<div> <div style" from the soup?

I have an HTML document from which I want to remove the nested element <div> <div style=...>. I tried the code below, but to no avail:

for d in soup.select('div > div.style'):
    d.extract()

Could you please elaborate on how to do so?

from bs4 import BeautifulSoup

texte = """
<div class="content mw-parser-output" id="bodyContent">
    <div>
        <div style="clear:both; background-image:linear-gradient(180deg, #E8E8E8, white); border-top: dashed 2px #AAAAAA; padding: 0.5em 0.5em 0.5em 0.5em; margin-top: 1em; direction: ltr;">
        This article is issued from <a class="external text" href="https://en.wiktionary.org/wiki/?title=Love&amp;oldid=60218267" title="Last edited on 2020-09-02">Wiktionary</a>. The text is licensed under <a class="external text" href="https://creativecommons.org/licenses/by-sa/4.0/">Creative Commons - Attribution - Sharealike</a>. Additional terms may apply for the media files.
        </div>
    </div>
</div>
"""

soup = BeautifulSoup(texte, 'html.parser')

for d in soup.select('div > div.style'):
    d.extract()
    
print(soup.prettify())

My expected result is

<div class="content mw-parser-output" id="bodyContent">
</div>

Upvotes: 0

Views: 311

Answers (3)

Antony Phoenix

Reputation: 71

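Use an attribute selector to find the styled div, then remove its parent:

soup = BeautifulSoup(texte, 'html.parser')
# [style] matches on the attribute; the sample HTML has no div with id "content"
soup.select_one('div > div[style]').parent.extract()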

print(soup.prettify())
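
For context, the selector in the question, div.style, looks for a div whose class is "style", not a div that has a style attribute, which is why it matches nothing. Here is a minimal sketch of the difference, using a shortened stand-in for the question's HTML (the html string and variable names are made up for illustration):

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical stand-in for the HTML in the question.
html = """
<div class="content mw-parser-output" id="bodyContent">
    <div>
        <div style="clear:both; margin-top: 1em;">
        This article is issued from Wiktionary.
        </div>
    </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# '.style' is a class selector, so it finds nothing in this markup.
class_matches = soup.select("div > div.style")

# '[style]' matches any div that carries a style attribute.
attr_matches = soup.select("div > div[style]")

# Remove the wrapper <div> that encloses each styled div.
for styled in attr_matches:
    styled.parent.decompose()

remaining_divs = soup.find_all("div")
print(len(class_matches), len(attr_matches), len(remaining_divs))
```

Only the outer bodyContent div should remain after the loop.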

Upvotes: 2

qaiser

Reputation: 2868

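You can try the decompose method; see also "Deleting a div with a particular class using BeautifulSoup". Match the div by its style attribute and decompose its parent, so that the empty wrapper <div> is removed as well:

for div in soup.find_all("div", {'style':'clear:both; background-image:linear-gradient(180deg, #E8E8E8, white); border-top: dashed 2px #AAAAAA; padding: 0.5em 0.5em 0.5em 0.5em; margin-top: 1em; direction: ltr;'}):
    # decompose the enclosing <div>, not just the styled one
    div.parent.decompose()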

print(soup)
 <div class="content mw-parser-output" id="bodyContent">

</div>
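
In case the difference from extract() is unclear: extract() detaches a tag and returns it so it can still be used, while decompose() removes the tag and destroys it. A small sketch on a made-up snippet (the HTML here is only for illustration):

```python
from bs4 import BeautifulSoup

# Hypothetical snippet, unrelated to the question's HTML.
soup = BeautifulSoup('<div id="outer"><p>keep</p><span>drop</span></div>', "html.parser")

# extract() removes the tag from the tree and hands it back.
removed = soup.p.extract()

# decompose() removes the tag and discards it; nothing useful is returned.
soup.span.decompose()

print(removed)
print(soup)
```

Either method works for the question, since the removed element is not needed afterwards.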

Upvotes: 1

Koko Jumbo

Reputation: 313

from bs4 import BeautifulSoup

texte = """
<div class="content mw-parser-output" id="bodyContent">
    <div>
       <div style="clear:both; background-image:linear-gradient(180deg, #E8E8E8, white); border-top: dashed 2px #AAAAAA; padding: 0.5em 0.5em 0.5em 0.5em; margin-top: 1em; direction: ltr;">
       This article is issued from <a class="external text" href="https://en.wiktionary.org/wiki/?title=Love&amp;oldid=60218267" title="Last edited on 2020-09-02">Wiktionary</a>. The text is licensed under <a class="external text" href="https://creativecommons.org/licenses/by-sa/4.0/">Creative Commons - Attribution - Sharealike</a>. Additional terms may apply for the media files.
       </div>
    </div>
</div>
"""

soup = BeautifulSoup(texte, 'html.parser')

for d in soup.find_all('div', {"style": True}):
    d.find_parent("div").extract()

print(soup.prettify())

Upvotes: 1
