python - beautifulsoup - removing a line of code

Question

I have started learning BeautifulSoup. I am trying to remove a line of code containing a particular tag from an HTML document.

Most examples in the documentation deal with whole tags (both the opening and the closing part).
Is it possible to modify just one part of a tag? For example:


Hello
foo!

How can I remove just the first line of this code?

chickity china chinese chicken · Accepted Answer

You can use BeautifulSoup's unwrap(), which removes a tag but keeps whatever it contains. List the names of the unwanted tags in invalid_tags; this strips the extra tags that don't have an open/close counterpart while retaining their contents and all other tags:

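from bs4 import BeautifulSoup

# html_doc is the HTML string from the question
soup = BeautifulSoup(html_doc, 'html.parser')

# names of the tags to strip
invalid_tags = ['']

for tag in invalid_tags:
    # unwrap() removes the tag itself but keeps its contents
    for match in soup.find_all(tag):
        match.unwrap()

print(soup)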

result:

Hello
foo!
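
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The input string, the choice of `<b>` as the unmatched tag, and the helper name `strip_tags` are all assumptions made for illustration; they are not taken from the question. With 'html.parser', an opening tag that is never closed ends up wrapping the rest of the markup, so unwrapping it leaves the inner elements in place.

```python
from bs4 import BeautifulSoup


def strip_tags(html, tag_names):
    """Remove every tag named in tag_names but keep its contents.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    for name in tag_names:
        for match in soup.find_all(name):
            # unwrap() replaces the tag with its own children
            match.unwrap()
    return str(soup)


# Assumed sample input: an opening <b> with no closing counterpart
html_doc = "<b>\n<p>Hello</p>\n<p>foo!</p>"

result = strip_tags(html_doc, ["b"])
print(result)
```

Note that unwrap() is applied to every tag whose name is in the list, matched or not, so this only suits cases where no legitimate tag of that name needs to survive.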
