beautifulsoup: Dropping text inside tags

Question

I am trying to extract strings from an HTML file using BeautifulSoup. My query returns a div that has label tags inside it. How can I get rid of those tags?

from bs4 import BeautifulSoup
import requests

with open('/Desktop/filename.html') as html_file:
    soup = BeautifulSoup(html_file, 'lxml')

string = soup.find('div', class_="col-sm-8 col-xs-6")
print(string)

Output:


    Sherlock Holmes 

    
        Detective's Address
    
    221B Baker Street London 

    
        City, State, Zip
    
    London, United Kingdom

print(string.text) outputs

    Sherlock Holmes
    Detective's Address
    221B Baker Street London
    City, State, Zip
    London, United Kingdom

I am not interested in the text inside the label tags. How can I get rid of it so that the output is:

    Sherlock Holmes
    221B Baker Street London
    London, United Kingdom

Cr4id3r · Accepted Answer

You can try decompose(), which removes a tag and its contents from the tree. For example, before the print call, use this:

for label_element in string.find_all("label"):
    label_element.decompose()
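
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The HTML string is a guess at the markup, reconstructed from the output in the question, so the real file may be structured differently. It also uses the built-in 'html.parser' instead of 'lxml' so that it runs without an extra install, and the variable names container and lines are only illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the output shown in the question;
# the actual file may differ.
html = """
<div class="col-sm-8 col-xs-6">
    Sherlock Holmes
    <label>Detective's Address</label>
    221B Baker Street London
    <label>City, State, Zip</label>
    London, United Kingdom
</div>
"""

# 'html.parser' ships with Python; the question uses 'lxml' instead.
soup = BeautifulSoup(html, "html.parser")
container = soup.find("div", class_="col-sm-8 col-xs-6")

# Remove every <label> tag, together with its text, from the tree.
for label_element in container.find_all("label"):
    label_element.decompose()

# stripped_strings yields the remaining text nodes with surrounding
# whitespace removed and whitespace-only nodes skipped.
lines = list(container.stripped_strings)
print("\n".join(lines))
```

Note that decompose() changes the parsed tree in place, so the labels are gone from soup afterwards. If you still need them later, read them before the loop or parse the file again.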
