JOHANNES_NYÅTT

Reputation: 3425

Beautiful Soup: check for a tag inside a tag

I'm using Beautiful Soup 4 to scrape a page. There's a block of text I don't want:

<p class="MsoNormal" style="text-align: center"><b>
                            <span lang="EN-US" style="font-family: Arial; color: blue">
                            <font size="4">1 </font></span>
                            <span lang="AR-SA" dir="RTL" style="font-family: Arial; color: blue">
                            <font size="4">&#1600;</font></span><span lang="EN-US" style="font-family: Arial; color: blue"><font size="4"> 
                            с&#1199;р&#1241; фати&#1211;&#1241;</font></span></b></p>

The thing that makes it unique is that it has a <b> tag. I already used find_all() to get all the <p> tags. So now I have a for loop like:

for el in doc.find_all('p'):
    if el.hasChildTag('b'):
        break

Unfortunately, bs4 has no "hasChildTag" function.

Upvotes: 2

Views: 3067

Answers (2)

root

Reputation: 80346

for elem in soup.find_all('p'):
    if elem.find('b'):
        continue  # skip any <p> that contains a <b> and move on to the next one
    # do stuff with the elem
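
For reference, here is a minimal self-contained sketch of the same idea. The sample HTML, the helper name has_child_tag, and the variable names are made up for illustration; only find_all(), find(), and get_text() are real bs4 calls.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: one unwanted <p> that wraps a <b>, and two ordinary ones.
html = """
<p class="MsoNormal" style="text-align: center"><b><span lang="EN-US">1</span></b></p>
<p>Keep this paragraph.</p>
<p>Keep this one <i>too</i>.</p>
"""

soup = BeautifulSoup(html, "html.parser")


def has_child_tag(tag, name):
    """Return True if `tag` contains a descendant element called `name`."""
    # find() returns the first matching descendant, or None if there is none.
    return tag.find(name) is not None


# Keep only the paragraphs that do not contain a <b> anywhere inside them.
kept = [p for p in soup.find_all("p") if not has_child_tag(p, "b")]
kept_texts = [p.get_text() for p in kept]
print(kept_texts)
```

find() searches all descendants by default. If only direct children should count, passing recursive=False to find() restricts the search to them.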

Upvotes: 2

Joe

Reputation: 3059

It should also be possible to use CSS selectors.

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors

soup.select("p b")
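
Note that select("p b") returns the matching <b> elements, not the <p> tags around them. A rough sketch of how that could be used to skip the unwanted paragraphs follows; the sample HTML and variable names are assumptions for illustration, and the identity check with id() is just one way to do the comparison.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: one <p> wrapping a <b>, and two ordinary paragraphs.
html = """
<p class="MsoNormal"><b><span lang="EN-US">1</span></b></p>
<p>First ordinary paragraph.</p>
<p>Second ordinary paragraph.</p>
"""

soup = BeautifulSoup(html, "html.parser")

# select("p b") yields the <b> elements, so step up to each enclosing <p>.
# Identity (id) is used because bs4 tags with equal content compare as equal.
unwanted_ids = {id(b.find_parent("p")) for b in soup.select("p b")}

wanted_texts = [
    p.get_text(strip=True)
    for p in soup.find_all("p")
    if id(p) not in unwanted_ids
]
print(wanted_texts)
```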

Upvotes: 3
