étale-cohomology

Reputation: 1861

Beautiful Soup: Test if a div is a child of a div

Is it possible to test with Beautiful Soup whether a div is a (not necessarily immediate) child of a div?

E.g.

<div class='a'>
  <div class='aa'>
    <div class='aaa'>
      <div class='aaaa'>
      </div>
    </div>
  </div>
  <div class='ab'>
    <div class='aba'>
      <div class='abaa'>
      </div>
    </div>
  </div>
</div>

Now I want to test whether the div with class aaaa and the div with class abaa are (not necessarily immediate) children of the div with class aa.

import bs4

with open('test.html','r') as i_file:
  soup = bs4.BeautifulSoup(i_file.read(), 'lxml')
div0 = soup.find('div', {'class':'aa'})
div1 = soup.find('div', {'class':'aaaa'})
div2 = soup.find('div', {'class':'abaa'})

print(div1 in div0)  # must return True, but returns False
print(div2 in div0)  # must return False

How can this be done?

(Of course, the actual HTML is more complicated, with more nested divs.)

Upvotes: 0

Views: 681

Answers (3)

étale-cohomology

Reputation: 1861

Okay, I think I found a way. Get all the descendant divs of the parent div with find_all (it searches recursively by default), then test membership:

import bs4

with open('test.html','r') as i_file:
  soup = bs4.BeautifulSoup(i_file.read(), 'lxml')

div0 = soup.find('div', {'class':'aa'})
div1 = soup.find('div', {'class':'aaaa'})
div2 = soup.find('div', {'class':'abaa'})

children = div0.find_all('div')
print(div1 in children)
print(div2 in children)
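
One caveat, as far as I understand bs4: `in` on a list compares tags with ==, and Beautiful Soup treats two tags as equal when their name, attributes and contents match, even if they are different elements. A minimal sketch with made-up markup (the class names outer, other and x are only for illustration) that compares by identity instead:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two structurally identical divs in different parents.
html = """
<div class="outer"><div class="x"></div></div>
<div class="other"><div class="x"></div></div>
"""
soup = BeautifulSoup(html, "html.parser")

outer = soup.find("div", class_="outer")
# This div lives inside "other", not inside "outer".
elsewhere = soup.find("div", class_="other").find("div", class_="x")

descendants = outer.find_all("div")

# Equality-based membership: a look-alike tag counts as a match.
by_equality = elsewhere in descendants
# Identity-based membership: only the very same element counts.
by_identity = any(d is elsewhere for d in descendants)

print(by_equality)  # True (false positive)
print(by_identity)  # False
```

With the HTML from the question every div has a distinct class, so the original check gives the expected result there; the difference only shows up when identical-looking elements are repeated.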

Upvotes: 1

niranjan94

Reputation: 826

You can use the find_parent method from Beautiful Soup.

import bs4

with open("test.html", "r") as i_file:
    soup = bs4.BeautifulSoup(i_file.read(), "lxml")

div0 = soup.find("div", {"class": "aa"})
div1 = soup.find("div", {"class": "aaaa"})
div2 = soup.find("div", {"class": "abaa"})


print(div1.find_parent(div0.name, attrs=div0.attrs) is not None)  # prints True
print(div2.find_parent(div0.name, attrs=div0.attrs) is not None)  # prints False
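
Note that find_parent matches by tag name and attributes, so it would also succeed for a different ancestor that happens to carry the same attributes as div0. If the exact element matters, a sketch along these lines walks .parents and compares by identity; is_descendant is a hypothetical helper name, and the HTML from the question is inlined so the snippet runs on its own:

```python
from bs4 import BeautifulSoup

# The HTML from the question, inlined instead of read from test.html.
HTML = """
<div class='a'>
  <div class='aa'>
    <div class='aaa'>
      <div class='aaaa'>
      </div>
    </div>
  </div>
  <div class='ab'>
    <div class='aba'>
      <div class='abaa'>
      </div>
    </div>
  </div>
</div>
"""


def is_descendant(node, ancestor):
    """Return True if `ancestor` is the very same object as one of node's parents."""
    # .parents yields every enclosing element, up to the BeautifulSoup object.
    return any(parent is ancestor for parent in node.parents)


soup = BeautifulSoup(HTML, "html.parser")
div0 = soup.find("div", class_="aa")
div1 = soup.find("div", class_="aaaa")
div2 = soup.find("div", class_="abaa")

print(is_descendant(div1, div0))  # True
print(is_descendant(div2, div0))  # False
```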

Upvotes: 2

sushanth

Reputation: 8302

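Try finding all the descendant elements of the parent with find_all and check whether any of them has the required class. (find_all_next would also match elements that come after the parent in the document without being inside it, so it is not suitable here.)

from bs4 import BeautifulSoup

# text is assumed to hold the HTML from the question
soup = BeautifulSoup(text, "html.parser")


def is_child(element, parent_class, child_class):
    parent = soup.find("div", attrs={"class": parent_class})
    return any(
        # .get avoids a KeyError for elements without a class attribute
        child_class in i.get("class", [])
        for i in parent.find_all(element)
    )


print(is_child("div", "aa", "aaa"))  # True
print(is_child("div", "aa", "abaa"))  # False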
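
If the check only needs to go by class, a CSS descendant selector may be a shorter alternative. This sketch assumes a bs4 version where select_one is backed by soupsieve (4.7 or later) and class names that are valid CSS identifiers; has_descendant is a made-up helper name:

```python
from bs4 import BeautifulSoup

# The HTML from the question, inlined so the snippet is self-contained.
HTML = """
<div class='a'>
  <div class='aa'>
    <div class='aaa'>
      <div class='aaaa'>
      </div>
    </div>
  </div>
  <div class='ab'>
    <div class='aba'>
      <div class='abaa'>
      </div>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")


def has_descendant(soup, parent_class, child_class):
    # "div.P div.C" matches a div with class C at any depth below a div with class P.
    selector = f"div.{parent_class} div.{child_class}"
    return soup.select_one(selector) is not None


print(has_descendant(soup, "aa", "aaaa"))  # True
print(has_descendant(soup, "aa", "abaa"))  # False
```

This answers "is there some div.aaaa inside some div.aa", not "is this particular element inside that particular element", so it fits best when the classes identify the elements uniquely.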

Upvotes: 1
