Reputation: 14219
I have an html file which has a structure like the following:
<div>
</div>
<div>
</div>
<div>
    <div>
    </div>
    <div>
    </div>
    <div>
    </div>
</div>
<div>
    <div>
        <div>
        </div>
    </div>
</div>
I would like to select all the sibling divs at the top level without selecting the divs nested inside the third and fourth blocks. If I use find_all(), I get every div, including the nested ones.
Upvotes: 4
Views: 8835
Reputation: 1123740
You can select only the direct children of the parent element with a CSS child combinator:
soup.select('body > div')
This returns all div elements directly under the top-level body tag.
You could also find the first div, then grab all matching siblings with Element.find_next_siblings():
first_div = soup.find('div')
all_divs = [first_div] + first_div.find_next_siblings('div')
Or you could use the element.children generator and filter those:
all_divs = (elem for elem in top_level.children if getattr(elem, 'name', None) == 'div')
where top_level is the element containing these div elements directly.
Upvotes: 8