Khodeir

Reputation: 483

BeautifulSoup order of occurrence of Tags

Consider the following situation:

tag1 = soup.find(**data_attrs)
tag2 = soup.find(**delim_attrs)

Is there a way to find out which tag occurred "first" in the page?

Clarifications:

Upvotes: 1

Views: 2684

Answers (1)

Martijn Pieters

Reputation: 1124278

BeautifulSoup tags don't track their order in the page, no. You'd have to loop over all tags again and find your two tags in that list.

Using the standard sample BeautifulSoup tree:

>>> tag1 = soup.find(id='link1')
>>> tag2 = soup.find(id='link2')
>>> tag1, tag2
(<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>)
>>> all_tags = soup.find_all(True)
>>> all_tags.index(tag1)
6
>>> all_tags.index(tag2)
7
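As a self-contained sketch of the same idea, the helper below walks all tags once and compares by identity rather than equality, because `list.index()` uses `==` and two tags with identical markup would then be indistinguishable. The `document_position` name and the trimmed-down sample HTML are my own, chosen for illustration only:

```python
from bs4 import BeautifulSoup

# A trimmed-down version of the sample document (illustrative only).
html = """
<html><body>
<p class="story">
<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>
</p>
</body></html>
"""


def document_position(soup, tag):
    """Return the index of `tag` among all tags of `soup`, in document order."""
    for position, candidate in enumerate(soup.find_all(True)):
        # Compare by identity so a look-alike tag elsewhere is not mistaken for it.
        if candidate is tag:
            return position
    raise ValueError("tag is not part of this soup")


soup = BeautifulSoup(html, "html.parser")
tag1 = soup.find(id="link1")
tag2 = soup.find(id="link2")

pos1 = document_position(soup, tag1)
pos2 = document_position(soup, tag2)
first = tag1 if pos1 < pos2 else tag2
print(first["id"])
```

Each call walks the whole tree, so for many comparisons it is probably cheaper to build the list of tags once and reuse it.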

I'd use tag.find_all() with a function that matches both tag types instead; the resulting list is in document order, so you can see the tags' relative positions directly:

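# Match either tag type; find_all() returns the matches in document order.
tag_match = lambda el: (
    getattr(el, 'name', None) in ('tagname1', 'tagname2') and
    el.attrs.get('attributename') == 'something' and
    # Default to an empty list so tags without a class don't raise TypeError.
    'classname' in el.attrs.get('class', [])
)
tags = soup.find_all(tag_match)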
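Here is a runnable sketch of that approach. The markup, the `data` and `delim` class names, and the `is_data_or_delim` function are all made up for the example; substitute whatever your `data_attrs` and `delim_attrs` actually match:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: data paragraphs separated by delimiter spans.
html = """
<div>
  <span class="delim">---</span>
  <p class="data">first</p>
  <span class="delim">---</span>
  <p class="data">second</p>
</div>
"""


def is_data_or_delim(el):
    """Match either kind of tag so both end up in a single result list."""
    classes = el.get("class", [])  # empty list when the tag has no class
    return (el.name == "p" and "data" in classes) or (
        el.name == "span" and "delim" in classes
    )


soup = BeautifulSoup(html, "html.parser")
ordered = soup.find_all(is_data_or_delim)

# The position in `ordered` reflects the order of occurrence in the document.
kinds = [el["class"][0] for el in ordered]
print(kinds)
```

The first element of `ordered` is whichever of the two tag types occurs first in the page.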

Alternatively, if both tags share the same parent, you can loop over the .next_siblings iterator of one tag and check whether the other (the delimiter, say) comes after it.
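A minimal sketch of the sibling check, assuming both tags are direct children of the same parent. The `follows` helper and the sample markup are illustrative names of my own, not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup

# Hypothetical markup where both tags of interest share one parent.
html = (
    '<div><p class="data">a</p> text '
    '<span class="delim">-</span><p class="data">b</p></div>'
)

soup = BeautifulSoup(html, "html.parser")
data_tag = soup.find("p", class_="data")
delim_tag = soup.find("span", class_="delim")


def follows(earlier, later):
    """Return True if `later` appears among the siblings after `earlier`."""
    # .next_siblings also yields text nodes; the identity check skips them.
    return any(sibling is later for sibling in earlier.next_siblings)


data_first = follows(data_tag, delim_tag)
delim_first = follows(delim_tag, data_tag)
print(data_first, delim_first)
```

If neither call returns True, the two tags are not siblings, and you would need one of the document-wide approaches instead.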

Upvotes: 4
