Marvin

Reputation: 57

Python BeautifulSoup: how to find a tag by attribute value without knowing the corresponding attribute name?

Let's assume we know an attribute value, "xyz", but not the name of the attribute that holds it. That means we would want to match

    <a href="xyz">

but also

    <div class="xyz">

Is it possible to search for such tags?

Upvotes: 0

Views: 233

Answers (2)

MarianD

Reputation: 14141

[tag for tag in soup.find_all(True) 
    if "xyz" in tag.attrs.values() or ["xyz"] in tag.attrs.values()]

The explanation:

  • soup.find_all(True) finds all tags (the value True, used as the name filter, matches every tag).

  • tag.attrs is the dictionary of all attributes of the tag.

  • We are not interested in the attribute names (such as href, class, id), only in their values, so we use tag.attrs.values().
  • Some attributes are multi-valued (e.g. class="x y"), so their value in the attrs dictionary is a list (e.g. ["x", "y"]). That is why we test for both "xyz" and ["xyz"]. Note that the ["xyz"] test matches only when "xyz" is the attribute's sole value.
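
If "xyz" may be just one of several values in a multi-valued attribute (e.g. class="xyz other"), the comparison against ["xyz"] will not match. A minimal sketch of one way to handle that case follows; the helper name has_value and the sample HTML are made up for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup

# Sample markup (an assumption for this sketch).
html = '''<a href="xyz">a</a>
<div class="xyz other">b</div>
<div class="somethingelse">c</div>'''

soup = BeautifulSoup(html, "html.parser")

def has_value(tag, wanted):
    """Hypothetical helper: True if any attribute of tag has the value wanted."""
    for value in tag.attrs.values():
        # Multi-valued attributes (such as class) are stored as lists;
        # wrap plain strings so both cases are handled the same way.
        values = value if isinstance(value, list) else [value]
        if wanted in values:
            return True
    return False

matches = [tag for tag in soup.find_all(True) if has_value(tag, "xyz")]
for tag in matches:
    print(tag)
```

Here both the <a href="xyz"> tag and the <div class="xyz other"> tag should be selected, while the third tag is skipped.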

Upvotes: 0

Andrej Kesely

Reputation: 195448

One solution is to pass a lambda to the find_all function.

Example:

data = '''<a href="xyz">a</a>
<div class="somethingelse">b</div>
<div class="xyz">c</div>'''

from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'html.parser')

for tag in soup.find_all(lambda tag: any('xyz' in tag[a] for a in tag.attrs)):
    print(tag)

Prints:

<a href="xyz">a</a>
<div class="xyz">c</div>
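
One thing to be aware of: for string-valued attributes, 'xyz' in tag[a] is a substring test, so an attribute such as href="/abc/xyz.html" would also match. If only an exact value should count, a stricter filter function can be passed to find_all instead. The sketch below is one possible variant; the function name exact and the extra sample link are assumptions added for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup with one extra link whose href merely contains "xyz".
data = '''<a href="xyz">a</a>
<a href="/abc/xyz.html">b</a>
<div class="xyz">c</div>'''

soup = BeautifulSoup(data, "html.parser")

def exact(tag):
    """Hypothetical filter: match only attributes whose value is exactly 'xyz'."""
    return any(
        # Plain attributes are strings; multi-valued ones (class) are lists.
        v == "xyz" or (isinstance(v, list) and "xyz" in v)
        for v in tag.attrs.values()
    )

# Original lambda: substring test on strings, membership test on lists.
substring_hits = soup.find_all(lambda tag: any("xyz" in tag[a] for a in tag.attrs))
# Stricter variant: exact comparison only.
exact_hits = soup.find_all(exact)

print(len(substring_hits), len(exact_hits))
```

With this sample input the lambda version selects all three tags, while the exact filter leaves out the link with href="/abc/xyz.html".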

Upvotes: 3
