MERose

Reputation: 4421

Get content of tags with empty id in BeautifulSoup

from bs4 import BeautifulSoup

page = """<span id="something">useless</span>
          <span id="">some text</span>
          <span id="different">useless</span>"""
soup = BeautifulSoup(page, "html.parser")  # name the parser explicitly to avoid bs4's parser-guessing warning

How can I get only "some text", the content of the span whose id is empty? Using soup.find_all('span', {'id': ""}) finds everything.

Upvotes: 1

Views: 829

Answers (1)

Martijn Pieters

Reputation: 1124558

You have two options:

  1. Use a custom filter: pass in a function, which is called with each element and returns True or False:

    soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
    
  2. Use a CSS selector, with an exact attribute match:

    soup.select('span[id=""]')
    

Demo:

>>> from bs4 import BeautifulSoup
>>> page = """<span id="something">useless</span>
...           <span id="">some text</span>
...           <span id="different">useless</span>"""
>>> soup = BeautifulSoup(page, "html.parser")
>>> soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
[<span id="">some text</span>]
>>> soup.select('span[id=""]')
[<span id="">some text</span>]
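Both calls return Tag objects rather than strings. To get the text itself, as the question asks, one possible sketch follows. It assumes the "html.parser" backend. The helper name texts_with_empty_id and the extra span without any id are illustrative additions, not part of the original question or answer.

```python
from bs4 import BeautifulSoup

# The last span (no id attribute at all) is added here for illustration only.
page = """<span id="something">useless</span>
          <span id="">some text</span>
          <span id="different">useless</span>
          <span>no id at all</span>"""

# Assumption: the built-in "html.parser" backend; lxml or html5lib should behave alike here.
soup = BeautifulSoup(page, "html.parser")


def texts_with_empty_id(soup):
    """Hypothetical helper: text of every <span> whose id is present but empty."""
    return [tag.get_text() for tag in soup.select('span[id=""]')]


# Option 2 from the answer (CSS selector), wrapped in the helper.
via_selector = texts_with_empty_id(soup)

# Option 1 from the answer (custom filter function).
# attrs.get('id') is None for a span without an id, so that span is not matched.
via_filter = [
    tag.get_text()
    for tag in soup.find_all(lambda e: e.name == "span" and e.attrs.get("id") == "")
]

print(via_selector)
print(via_filter)
```

Both approaches match only a span whose id attribute exists and is the empty string, so a span with no id at all is skipped.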

Upvotes: 2
