Reputation: 4421
from bs4 import BeautifulSoup
page = """<span id="something">useless</span>
<span id="">some text</span>
<span id="different">useless</span>"""
soup = BeautifulSoup(page, "html.parser")
How can I get only the span with the empty id, the one containing "some text"?
Using soup.find_all('span', {'id': ""}) finds every span.
Upvotes: 1
Views: 829
Reputation: 1124558
You have two options:
Use a custom filter: pass in a function, which is called for each element and should return True or False:
soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
Use a CSS selector, with an exact attribute match:
soup.select('span[id=""]')
Demo:
>>> from bs4 import BeautifulSoup
>>> page = """<span id="something">useless</span>
... <span id="">some text</span>
... <span id="different">useless</span>"""
>>> soup = BeautifulSoup(page, "html.parser")
>>> soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
[<span id="">some text</span>]
>>> soup.select('span[id=""]')
[<span id="">some text</span>]
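Both calls return Tag objects rather than strings. If the goal is the text itself, a minimal sketch along these lines may help. It assumes the built-in "html.parser" backend, and the names texts and first are only illustrative:

```python
from bs4 import BeautifulSoup

page = """<span id="something">useless</span>
<span id="">some text</span>
<span id="different">useless</span>"""

# Name the parser explicitly so the result does not depend on
# which optional parsers happen to be installed.
soup = BeautifulSoup(page, "html.parser")

# All spans whose id attribute is present but empty, reduced to their text.
texts = [span.get_text() for span in soup.select('span[id=""]')]
print(texts)

# select_one() returns the first match, or None if nothing matches,
# so guard against None before reading the text.
match = soup.select_one('span[id=""]')
first = match.get_text() if match is not None else None
print(first)
```

select_one() is convenient when only one such span is expected; the None check keeps the code from raising AttributeError on pages where the span is absent.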
Upvotes: 2