Reputation: 2324
Primary question
I know how to use find_all() to retrieve elements that have an attribute with a specific value, but I can't find any examples of how to retrieve elements whose attribute has one of several acceptable values. In my case I'm working with DITA XML and I want to retrieve topicref elements where the scope attribute is either missing or set to "local" or "peer".
I wrote a custom function that works, but there must be a smarter way to do this with the functions that are already present in BeautifulSoup. Here is my code:
from bs4 import BeautifulSoup

# The with block closes the file on exit, so no explicit close() is needed.
with open("./dita_source/envvariables.ditamap", "r") as file:
    doc = BeautifulSoup(file)

def isLocal(element):
    # Match topicref elements with no scope, or with scope "local" or "peer".
    if element.name == "topicref":
        if not element.has_attr("scope") or element["scope"] == "local" or element["scope"] == "peer":
            return True
    return False

topicrefs = doc.find_all(isLocal)
Secondary question
Is there a way to use find_all() with both its standard filters and a custom function? I tried doc.find_all("topicref", isLocal), but that didn't work. I had to add the extra if (element.name == "topicref"): check to my custom function instead.
Upvotes: 2
Views: 1301
Reputation: 368904
Specify topicref as the first argument (name) and pass a function as the scope keyword argument. The function receives the attribute value, or None if the element has no scope attribute:
def isLocal(scope):
    return scope in (None, "local", "peer")
topicrefs = soup.find_all('topicref', scope=isLocal)
Or use a lambda:
topicrefs = soup.find_all(
    'topicref',
    scope=lambda scope: scope in (None, "local", "peer")
)
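For anyone who wants to try this without a real ditamap, here is a minimal self-contained sketch. The sample markup and the href values are made up for illustration, and it assumes the built-in html.parser is good enough for the test; for real DITA files an XML-aware parser may be the better choice.

```python
from bs4 import BeautifulSoup

# Hypothetical sample map; the href values are invented for this example.
SAMPLE = """
<map>
  <topicref href="a.dita"></topicref>
  <topicref href="b.dita" scope="local"></topicref>
  <topicref href="c.dita" scope="peer"></topicref>
  <topicref href="d.dita" scope="external"></topicref>
</map>
"""

# Assumption: html.parser (bundled with Python) is used only to keep the demo dependency-free.
soup = BeautifulSoup(SAMPLE, "html.parser")

def is_local(scope):
    # scope is the attribute value, or None when the attribute is missing.
    return scope in (None, "local", "peer")

topicrefs = soup.find_all("topicref", scope=is_local)
hrefs = [t["href"] for t in topicrefs]
print(hrefs)
```

The name filter and the attribute function are applied together, so the function never needs to check element.name itself.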
Upvotes: 0
Reputation: 59118
You can supply a list as the value of an attribute parameter to find_all(), and it will return elements where the attribute matches any of the items in that list:
>>> soup.find_all(scope=["row", "col"])
[
<th scope="col">US $</th>,
<th scope="col">Euro</th>,
<th scope="row">Mon – Fri</th>,
<th scope="row">Sat – Sun</th>,
]
... but there's no way to specify "attribute doesn't exist at all" in that list (neither None nor an empty string works). So for that, you do need a function.
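To make the difference concrete, here is a rough side-by-side sketch applied to the topicref case from the question. The sample markup and href values are invented, and html.parser is assumed only so the snippet runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical sample map; the href values are invented for this example.
SAMPLE = """
<map>
  <topicref href="a.dita"></topicref>
  <topicref href="b.dita" scope="local"></topicref>
  <topicref href="c.dita" scope="peer"></topicref>
  <topicref href="d.dita" scope="external"></topicref>
</map>
"""

soup = BeautifulSoup(SAMPLE, "html.parser")

# List filter: matches only elements whose scope equals one of the listed values.
by_list = soup.find_all("topicref", scope=["local", "peer"])

# Function filter: can also accept elements that have no scope attribute (value None).
by_func = soup.find_all("topicref", scope=lambda s: s in (None, "local", "peer"))

list_hrefs = [t["href"] for t in by_list]
func_hrefs = [t["href"] for t in by_func]
print(list_hrefs)
print(func_hrefs)
```

The list version skips a.dita because that element has no scope attribute, while the function version picks it up.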
Upvotes: 1