Reputation: 145
I'm hoping to use the findParent() method in BeautifulSoup to find a particular tag's parent that has an id attribute. For example, consider the following sample XML:
<monograph>
<section id="1234">
<head>Test Heading</head>
<p>Here's a paragraph with some text in it.</p>
</section>
</monograph>
Assuming I've matched something in the paragraph, I'd like to use findParent to indiscriminately find the first parent up the tree with an id attribute. Something like:
for hit in monograph(text="paragraph with"):
    containername = hit.findParent(re.compile([A-Za-z]+), {id}).name
However, the preceding code doesn't return any hits.
Upvotes: 1
Views: 2219
Reputation: 1124558
Use id=True to match an element that has an id attribute, regardless of the value of the attribute:
hit.find_parent(id=True)
Conversely, using id=False would find the first parent element without an id attribute.
Note that you should really use the lower_case_with_underscores style for BeautifulSoup methods; findParent is the BeautifulSoup 3 spelling that has been deprecated.
Demo:
>>> from bs4 import BeautifulSoup
>>> sample = '''\
... <monograph>
... <section id="1234">
... <head>Test Heading</head>
... <p>Here's a paragraph with some text in it.</p>
... </section>
... </monograph>
... '''
>>> soup = BeautifulSoup(sample, 'xml')
>>> str(soup.p)
"<p>Here's a paragraph with some text in it.</p>"
>>> print(soup.p.find_parent(id=True).prettify())
<section id="1234">
 <head>
  Test Heading
 </head>
 <p>
  Here's a paragraph with some text in it.
 </p>
</section>
>>> print(soup.p.find_parent(id=False).prettify())
<monograph>
 <section id="1234">
  <head>
   Test Heading
  </head>
  <p>
   Here's a paragraph with some text in it.
  </p>
 </section>
</monograph>
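To tie this back to the loop in the question, here is a minimal sketch that searches for text and then walks up to the nearest parent carrying an id. Two things in it are assumptions rather than part of the answer above: the helper name containers_with_id is made up for illustration, and the built-in 'html.parser' is used instead of 'xml' only so the snippet runs without lxml installed. It also passes a compiled regular expression as the string filter, because a plain string there is compared against the whole text node rather than a substring, which may be another reason the original loop produced no hits.

```python
import re

from bs4 import BeautifulSoup

sample = """\
<monograph>
<section id="1234">
<head>Test Heading</head>
<p>Here's a paragraph with some text in it.</p>
</section>
</monograph>
"""

# Assumption: 'html.parser' stands in for the 'xml' parser used in the demo
# above, so that lxml is not required to run this sketch.
soup = BeautifulSoup(sample, "html.parser")


def containers_with_id(soup, pattern):
    """Hypothetical helper: for every text node matching `pattern`,
    return (tag name, id value) of the nearest ancestor that has an id."""
    found = []
    # A compiled regex is matched with search(), so a substring is enough.
    for hit in soup.find_all(string=re.compile(pattern)):
        parent = hit.find_parent(id=True)  # nearest ancestor with any id value
        if parent is not None:
            found.append((parent.name, parent["id"]))
    return found


matches = containers_with_id(soup, r"paragraph with")
print(matches)
```

The None check matters because find_parent returns None when no ancestor satisfies the filter, in which case accessing .name directly would raise an AttributeError.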
Upvotes: 3