I would like to use BeautifulSoup's find_all function to retrieve every <li> tag together with its parent.
<div name="div1">
<li>Test 1</li>
<li>Test 2</li>
</div>
If I try this code:
tags = soup.find_all("li")
print(tags[0].parent)
This will print:
<div name="div1">
<li>Test 1</li>
<li>Test 2</li>
</div>
This is because the parent contains both <li> tags.
What I expect is:
<div name="div1">
<li>Test 1</li>
</div>
How can I get this result?
Upvotes: 0
Views: 2397
You can get what you seem to want by cloning the parent for each list element and wrapping the element in that clone:
from bs4 import BeautifulSoup

txt = """<div name="div1">
<li>Test 1</li>
<li>Test 2</li>
</div>"""

def clone(soup, tag):
    # Create an empty tag with the same name and attributes as `tag`.
    newtag = soup.new_tag(tag.name)
    for attr in tag.attrs:
        newtag[attr] = tag[attr]
    return newtag

soup = BeautifulSoup(txt, "html.parser")
tags = soup.find_all("li")
for tag in tags:
    # wrap() moves the <li> into the clone and returns the wrapper.
    print(tag.wrap(clone(soup, tag.parent)))
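Note that wrap() modifies the parsed tree: each <li> ends up inside a new <div> nested in the original one. If the tree should stay untouched, a possible variant is to copy each <li> into a detached clone of its parent instead. The sketch below assumes a bs4 version that supports copy.copy() on tags (4.4.0 or later); the helper name with_parent_shell is made up for illustration.

```python
import copy

from bs4 import BeautifulSoup

html = """<div name="div1">
<li>Test 1</li>
<li>Test 2</li>
</div>"""


def with_parent_shell(soup, tag):
    """Return a detached copy of tag.parent that contains only a copy of tag.

    Hypothetical helper, not part of the bs4 API.
    """
    parent = tag.parent
    # Empty tag with the same name and attributes as the parent.
    shell = soup.new_tag(parent.name)
    for attr, value in parent.attrs.items():
        shell[attr] = value
    # copy.copy() yields a tag that is not attached to the tree,
    # so appending it leaves the original document unchanged.
    shell.append(copy.copy(tag))
    return shell


soup = BeautifulSoup(html, "html.parser")
results = [str(with_parent_shell(soup, li)) for li in soup.find_all("li")]
for item in results:
    print(item)
```

Each entry in results is the parent's opening tag, one <li>, and the closing tag, while soup still holds the original <div> with both <li> children.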
Upvotes: 2