Reputation: 23
I have the following data:
<li>
<div>Content1</div>
</li>
<li>
<div>Content2</div>
<div>Content3</div>
<div>Content4</div>
</li>
<li>
<div>Content5</div>
<div>Content6</div>
</li>
I want to put the content of each li element into a separate list with BeautifulSoup. This should be the result:
List1 = ['Content1']
List2 = ['Content2', 'Content3', 'Content4']
List3 = ['Content5', 'Content6']
A line like div = [a.get_text(strip=True) for a in soup.select('li>div')]
puts all of the content into one list. I am struggling to create a separate list for each li element and fill it with the right content. Can someone help?
Upvotes: 2
Views: 52
Reputation: 8219
You just need to create a new list for each li, like this:
divs = [[div.get_text(strip=True) for div in li.find_all("div")] for li in soup.select('li')]
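If the one-liner is hard to read, the same idea can be written as an explicit loop. This is only a sketch: the html string is the sample data from the question wrapped in a ul (an assumption), and the variable name texts is made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample data from the question, wrapped in a <ul> (assumed) so it is valid markup.
html = """<ul>
<li><div>Content1</div></li>
<li><div>Content2</div><div>Content3</div><div>Content4</div></li>
<li><div>Content5</div><div>Content6</div></li>
</ul>"""

soup = BeautifulSoup(html, "html.parser")

divs = []
for li in soup.select("li"):
    # One inner list per li, holding the text of every div found inside it.
    texts = [div.get_text(strip=True) for div in li.find_all("div")]
    divs.append(texts)

print(divs)
# [['Content1'], ['Content2', 'Content3', 'Content4'], ['Content5', 'Content6']]
```

divs[0], divs[1] and divs[2] then play the role of List1, List2 and List3, so there is no need for separately named variables.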
Upvotes: 1
Reputation: 82785
You can use a nested list comprehension.
Example:
from bs4 import BeautifulSoup
html = """<ul>
<li>
<div>Content1</div>
</li>
<li>
<div>Content2</div>
<div>Content3</div>
<div>Content4</div>
</li>
<li>
<div>Content5</div>
<div>Content6</div>
</li>
</ul>"""
soup = BeautifulSoup(html, "html.parser")
print([[j.get_text(strip=True) for j in i.find_all("div")] for i in soup.find_all("li")])
Output:
[['Content1'], ['Content2', 'Content3', 'Content4'], ['Content5', 'Content6']]
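One caveat: the question's selector li>div only matches divs that are direct children of an li, while find_all("div") also returns divs nested deeper. For the sample data this makes no difference, but if your real page has nested divs, passing recursive=False may be closer to what you want. A rough sketch, where nested_html is made-up data for illustration and not from the question:

```python
from bs4 import BeautifulSoup

# Hypothetical markup with a div nested inside another div.
nested_html = """<ul>
<li><div>Outer<div>Inner</div></div></li>
<li><div>Content2</div><div>Content3</div></li>
</ul>"""

soup = BeautifulSoup(nested_html, "html.parser")

# find_all("div") searches all descendants, so the inner div is picked up as well.
all_divs = [[d.get_text(strip=True) for d in li.find_all("div")]
            for li in soup.find_all("li")]

# recursive=False restricts the search to direct children of each li.
direct_divs = [[d.get_text(strip=True) for d in li.find_all("div", recursive=False)]
               for li in soup.find_all("li")]

print(len(all_divs[0]), len(direct_divs[0]))
```

Here the first li yields two entries with the default search and only one with recursive=False.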
Upvotes: 2