Reputation: 138
I'm trying to scrape a website. Here is the relevant HTML:
<h2>Information</h2>
<div>
<span class="dark_text">Type:</span>
<a href="https://myanimelist.net/topanime.php?type=tv">TV</a>
</div>
<div class="spaceit">
<span class="dark_text">Episodes:</span>
12
</div>
<div class="spaceit">
<span class="dark_text">Duration:</span>
25 min. per ep.
</div>
I'm trying to get each label together with its value, for example "Episodes:" with "12" and "Duration:" with "25 min. per ep.", plus many more pairs like these in the full HTML.
I want these values as strings.
My Python code is:
page_soup = soup(page_html, "html.parser")
spaceit = page_soup.findAll("div",{"class": "spaceit"})
I can't figure out how to get the text of the span and the text of the div that contains it.
Upvotes: 0
Views: 1281
Reputation: 12499
Use select() with a CSS selector, then loop over the matched elements and call get_text() on each one.
Example:
from bs4 import BeautifulSoup
html = '<h2>Information</h2>' \
'<div>' \
'<span class="dark_text">Type:</span>' \
'<a href="https://myanimelist.net/topanime.php?type=tv">TV</a>' \
'</div>' \
'<div class="spaceit">' \
'<span class="dark_text">Episodes:</span>12</div>' \
'<div class="spaceit">' \
'<span class="dark_text">Duration:</span>25 min. per ep.</div> '
page_soup = BeautifulSoup(html, features="lxml")
elements = page_soup.select('div.spaceit')
for element in elements:
    print(element.get_text())
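The loop above prints each label and value as one string (for example "Episodes:12"). If the label and value are needed separately, one possible approach is sketched below. It assumes every div.spaceit holds a single span.dark_text label followed by plain text. The dictionary name info, the shortened HTML sample, and the use of the built-in "html.parser" instead of lxml are choices made for this illustration, not part of the original code.

```python
from bs4 import BeautifulSoup

# Shortened sample of the HTML from the question (assumed structure).
html = (
    '<div class="spaceit">'
    '<span class="dark_text">Episodes:</span>\n 12\n</div>'
    '<div class="spaceit">'
    '<span class="dark_text">Duration:</span>\n 25 min. per ep.\n</div>'
)

page_soup = BeautifulSoup(html, "html.parser")

info = {}  # hypothetical name: maps label -> value, both as str
for div in page_soup.select("div.spaceit"):
    label_tag = div.find("span", class_="dark_text")
    if label_tag is None:
        # Skip blocks that do not follow the assumed label/value layout.
        continue
    # Label text without surrounding whitespace or the trailing colon.
    label = label_tag.get_text(strip=True).rstrip(":")
    # Text nodes that are direct children of the div, i.e. everything
    # outside the span; this leaves the parsed tree unmodified.
    value = "".join(div.find_all(string=True, recursive=False)).strip()
    info[label] = value

print(info)
```

Values that are wrapped in their own tag, such as the link in the "Type:" block, would not be picked up by the direct-text lookup and would need div.get_text() or a lookup of the a tag instead.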
Upvotes: 2