Reputation: 119
I am scraping HTML with Beautiful Soup and cannot figure out how to go through the whole document to find the rest of the elements I am looking for.
The code below finds and prints the word "Totem" from the HTML sample further down. I want to loop through the HTML and also find the remaining titles, "One, Two, Three" and "Rent".
Code that works to find the first tag and its text:
print(html.find('td', {'class': 'play'}).next_sibling.next_sibling.text)
Here is the sample HTML to scrape:
<tr>
<td class="play">
<a href="#" class="audio-preview"><span class="play-button as_audio-button"></span></a>
<audio class="as_audio_preview" src="https://shopify.audiosalad.com/" >foo</audio>
</td>
**<td>Totem</td>**
<!--<td>$0.99</td>-->
<td class="buy">
<tr>
<td class="play">
<a href="#" class="audio-preview"><span class="play-button as_audio-button"></span></a>
<audio class="as_audio_preview" src="https://shopify.audiosalad.com/" >foo</audio>
</td>
**<td>One, Two, Three</td>**
<!--<td>$0.99</td>-->
<td class="buy">
<tr>
<td class="play">
<a href="#" class="audio-preview"><span class="play-button as_audio-button"></span></a>
<audio class="as_audio_preview" src="https://shopify.audiosalad.com/" >foo</audio>
</td>
**<td>Rent</td>**
<!--<td>$0.99</td>-->
<td class="buy">
Upvotes: 3
Views: 3604
Reputation: 22440
Try this. It should fetch you the content you are after:
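from bs4 import BeautifulSoup

# content is assumed to hold the page's HTML source as a string
soup = BeautifulSoup(content, "lxml")
for items in soup.find_all(class_="play"):
    # find_next_sibling() skips whitespace text nodes and returns the next tag
    data = items.find_next_sibling().text
    print(data)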
Or you can use find_next, which returns the first td that follows each match in document order:
for items in soup.find_all(class_="play"):
    data = items.find_next("td").text
    print(data)
Output:
Totem
One, Two, Three
Rent
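For anyone who wants to run this without the original page, here is a minimal self-contained sketch. It assumes a trimmed, well-formed version of the sample rows (the closing tags and the wrapping table are my additions, not from the question), uses the built-in "html.parser" instead of lxml, and wraps the loop in a hypothetical helper named extract_titles:

```python
from bs4 import BeautifulSoup

# Trimmed sample rows. Assumption: the real page closes its <tr>/<td> tags.
SAMPLE_HTML = """
<table>
<tr>
<td class="play"><a href="#" class="audio-preview"></a></td>
<td>Totem</td>
<!--<td>$0.99</td>-->
<td class="buy"></td>
</tr>
<tr>
<td class="play"><a href="#" class="audio-preview"></a></td>
<td>One, Two, Three</td>
<td class="buy"></td>
</tr>
<tr>
<td class="play"><a href="#" class="audio-preview"></a></td>
<td>Rent</td>
<td class="buy"></td>
</tr>
</table>
"""


def extract_titles(markup):
    """Return the text of the td that follows each td.play cell."""
    soup = BeautifulSoup(markup, "html.parser")
    titles = []
    for cell in soup.find_all("td", class_="play"):
        # Passing "td" restricts the search to td siblings only.
        title_cell = cell.find_next_sibling("td")
        if title_cell is not None:  # skip rows without a following td
            titles.append(title_cell.get_text(strip=True))
    return titles


titles = extract_titles(SAMPLE_HTML)
print(titles)
```

The None check is there because find_next_sibling returns None when a row has no following td, and calling .text on None would raise an AttributeError.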
Upvotes: 1
Reputation: 703
You have to iterate over all the matching elements, like this:
for td in html.find_all('td', {'class': 'play'}):
    print(td.next_sibling.next_sibling.text)
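Note that the double next_sibling is needed because the first sibling after the td is usually the whitespace between the tags, not the next td. A small sketch of that, assuming two made-up one-row snippets (spaced and compact are hypothetical, as are the helper names) and the built-in "html.parser":

```python
from bs4 import BeautifulSoup, NavigableString

# Same row twice: once with newlines between cells, once without (hypothetical markup).
spaced = '<table><tr><td class="play"></td>\n<td>Totem</td>\n<td class="buy"></td></tr></table>'
compact = '<table><tr><td class="play"></td><td>Totem</td><td class="buy"></td></tr></table>'


def first_hop(markup):
    """Return whatever node comes directly after the td.play cell."""
    td = BeautifulSoup(markup, "html.parser").find("td", class_="play")
    return td.next_sibling


def robust_title(markup):
    """Return the next td's text, ignoring any whitespace nodes in between."""
    td = BeautifulSoup(markup, "html.parser").find("td", class_="play")
    return td.find_next_sibling("td").get_text(strip=True)


# With whitespace in the source, the first hop is a text node, not a tag.
print(isinstance(first_hop(spaced), NavigableString))
# Without whitespace, the first hop is already the title cell.
print(isinstance(first_hop(compact), NavigableString))
# find_next_sibling gives the same answer for both layouts.
print(robust_title(spaced), robust_title(compact))
```

So .next_sibling.next_sibling works for the markup in the question, but it would land on the wrong cell if the page were served without whitespace between the tags. find_next_sibling("td") does not depend on that.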
Upvotes: 0