user11703450


How to extract text with Beautiful Soup when several elements share the same class?

I'm working on a personal project where I scrape data from a website. I'm trying to use Beautiful Soup for this, but I've run into elements that share the same class and differ only in an attribute of a child tag. For example:

<div class="pi--secondary-price">
<span class="pi--price">$11.99 /<abbr title="Kilogram">kg</abbr></span>
<span class="pi--price">$5.44 /<abbr title="Pound">lb.</abbr></span>
</div>

How do I get just $11.99 /kg? Right now I'm getting both prices: $11.99 /kg $5.44 /lb.

I've tried x.select('.pi--secondary-price'), but that selects the whole div, so its text contains both prices. How do I get only the first price ($11.99 /kg)?

Upvotes: 0

Views: 119

Answers (1)

trotta

Reputation: 1226

You could first find the <abbr> tag by its title attribute and then go up to its parent tag. Like this:

from bs4 import BeautifulSoup

html = '''
<div class="pi--secondary-price">
<span class="pi--price">$11.99 /<abbr title="Kilogram">kg</abbr></span>
<span class="pi--price">$5.44 /<abbr title="Pound">lb.</abbr></span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

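# Find the first tag whose title attribute is "Kilogram" (the <abbr>).
kg = soup.find(title="Kilogram")
# Its parent is the enclosing <span class="pi--price">; .text joins all text inside it.
print(kg.parent.text)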

This prints the desired output, $11.99 /kg. For more information, see the Beautiful Soup documentation.
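
If you would rather stay with CSS selectors, as in the question, here is a minimal sketch of two alternatives. It assumes a reasonably recent Beautiful Soup (4.7 or later, where select() is backed by the soupsieve package and supports :has()); the variable names first_price and kg_price are just illustrative.

```python
from bs4 import BeautifulSoup

html = '''
<div class="pi--secondary-price">
<span class="pi--price">$11.99 /<abbr title="Kilogram">kg</abbr></span>
<span class="pi--price">$5.44 /<abbr title="Pound">lb.</abbr></span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

# Option 1: select_one() returns only the first matching element,
# so this picks the first price span inside the div.
first_price = soup.select_one('.pi--secondary-price .pi--price')

# Option 2: pick the span by the title of the <abbr> it contains,
# which does not depend on the order of the spans.
kg_price = soup.select_one('span.pi--price:has(abbr[title="Kilogram"])')

for tag in (first_price, kg_price):
    # select_one() returns None when nothing matches, so guard against that.
    if tag is not None:
        print(tag.get_text(strip=True))
```

Option 1 relies on the kilogram price always coming first; Option 2 is probably the safer choice if the site can reorder or omit prices.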

Upvotes: 2
