Reputation: 21
My problem may be trivial because I am new to web scraping. Please see the following HTML code:
<div class="price-container clearfix">
<span class="sale-flag-percent">-40%</span>
<span class="price-box ri">
<span class="price "><span data-currency-iso="PKR">Rs.</span>
<span dir="ltr" data-price="5999"> 5,999</span> </span>
<span class="price -old "><span data-currency-iso="PKR">Rs.</span>
<span dir="ltr" data-price="9999"> 9,999</span> </span> </span>
</div>
So far I am able to access the outermost div.price-container.clearfix, but I am not able to reach the inner spans and get the price of the product. Is there any way to access the inner spans and get the prices?
Upvotes: 0
Views: 1624
Reputation: 15376
Given the HTML in your question, it should be quite easy to select the span tags using CSS selectors.
For example:
from bs4 import BeautifulSoup
html = '''
<div class="price-container clearfix">
<span class="sale-flag-percent">-40%</span>
<span class="price-box ri">
<span class="price "><span data-currency-iso="PKR">Rs.</span>
<span dir="ltr" data-price="5999"> 5,999</span> </span>
<span class="price -old "><span data-currency-iso="PKR">Rs.</span>
<span dir="ltr" data-price="9999"> 9,999</span> </span> </span>
</div>
'''
soup = BeautifulSoup(html, 'html.parser')
tags = soup.select('div.price-container.clearfix span[data-price]')
prices = [i.text.strip() for i in tags]
print(prices)
This expression:
div.price-container.clearfix span[data-price]
selects all 'span' tags that have a 'data-price' attribute, provided they are descendants of a 'div' tag that has both the 'price-container' and 'clearfix' classes.
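If numbers are more useful than the display text, the data-price attribute can be read directly from the same tags. This is only a minimal sketch, assuming every matched span carries an integer data-price value as in the question's HTML; the name numeric_prices is illustrative and not part of the original code.

```python
from bs4 import BeautifulSoup

html = '''
<div class="price-container clearfix">
<span class="sale-flag-percent">-40%</span>
<span class="price-box ri">
<span class="price "><span data-currency-iso="PKR">Rs.</span>
<span dir="ltr" data-price="5999"> 5,999</span> </span>
<span class="price -old "><span data-currency-iso="PKR">Rs.</span>
<span dir="ltr" data-price="9999"> 9,999</span> </span> </span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')
tags = soup.select('div.price-container.clearfix span[data-price]')

# Read the attribute instead of the text, so no comma stripping is needed.
# Assumption: data-price always holds an integer string.
numeric_prices = [int(tag['data-price']) for tag in tags]
print(numeric_prices)  # [5999, 9999]
```

The values come back in document order, so the current price precedes the old one here.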
The prices list holds the text of both span tags. If you want a different selector for each tag, you could use the span.price and span.price.-old parent tags.
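# Match on classes: BeautifulSoup splits the class attribute on whitespace,
# so an exact match such as span[class="price "] may not find anything.
new_prices = soup.select('span.price:not(.-old) span[data-price]')
old_prices = soup.select('span.price.-old span[data-price]')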
This will result in two lists of tags, one for each price category.
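If the page lists several products, two flat lists can be hard to line up. One possible approach is to keep each container's prices together. The sketch below assumes a recent BeautifulSoup (4.7 or later, where select supports :not()) and that each container holds at most one current and one old price; extract_prices is a hypothetical helper name, not something from the original answer.

```python
from bs4 import BeautifulSoup

html = '''
<div class="price-container clearfix">
<span class="sale-flag-percent">-40%</span>
<span class="price-box ri">
<span class="price "><span data-currency-iso="PKR">Rs.</span>
<span dir="ltr" data-price="5999"> 5,999</span> </span>
<span class="price -old "><span data-currency-iso="PKR">Rs.</span>
<span dir="ltr" data-price="9999"> 9,999</span> </span> </span>
</div>
'''


def extract_prices(container):
    """Return the current and old price of one price-container div.

    Hypothetical helper: a missing price is reported as None.
    """
    # span.price without the -old class holds the current price.
    current = container.select_one('span.price:not(.-old) span[data-price]')
    # span.price with the -old class holds the price before the discount.
    old = container.select_one('span.price.-old span[data-price]')
    return {
        'current': int(current['data-price']) if current else None,
        'old': int(old['data-price']) if old else None,
    }


soup = BeautifulSoup(html, 'html.parser')
products = [extract_prices(div)
            for div in soup.select('div.price-container.clearfix')]
print(products)  # [{'current': 5999, 'old': 9999}]
```

Selecting relative to each container means a product without an old price still produces an entry instead of shifting the lists out of step.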
Upvotes: 1