Vishnu

Reputation: 324

Scrape data from a website by CSS style using Beautiful Soup

I have a website from which I want to scrape coupon codes, using Python and Beautiful Soup. I have two issues.

1) Some coupons are displayed in a span tag that has no class or id, so I am not able to get the coupons from these tags. I need to get the code from the strong tag (AXISCB50):

<h6><span style="color: #808000">25% Cashback on Recharges :</span></h6>
<ul>
<li>Get 25% Cashback upto Rs.25 per transaction.</li>
<li>Coupon Code : <span style="color: #ff0000"><strong>AXISCB50</strong></span></li>
<li>Maximum 2 transaction per Debit/Credit card.</li>
</ul>

Is it possible to scrape by specifying the style, something like style="color: #808000"?

2) Some coupons are loaded via AJAX and displayed only once we click a button. How can I scrape this data, which is displayed via script?

I am new to web scraping. Any help is appreciated; thanks in advance.

Upvotes: 1

Views: 309

Answers (1)

alecxe

Reputation: 473873

To get the coupon code, I would not rely on the color style attribute. Instead, find the text node that starts with "Coupon Code" and take the element that follows it:

soup.find(text=lambda x: x and x.startswith('Coupon Code')).next_element.text

Demo:

>>> from bs4 import BeautifulSoup
>>> 
>>> data = """
... <h6><span style="color: #808000">25% Cashback on Recharges :</span></h6>
... <ul>
... <li>Get 25% Cashback upto Rs.25 per transaction.</li>
... <li>Coupon Code : <span style="color: #ff0000"><strong>AXISCB50</strong></span></li>
... <li>Maximum 2 transaction per Debit/Credit card.</li>
... </ul>
... """
>>> 
>>> soup = BeautifulSoup(data, "html.parser")
>>> 
>>> print(soup.find(text=lambda x: x and x.startswith('Coupon Code')).next_element.text)
AXISCB50
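For completeness, matching on the style attribute is possible too, although it breaks as soon as the site changes its colors. The following is only a sketch, assuming a Beautiful Soup version with CSS attribute selector support in select() and assuming the coupon span always carries the #ff0000 color from the sample markup:

```python
from bs4 import BeautifulSoup

html = """
<h6><span style="color: #808000">25% Cashback on Recharges :</span></h6>
<ul>
<li>Get 25% Cashback upto Rs.25 per transaction.</li>
<li>Coupon Code : <span style="color: #ff0000"><strong>AXISCB50</strong></span></li>
<li>Maximum 2 transaction per Debit/Credit card.</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Assumption: coupon spans are the ones whose inline style contains "ff0000".
# [style*="..."] is a substring match on the attribute value.
codes = [tag.get_text(strip=True)
         for tag in soup.select('span[style*="ff0000"] strong')]
print(codes)
```

The substring match is deliberate: an exact match on the whole style value would fail on small differences such as spacing or a trailing semicolon.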

Some coupons are displayed via ajax which is displayed only once we click the button. How will I scrape this data, which is displayed via script?

You need to find out which requests are sent when you click the button: open the browser developer tools and watch the Network tab. Then simulate those request(s) in your Python code; requests is usually a good choice.
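As a rough sketch of what simulating the request could look like: the URL, the form field name and the header below are all hypothetical placeholders, so replace them with whatever the Network tab actually shows for the site. It also assumes the endpoint returns an HTML fragment with the code inside a strong tag, as in the sample markup.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical endpoint: copy the real one from the Network tab.
AJAX_URL = "https://example.com/ajax/coupon"


def extract_codes(html):
    """Return the text of every <strong> tag in an HTML fragment."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all("strong")]


def fetch_codes(coupon_id, session=None):
    """Replay the request the button click sends and parse the response."""
    session = session or requests.Session()
    response = session.post(
        AJAX_URL,
        data={"id": coupon_id},  # hypothetical form field
        headers={"X-Requested-With": "XMLHttpRequest"},  # often sent by AJAX calls
        timeout=10,
    )
    response.raise_for_status()
    return extract_codes(response.text)
```

If the endpoint returns JSON instead of HTML, response.json() would replace the Beautiful Soup step.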

Upvotes: 2
