Unable to scrape promotion price details from website

Question

I am trying to scrape promotion price details from: https://www.fairprice.com.sg/product/magnolia-fresh-milk-1lt-13022014

Specifically, I'm trying to scrape the "Any 2 for $5.45, Save $1.55" text. When I run the code below, it returns an empty list.

The same code works for other products on the same website, though (e.g. https://www.fairprice.com.sg/product/kirei-kirei-hand-soap-rfl-moisturing-peach-200ml-12089153 ).

I'm unsure what is causing the difference in behavior. I'd appreciate any advice on this issue.

import re
import sys
import time

import requests
from bs4 import BeautifulSoup

url = 'https://www.fairprice.com.sg/product/magnolia-fresh-milk-1lt-13022014'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36 Edg/97.0.1072.69'}

try:
    page = requests.get(url, headers=headers)
except Exception as e:
    error_type, error_obj, error_info = sys.exc_info()
    print('ERROR FOR LINK:', url)
    print(error_type, 'Line:', error_info.tb_lineno)
    sys.exit(1)  # stop here: `page` is undefined if the request failed

time.sleep(2)
soup = BeautifulSoup(page.text, 'html.parser')
# Look for the promotion <span> whose text contains "Any"
linkpromo = soup.find_all('span', attrs={'class': 'sc-1bsd7ul-1 eSToaS'}, string=re.compile(r'Any'))

print(linkpromo)

msenior_ · Accepted Answer

The promotion data is loaded dynamically, so it is not in the HTML that requests returns. You can use the requests-html package to render the page before querying it. See the sample below:

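from requests_html import HTMLSession

url = 'https://www.fairprice.com.sg/product/magnolia-fresh-milk-1lt-13022014'
session = HTMLSession()
r = session.get(url)
# Render the page in a headless browser so the dynamically loaded content appears
r.html.render(timeout=20)

# Take the text of the span inside the last "offer-details" block
linkpromo = r.html.xpath("//div[@data-testid='offer-details'][last()]/div/span/text()")

print(linkpromo)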

This prints the following output in the terminal:

['Any 2 for $5.45, Save $1.55']
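
If you want to check for yourself whether a piece of text is really missing from the HTML that requests returns, here is a minimal sketch using only the standard library. The names `locate_text` and `_TextSplitter` are hypothetical helpers made up for this illustration, and the sample HTML strings are invented, not copied from the FairPrice site. It reports whether a pattern appears in the visible markup, only inside a script or style block, or not at all.

```python
import re
from html.parser import HTMLParser


class _TextSplitter(HTMLParser):
    """Collect text nodes, keeping <script>/<style> content apart from visible text."""

    def __init__(self):
        super().__init__()
        self._hidden_depth = 0  # > 0 while inside <script> or <style>
        self.visible = []
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        target = self.hidden if self._hidden_depth else self.visible
        target.append(data)


def locate_text(html, pattern):
    """Return 'markup', 'script', or 'absent' depending on where the pattern matches."""
    splitter = _TextSplitter()
    splitter.feed(html)
    splitter.close()
    regex = re.compile(pattern)
    if regex.search(" ".join(splitter.visible)):
        return "markup"
    if regex.search(" ".join(splitter.hidden)):
        return "script"
    return "absent"


# Invented sample inputs, for illustration only
empty_shell = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'
rendered = '<div data-testid="offer-details"><div><span>Any 2 for $5.45, Save $1.55</span></div></div>'

print(locate_text(empty_shell, r"Any \d+ for"))
print(locate_text(rendered, r"Any \d+ for"))
```

In the question's code you could call `locate_text(page.text, r"Any \d+ for")`. A result of 'absent' or 'script' would suggest that a plain HTML parser cannot find the span and that the page needs to be rendered first, as above.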
