Reputation: 1
I was trying to extract the profile names from the reviews at this link: https://www.amazon.in/Samsung-Midnight-Storage-6000mAh-Battery/dp/B0B4F52B5X/?_encoding=UTF8&pd_rd_w=4JKBg&content-id=amzn1.sym.e0e8ce89-ede3-4c51-b6ad-44989efc8536&pf_rd_p=e0e8ce89-ede3-4c51-b6ad-44989efc8536&pf_rd_r=NEBBF38XJRRBGK0BZBX3&pd_rd_wg=qFxtB&pd_rd_r=0f156162-4690-4ef5-9a8b-8b03e82e194b&ref_=pd_gw_ci_mcx_mr_hp_d&th=1
The names are in span elements with class_="a-profile-name",
but when I tried to print the result, it was just an empty list.
Below is my code:
from bs4 import BeautifulSoup as bs
import requests
link='https://www.amazon.in/Adidas-Unisex-Sogold-cblack-Football/dp/B096NC52HY/ref=sr_1_3_sspa?crid=1HCHWT6Y1WFYU&keywords=football%2Bshoes&qid=1660709102&sprefix=foot%2Caps%2C246&sr=8-3-spons&th=1&psc=1'
soup = bs(requests.get(link).text, "html.parser")
name = soup.find_all("span", class_="a-profile-name")
print(name)
Upvotes: 0
Views: 73
Reputation: 25073
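The empty list most likely means the page served to a request without browser-like headers does not contain the reviews. It is always a good idea to send some headers with your request, e.g. a User-Agent: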
requests.get(link, headers={'User-Agent': 'Mozilla/5.0'})
Note: Amazon really does not like being scraped, so sooner or later it will probably detect your activity and may block you.
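Since a refused or blocked request can look like a page with no matching elements, it may help to check the response before parsing it. The following is only a minimal sketch: `fetch_html` is a hypothetical helper, the header values and the timeout are assumptions, and the `get` parameter exists only so the function can be exercised without network access.

```python
import requests

# Assumed header values; any realistic browser-like values could be used.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_html(url, get=requests.get, timeout=10):
    """Return the page body, or raise if the server answered with an error status.

    `get` defaults to requests.get; it is a parameter only so that a
    stand-in can be passed in for testing (hypothetical design choice).
    """
    response = get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses
    response.raise_for_status()
    return response.text
```

With this, `fetch_html(link)` either returns the HTML or fails loudly on an error status, instead of silently handing an error page to the parser. A successful status code does not guarantee that the page contains the reviews.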
from bs4 import BeautifulSoup as bs
import requests
link='https://www.amazon.in/Adidas-Unisex-Sogold-cblack-Football/dp/B096NC52HY/ref=sr_1_3_sspa?crid=1HCHWT6Y1WFYU&keywords=football%2Bshoes&qid=1660709102&sprefix=foot%2Caps%2C246&sr=8-3-spons&th=1&psc=1'
soup = bs(requests.get(link, headers={'User-Agent': 'Mozilla/5.0'}).text, "html.parser")
name = soup.find_all("span", class_="a-profile-name")
print(name)
[<span class="a-profile-name">Amazon Customer</span>, <span class="a-profile-name">Shubam Kadam</span>, <span class="a-profile-name">Aditi Sharma</span>, <span class="a-profile-name">Moris lopez</span>, <span class="a-profile-name">tana tubin</span>]
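The result above is a list of tag objects rather than plain strings. If only the names are wanted, one option is to call `get_text()` on each tag. The sketch below uses a small hard-coded HTML snippet (an assumption standing in for the downloaded page) so that it runs offline:

```python
from bs4 import BeautifulSoup

# Static stand-in for the downloaded page, so the example runs offline.
html = """
<div><span class="a-profile-name">Amazon Customer</span></div>
<div><span class="a-profile-name"> Shubam Kadam </span></div>
"""

soup = BeautifulSoup(html, "html.parser")

# get_text(strip=True) returns the tag's text with surrounding whitespace removed
names = [tag.get_text(strip=True)
         for tag in soup.find_all("span", class_="a-profile-name")]
print(names)  # ['Amazon Customer', 'Shubam Kadam']
```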
Upvotes: 0