Reputation: 59
I am trying to generate URLs only for products whose price is visible on the listing page, for example https://www.amazon.com/s?k=ps5&rh=p_36%3A27500-65000
and my goal is to skip the remaining ASINs whose price is not shown on the listing page.
The logic I came up with is something like this: find the tag that contains each product listing, then look for the price tag inside it, with an if/else condition to extract only the products that have a price.
I am struggling with the execution. I did some web scraping a few months ago, but I am a bit rusty right now and trying to catch up, so any help would be much appreciated.
Here is my function:
from requests_html import HTMLSession

s = HTMLSession()


def get_product_links(session):
    # https://www.amazon.com/s?k=ps5&rh=p_36%3A27500-65000
    url = session.get(
        base_url + search_term + price_filter,
        headers=headers,
    )
    print(url.status_code)
    tag = url.html.find("div[data-component-type=s-search-result]")
    price_tag = [pr.find("span.a-offscreen", first=True) for pr in tag]
    print(price_tag)
    check_price = [price.text for price in price_tag if price != None]
    print(check_price)
    if len(check_price) > 0:
        product_asins = [
            asin.attrs["data-asin"]
            for asin in url.html.find("div[data-asin]")
            if asin.attrs["data-asin"] != ""
        ]
        product_link = [
            "https://www.amazon.com/dp/" + link for link in product_asins
        ]
        return product_link
    else:
        print("Skipping Product...")
Upvotes: 0
Views: 540
Reputation: 195468
To get links only for products that have a price on the listing page, you can use this example:
import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0",
    "Accept-Language": "en-US,en;q=0.5",
}

url = "https://www.amazon.com/s?k=ps5&rh=p_36%3A27500-65000"
soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")

for asin in soup.select("[data-asin]"):
    num = asin["data-asin"].strip()
    price = asin.select_one(".a-price .a-offscreen")
    if num and price:
        print(asin.h2.text)
        print(price.text, "https://www.amazon.com/dp/{}".format(num))
        print()
Prints:
HexGaming Esports Ultimate Controller 4 Remap Buttons & Interchangeable Thumbsticks & Hair Trigger Compatible with PS5 Customized Controller PC Wireless FPS Esport Gamepad - Wild Attack
$289.99 https://www.amazon.com/dp/B09KMYCY1C
Samsung Electronics 980 PRO SSD with Heatsink 2TB PCIe Gen 4 NVMe M.2 Internal Solid State Hard Drive, Heat Control, Max Speed, PS5 Compatible, MZ-V8P2T0CW
$349.99 https://www.amazon.com/dp/B09JHKSNNG
G-STORY 15.6" Inch IPS 4k 60Hz Portable Monitor Gaming display Integrated with PS5(not included) 3840×2160 With 2 HDMI ports,FreeSync,Built-in 2 of Multimedia Stereo Speaker,UL Certificated AC Adapter
$379.99 https://www.amazon.com/dp/B073ZJ1K8G
Thrustmaster T248, Racing Wheel and Magnetic Pedals, HYBRID DRIVE, Magnetic Paddle Shifters, Dynamic Force Feedback, Screen with Racing Information (PS5, PS4, PC)
$399.99 https://www.amazon.com/dp/B08Z5CX6V2
WD_BLACK 1TB SN850 NVMe Internal Gaming SSD Solid State Drive with Heatsink - Works with Playstation 5, Gen4 PCIe, M.2 2280, Up to 7,000 MB/s - WDS100T1XHE
$189.99 https://www.amazon.com/dp/B08PHSVW7K
Thrustmaster T300 RS - Gran Turismo Edition Racing Wheel (PS5,PS4,PC)
$449.99 https://www.amazon.com/dp/B01M1L2NRL
Seagate FireCuda 530 2TB Internal Solid State Drive - M.2 PCIe Gen4 ×4 NVMe 1.4, PS5 Internal SSD, speeds up to 7300MB/s, 3D TLC NAND, 2550 TBW, 1.8M MTBF, Heatsink, Rescue Services (ZP2000GM3A023)
$399.99 https://www.amazon.com/dp/B0977K2C74
OWC 2TB Aura P12 Pro NVMe M.2 SSD
$329.00 https://www.amazon.com/dp/B07VZ79XQ6
Sabrent 2TB Rocket 4 Plus NVMe 4.0 Gen4 PCIe M.2 Internal Extreme Performance SSD + M.2 NVMe Heatsink for The PS5 Console (SB-RKT4P-PSHS-2TB)
$329.99 https://www.amazon.com/dp/B09G2MZ4VR
Sony Playstation PS4 1TB Black Console
$468.00 https://www.amazon.com/dp/B012CZ41ZA
Thrustmaster TH8A Shifter (PS5, PS4, XBOX Series X/S, One, PC)
$199.99 https://www.amazon.com/dp/B005L0Z2BQ
WD_BLACK 2TB P50 Game Drive SSD - Portable External Solid State Drive, Compatible with Playstation, Xbox, PC, & Mac, Up to 2,000 MB/s - WDBA3S0020BBK-WESN
$348.99 https://www.amazon.com/dp/B07YFG9PG2
Logitech G923 Racing Wheel and Pedals for PS 5, PS4 and PC featuring TRUEFORCE up to 1000 Hz Force Feedback, Responsive Pedal, Dual Clutch Launch Control, and Genuine Leather Wheel Cover
$399.98 https://www.amazon.com/dp/B07PFB72NL
GIGABYTE AORUS Gen4 7000s SSD 2TB PCIe 4.0 NVMe M.2, Nanocarbon Coated Aluminum Heatsink, 3D TLC NAND, SSD- GP-AG70S2TB
$319.99 https://www.amazon.com/dp/B08XY93JT3
THRUSTMASTER T-LCM Pedals (PS5, PS4, XBOX Series X/S, One, PC
$229.99 https://www.amazon.com/dp/B083MNB4D8
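If you want the links back as a list (as in your function) and want to check the filtering without hitting Amazon, here is a rough sketch of the same per-listing idea. It is not the code above: it swaps BeautifulSoup for the standard library's html.parser so it runs with no extra installs, and it only checks for a span with class a-offscreen rather than the full ".a-price .a-offscreen" selector. The names PricedListingParser and priced_product_links, and the sample HTML, are made up for illustration; real Amazon markup may differ.

```python
from html.parser import HTMLParser


class PricedListingParser(HTMLParser):
    """Collect (asin, price) for listing divs that contain a span.a-offscreen."""

    def __init__(self):
        super().__init__()
        self.priced = []         # (asin, price) pairs in page order
        self._asin = None        # ASIN of the listing div currently open
        self._div_depth = 0      # <div> nesting depth inside that listing
        self._in_price = False   # True while inside span.a-offscreen
        self._price = None       # first price text seen in the listing

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self._asin is None:
                # Only a non-empty data-asin starts a listing.
                asin = (attrs.get("data-asin") or "").strip()
                if asin:
                    self._asin, self._div_depth, self._price = asin, 1, None
            else:
                self._div_depth += 1
        elif tag == "span" and self._asin and self._price is None:
            classes = (attrs.get("class") or "").split()
            self._in_price = "a-offscreen" in classes

    def handle_data(self, data):
        if self._in_price and data.strip():
            self._price = data.strip()
            self._in_price = False

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False
        elif tag == "div" and self._asin is not None:
            self._div_depth -= 1
            if self._div_depth == 0:
                # Listing closed: keep it only if a price was found inside it.
                if self._price:
                    self.priced.append((self._asin, self._price))
                self._asin = None


def priced_product_links(html):
    """Return /dp/ links only for listings whose price is on the page."""
    parser = PricedListingParser()
    parser.feed(html)
    parser.close()
    return ["https://www.amazon.com/dp/" + asin for asin, _ in parser.priced]


# Made-up sample: one priced listing, one without a price, one empty ASIN.
SAMPLE_HTML = """
<div data-asin="B000000001"><h2>With price</h2>
  <span class="a-price"><span class="a-offscreen">$10.00</span></span></div>
<div data-asin="B000000002"><h2>No price</h2><div>See options</div></div>
<div data-asin=""><span class="a-offscreen">$5.00</span></div>
"""

print(priced_product_links(SAMPLE_HTML))
```

The point is the same as in the loop above: the price check happens inside each listing, so a listing without a price is skipped on its own instead of the whole page being accepted or rejected at once.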
Upvotes: 2