Abbas

Reputation: 59

Extract only ASINs from an Amazon product listing page where the price is visible

I am trying to generate URLs only for products whose price is visible on the listing page, e.g. https://www.amazon.com/s?k=ps5&rh=p_36%3A27500-65000, and my goal is to skip the remaining ASINs whose price is not shown on the listing page.

The logic I came up with is something like this:

I am struggling with the execution. I did some web scraping a few months ago, but I am a bit rusty now and trying to catch up, so any help would be much appreciated.

Here is my function:

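from requests_html import HTMLSession

# Assumed values (not shown in the original post); together they form the URL below.
base_url = "https://www.amazon.com/s?k="
search_term = "ps5"
price_filter = "&rh=p_36%3A27500-65000"
headers = {"User-Agent": "Mozilla/5.0", "Accept-Language": "en-US,en;q=0.5"}

s = HTMLSession()

def get_product_links(session):
    # https://www.amazon.com/s?k=ps5&rh=p_36%3A27500-65000
    response = session.get(
        base_url + search_term + price_filter,
        headers=headers,
    )
    print(response.status_code)

    results = response.html.find("div[data-component-type=s-search-result]")

    price_tags = [result.find("span.a-offscreen", first=True) for result in results]
    print(price_tags)
    prices = [tag.text for tag in price_tags if tag is not None]
    print(prices)
    # Note: this test is page-wide. If any result has a price, every ASIN
    # on the page is returned, including those without a price.
    if len(prices) > 0:
        product_asins = [
            element.attrs["data-asin"]
            for element in response.html.find("div[data-asin]")
            if element.attrs["data-asin"] != ""
        ]
        product_links = [
            "https://www.amazon.com/dp/" + asin for asin in product_asins
        ]
        return product_links
    else:
        print("Skipping Product...")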
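The search URL used here is a base address plus a keyword and a price filter. As a minimal sketch, it could be assembled with the standard library instead of string concatenation, so that characters such as ":" are percent-encoded automatically. The names BASE_URL and build_search_url are hypothetical and only serve the illustration; the "k" and "rh" parameters and the "p_36" prefix are taken from the URL in the question.

```python
from urllib.parse import urlencode

# Hypothetical constant: the search endpoint, taken from the URL in the question.
BASE_URL = "https://www.amazon.com/s"


def build_search_url(keyword, price_range=None):
    """Return a search URL for `keyword`, optionally with a p_36 price filter.

    `price_range` is a (low, high) pair inserted verbatim after "p_36:".
    """
    params = {"k": keyword}
    if price_range is not None:
        low, high = price_range
        params["rh"] = "p_36:{}-{}".format(low, high)
    # urlencode percent-encodes the ":" in the filter as %3A.
    return BASE_URL + "?" + urlencode(params)


print(build_search_url("ps5", (27500, 65000)))
```

Called with "ps5" and (27500, 65000), this reproduces the URL quoted in the question.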

Upvotes: 0

Views: 540

Answers (1)

Andrej Kesely

Reputation: 195468

To get links only for products that have a price, you can use this example:

import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0",
    "Accept-Language": "en-US,en;q=0.5",
}

url = "https://www.amazon.com/s?k=ps5&rh=p_36%3A27500-65000"

soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")

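# Check every element that carries a data-asin attribute on its own.
for asin in soup.select("[data-asin]"):
    num = asin["data-asin"].strip()
    price = asin.select_one(".a-price .a-offscreen")
    # Keep the product only if it has a non-empty ASIN and a visible price.
    if num and price:
        print(asin.h2.text)
        print(price.text, "https://www.amazon.com/dp/{}".format(num))
        print()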

Prints:

HexGaming Esports Ultimate Controller 4 Remap Buttons & Interchangeable Thumbsticks & Hair Trigger Compatible with PS5 Customized Controller PC Wireless FPS Esport Gamepad - Wild Attack  
$289.99 https://www.amazon.com/dp/B09KMYCY1C

Samsung Electronics 980 PRO SSD with Heatsink 2TB PCIe Gen 4 NVMe M.2 Internal Solid State Hard Drive, Heat Control, Max Speed, PS5 Compatible, MZ-V8P2T0CW  
$349.99 https://www.amazon.com/dp/B09JHKSNNG

G-STORY 15.6" Inch IPS 4k 60Hz Portable Monitor Gaming display Integrated with PS5(not included) 3840×2160 With 2 HDMI ports,FreeSync,Built-in 2 of Multimedia Stereo Speaker,UL Certificated AC Adapter  
$379.99 https://www.amazon.com/dp/B073ZJ1K8G

Thrustmaster T248, Racing Wheel and Magnetic Pedals, HYBRID DRIVE, Magnetic Paddle Shifters, Dynamic Force Feedback, Screen with Racing Information (PS5, PS4, PC)  
$399.99 https://www.amazon.com/dp/B08Z5CX6V2

WD_BLACK 1TB SN850 NVMe Internal Gaming SSD Solid State Drive with Heatsink - Works with Playstation 5, Gen4 PCIe, M.2 2280, Up to 7,000 MB/s - WDS100T1XHE  
$189.99 https://www.amazon.com/dp/B08PHSVW7K

Thrustmaster T300 RS - Gran Turismo Edition Racing Wheel (PS5,PS4,PC)  
$449.99 https://www.amazon.com/dp/B01M1L2NRL

Seagate FireCuda 530 2TB Internal Solid State Drive - M.2 PCIe Gen4 ×4 NVMe 1.4, PS5 Internal SSD, speeds up to 7300MB/s, 3D TLC NAND, 2550 TBW, 1.8M MTBF, Heatsink, Rescue Services (ZP2000GM3A023)  
$399.99 https://www.amazon.com/dp/B0977K2C74

OWC 2TB Aura P12 Pro NVMe M.2 SSD  
$329.00 https://www.amazon.com/dp/B07VZ79XQ6

Sabrent 2TB Rocket 4 Plus NVMe 4.0 Gen4 PCIe M.2 Internal Extreme Performance SSD + M.2 NVMe Heatsink for The PS5 Console (SB-RKT4P-PSHS-2TB)  
$329.99 https://www.amazon.com/dp/B09G2MZ4VR

Sony Playstation PS4 1TB Black Console  
$468.00 https://www.amazon.com/dp/B012CZ41ZA

Thrustmaster TH8A Shifter (PS5, PS4, XBOX Series X/S, One, PC)  
$199.99 https://www.amazon.com/dp/B005L0Z2BQ

WD_BLACK 2TB P50 Game Drive SSD - Portable External Solid State Drive, Compatible with Playstation, Xbox, PC, & Mac, Up to 2,000 MB/s - WDBA3S0020BBK-WESN  
$348.99 https://www.amazon.com/dp/B07YFG9PG2

Logitech G923 Racing Wheel and Pedals for PS 5, PS4 and PC featuring TRUEFORCE up to 1000 Hz Force Feedback, Responsive Pedal, Dual Clutch Launch Control, and Genuine Leather Wheel Cover  
$399.98 https://www.amazon.com/dp/B07PFB72NL

GIGABYTE AORUS Gen4 7000s SSD 2TB PCIe 4.0 NVMe M.2, Nanocarbon Coated Aluminum Heatsink, 3D TLC NAND, SSD- GP-AG70S2TB  
$319.99 https://www.amazon.com/dp/B08XY93JT3

THRUSTMASTER T-LCM Pedals (PS5, PS4, XBOX Series X/S, One, PC  
$229.99 https://www.amazon.com/dp/B083MNB4D8
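Some of the printed prices (for example $189.99) fall below 275, so if 27500-65000 in the URL is meant as $275.00 to $650.00 (an assumption about the p_36 parameter, not something confirmed here), the listing page does not enforce the range strictly. A minimal sketch of re-checking the range on the client side follows; parse_price and filter_by_price are hypothetical helper names, and the sample rows are copied from the output above.

```python
from decimal import Decimal, InvalidOperation


def parse_price(text):
    """Convert a price string such as '$1,289.99' to a Decimal, or return None."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def filter_by_price(rows, low, high):
    """Keep (price_text, url) rows whose parsed price lies within [low, high]."""
    kept = []
    for price_text, url in rows:
        price = parse_price(price_text)
        if price is not None and low <= price <= high:
            kept.append((price, url))
    return kept


# Sample rows copied from the printed output above.
rows = [
    ("$289.99", "https://www.amazon.com/dp/B09KMYCY1C"),
    ("$189.99", "https://www.amazon.com/dp/B08PHSVW7K"),
    ("$449.99", "https://www.amazon.com/dp/B01M1L2NRL"),
]

# Assumed bounds: 275 to 650 dollars.
for price, url in filter_by_price(rows, Decimal("275"), Decimal("650")):
    print(price, url)
```

Decimal is used rather than float so that comparisons against the bounds are exact for currency values.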

Upvotes: 2
