lasse3434

Reputation: 59

Python web scraper using BeautifulSoup - Finding the right HTML element for the information I'm looking for

I just switched from C to Python, and to get some practice I want to write a simple web scraper for price comparison. So far the program visits every website I give it and returns the page's HTML. But when I tell BeautifulSoup to find just the prices, the output is 'None'. So I think the HTML element I am telling BeautifulSoup to look for as the price is the wrong one.

I would be really grateful if anyone could help me with this problem or has some tips and tricks for a beginner! I will add my Python code and the link to the website, since it looks kind of messy if I paste the HTML here. Just tell me if you need anything more. Thank you!

https://www.momox.de/offer/9783833879500

I just need the part where it says 8,87€ (or whatever the amount is, since the price changes constantly), but it looks like I picked the wrong part of the HTML.

import requests
from bs4 import BeautifulSoup

def GetISBN():
    Lagerbestand = open("ISBN.txt", "r")
    for ISBN in Lagerbestand.readlines():
        url = f'https://www.momox.de/offer/{ISBN}'
        page = requests.get(url)
        soup = BeautifulSoup(page.content, 'html.parser')
        price= soup.find('div',attrs={'class':'text-center text-xxl font-medium searchresult-price'})
        print(price)
    Lagerbestand.close()

GetISBN()

ISBN.txt is a file with one article number per line (this part works).

Upvotes: 1

Views: 316

Answers (1)

Andrej Kesely

Reputation: 195573

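The price is loaded via an Ajax request from an external URL, so it is not part of the HTML that requests downloads, which is why soup.find returns None. You can query that URL directly with the requests module, as in this example: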

import json
import requests

api_url = "https://api.momox.de/api/v4/offer/"
params = {"ean": "9783833879500"}  # <--- change this to your EAN
headers = {
    "X-API-TOKEN": "2231443b8fb511c7b6a0eb25a62577320bac69b6",
    "X-CLIENT-VERSION": "r5299-76accf6",
    "X-MARKETPLACE-ID": "momox_de",
}

data = requests.get(api_url, params=params, headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

print(data["price"])

Prints:

11.44
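
To run this for every line of the ISBN.txt file from the question, something like the following sketch might work. It assumes the header values above are still accepted by the API (they may have expired) and that the response keeps a "price" key. The helper names read_isbns, fetch_price and print_prices, and the optional get parameter, are made up for this illustration. Note the strip() call: lines read from a file still end in a newline, which would otherwise end up in the request.

```python
API_URL = "https://api.momox.de/api/v4/offer/"
# Header values copied from the example above; they may have expired.
HEADERS = {
    "X-API-TOKEN": "2231443b8fb511c7b6a0eb25a62577320bac69b6",
    "X-CLIENT-VERSION": "r5299-76accf6",
    "X-MARKETPLACE-ID": "momox_de",
}


def read_isbns(lines):
    """Return the non-empty lines with surrounding whitespace removed."""
    stripped = (line.strip() for line in lines)
    return [isbn for isbn in stripped if isbn]


def fetch_price(isbn, get=None):
    """Ask the offer API for one EAN/ISBN; return its "price" field or None."""
    if get is None:
        # Imported lazily so read_isbns() can be used without requests installed.
        import requests
        get = requests.get
    response = get(API_URL, params={"ean": isbn}, headers=HEADERS, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json().get("price")


def print_prices(path="ISBN.txt"):
    """Print one "ISBN price" line per entry in the file."""
    with open(path) as isbn_file:
        for isbn in read_isbns(isbn_file):
            print(isbn, fetch_price(isbn))
```

Calling print_prices("ISBN.txt") would then print each ISBN followed by whatever price the API reports at that moment. The get parameter is only there so the function can be tried out with a stand-in instead of a real network call.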

Upvotes: 1
