How to format a scraper output

Question

I'm trying to extract the prices from a website in order to build a scraper, and I wrote the program below. To get all the HTML I used BeautifulSoup with the default html.parser. Then I tried to narrow the result down with a variable called generale, set to soup.findAll("span"). Now I need to clean up the resulting list (I suppose that is what was created) further to get to the prices, and I'm stuck. Any suggestions? I don't know how to approach the problem.

import smtplib
import time

from bs4 import BeautifulSoup as bs
import requests

URL = "https://www.allkeyshop.com/blog/buy-battlefield-5-cd-key-compare-prices/"
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"}

def Check_page1():
    page = requests.get(URL, headers=headers)
    soup = bs(page.content, 'html.parser')

    # every <span> element on the page
    generale = soup.findAll('span')

    price = None  # placeholder: this is the step I don't know how to write
    print(price)
    print(generale)

print(Check_page1())

Jan Lipovský · Accepted Answer

When you look at the source code of the page, you can see that the prices you are looking for are in span elements with the class name price. They can be parsed this way:

import time

import requests
from bs4 import BeautifulSoup as bs

URL = "https://www.allkeyshop.com/blog/buy-battlefield-5-cd-key-compare-prices/"
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"}

def CheckPage1():
    page = requests.get(URL, headers=headers)
    soup = bs(page.content, 'html.parser')

    # all spans with the class "price" (find_all is the current name of findAll)
    span_prices = soup.find_all("span", {"class": "price"})

    # to get each price, extract the text of the span
    for span in span_prices:
        price = span.text
        # remove surrounding whitespace and print the price
        print(price.strip())

        # to get prices without the currency sign, uncomment one of these lines
        # (the first one assumes the sign is the last character)
        # print(price.strip()[:-1])
        # print(price.strip().strip('€'))

CheckPage1()
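
If you want to compare the prices rather than just print them, the stripped strings can be converted to numbers. The following is only a minimal sketch using the standard library: parse_price and the sample strings in scraped are hypothetical (they are not taken from the page), and it assumes a comma, if present, is a decimal separator rather than a thousands separator.

```python
import re
from decimal import Decimal

# Matches the first number in a string, e.g. "34.99" in "34.99€" or "29,50" in "29,50€".
_NUMBER = re.compile(r"\d+(?:[.,]\d+)?")


def parse_price(text):
    """Return the first number found in text as a Decimal, or None if there is none."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    # Assumption: a comma is a decimal separator, not a thousands separator.
    return Decimal(match.group().replace(",", "."))


# Hypothetical sample strings, shaped like the values of span.text in the loop above.
scraped = ["\n  34.99€ ", "29,50€", "N/A"]

# Keep only the entries that actually contain a number.
prices = [p for p in map(parse_price, scraped) if p is not None]

print(min(prices))  # prints 29.50
```

Inside CheckPage1 you could call parse_price(span.text) for each span instead of printing the raw text. Decimal is used here instead of float so that amounts such as 29.50 are kept exactly.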
