PacaAT

Reputation: 33

bs4 filtering with python

I'm trying to write a script that checks the Steam store, and I'm having trouble filtering out the listings that don't have a discount in their markup. I want to keep only the listings whose discount div contains a <span>-percentage</span> tag, and drop the ones without it. Here's my code:

from urllib.request import urlopen
from datetime import date
import requests as rq
from bs4 import BeautifulSoup as bsoup

inp = str(input('what would you like to search up?'))
w = ('https://store.steampowered.com/search/?term=' + inp)
page = rq.get(w)
soup = bsoup(page.content, 'html.parser')
soup.prettify()
sales = soup.find_all('div', class_="responsive_search_name_combined")

for sale in sales:
    p = soup.find('div', class_="col search_price responsive_secondrow")
    d = soup.find_all('div', class_="col search_discount responsive_secondrow")
    n = soup.find('span', class_="title")

    if None in (d, n, p):
        continue
    print(d)

And the output (containing both the things I want to filter out and the things I want to keep):

<span>-16%</span>
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
<span>-19%</span>
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">

And so on. I tried replacing d = soup.find_all('div', class_="col search_discount responsive_secondrow") with d = soup.find_all('span', string="-16%") to see if that would work, and it didn't. I want to keep the span tags but not the div tags. Could anyone help with this?
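
For comparison, here is a minimal sketch of how the string argument of find_all can match any percentage instead of one fixed value such as "-16%". The sample markup is copied from the output above and stands in for the live page. The regular expression is an assumption about what a discount label looks like, and the string argument needs a reasonably recent bs4 release.

```python
import re
from bs4 import BeautifulSoup

# Sample markup taken from the printed output above; the live Steam page
# is assumed to look similar but may differ.
SAMPLE_HTML = """
<div class="col search_discount responsive_secondrow">
<span>-16%</span>
</div>
<div class="col search_discount responsive_secondrow">
</div>
<div class="col search_discount responsive_secondrow">
<span>-19%</span>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Assumed pattern for a discount label: a minus sign, digits, a percent sign.
pattern = re.compile(r"^-\d+%$")

# Only span tags whose text matches the pattern are returned,
# so the empty div tags never show up in the result.
spans = soup.find_all("span", string=pattern)
discounts = [s.get_text() for s in spans]
print(discounts)
```

An exact string such as "-16%" only matches listings that carry that one discount, so it returns nothing for a search where no game is 16% off.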

Upvotes: 0

Views: 100

Answers (1)

Sushil

Reputation: 5531

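The empty entries are div tags with no span child, so element.span is None for them. Loop over the discount divs and append the span text only when a span exists; the try-except block is only a safety net around that loop. Here is the full code: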

from urllib.request import urlopen
from datetime import date
import requests as rq
from bs4 import BeautifulSoup as bsoup
inp = str(input('what would you like to search up?'))
w = ('https://store.steampowered.com/search/?term=' + inp)
page = rq.get(w)
soup = bsoup(page.content, 'html.parser')
soup.prettify()
sales = soup.find_all('div', class_="responsive_search_name_combined")

final = []

for sale in sales:
    p = soup.find('div', class_="col search_price responsive_secondrow")
    d = soup.find_all('div', class_="col search_discount responsive_secondrow")
    n = soup.find('span', class_="title")

    try:
        for element in d:
            span = element.span
            if span:
                final.append(span.text)
    except:
        pass
print(final)

Output:

what would you like to search up?>? among us
['-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%']
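
The same two values appear to repeat in the output because find_all is called on soup, the whole page, once per sale. A possible variation, sketched below, searches inside each sale instead, so every listing contributes at most one discount and can be paired with its title. The sample markup and the discounted_titles helper are hypothetical: they reuse the class names from the question and assume the discount div sits inside the responsive_search_name_combined div, which may not hold for the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup reusing the class names from the question;
# the real Steam page may be structured differently.
SAMPLE_HTML = """
<div class="responsive_search_name_combined">
  <span class="title">Game A</span>
  <div class="col search_discount responsive_secondrow"><span>-10%</span></div>
  <div class="col search_price responsive_secondrow">$4.49</div>
</div>
<div class="responsive_search_name_combined">
  <span class="title">Game B</span>
  <div class="col search_discount responsive_secondrow"></div>
  <div class="col search_price responsive_secondrow">$9.99</div>
</div>
<div class="responsive_search_name_combined">
  <span class="title">Game C</span>
  <div class="col search_discount responsive_secondrow"><span>-25%</span></div>
  <div class="col search_price responsive_secondrow">$14.99</div>
</div>
"""


def discounted_titles(html):
    """Return (title, discount) pairs for rows whose discount div holds a span."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.find_all("div", class_="responsive_search_name_combined"):
        # Search inside this row only, not the whole document.
        title = row.find("span", class_="title")
        discount_div = row.find("div", class_="search_discount")
        if title is None or discount_div is None:
            continue
        span = discount_div.span  # None when the div has no span child
        if span is not None:
            results.append((title.get_text(strip=True), span.get_text(strip=True)))
    return results


print(discounted_titles(SAMPLE_HTML))
```

With the sample markup this keeps Game A and Game C and skips Game B, whose discount div is empty.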

Upvotes: 1
