Reputation: 33
I'm trying to write a script that checks the Steam store, and I'm having trouble filtering out the listings that don't have a discount in their markup. I want to keep only the listings whose discount div contains a <span>-percentage</span> tag, and drop the ones without it. Here's my code:
from urllib.request import urlopen
from datetime import date
import requests as rq
from bs4 import BeautifulSoup as bsoup

inp = str(input('what would you like to search up?'))
w = ('https://store.steampowered.com/search/?term=' + inp)
page = rq.get(w)
soup = bsoup(page.content, 'html.parser')
soup.prettify()
sales = soup.find_all('div', class_="responsive_search_name_combined")
for sale in sales:
    p = soup.find('div', class_="col search_price responsive_secondrow")
    d = soup.find_all('div', class_="col search_discount responsive_secondrow")
    n = soup.find('span', class_="title")
    if None in (d, n, p):
        continue
    print(d)
And the output (containing both the things I want to filter out and the things I want to keep):
<span>-16%</span>
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
<span>-19%</span>
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
etc.
I've tried replacing d = soup.find_all('div', class_="col search_discount responsive_secondrow")
with d = soup.find_all('span', string="-16%")
to see if that would work, and it didn't.
I want to keep the span tags but not the div tags.
Could anyone help with this?
Upvotes: 0
Views: 100
Reputation: 5531
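You can solve this by looping over the discount divs inside the last for loop and keeping only those that contain a span; the try-except block is just a guard against unexpected elements. Here is the full code: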
from urllib.request import urlopen
from datetime import date
import requests as rq
from bs4 import BeautifulSoup as bsoup

inp = str(input('what would you like to search up?'))
w = ('https://store.steampowered.com/search/?term=' + inp)
page = rq.get(w)
soup = bsoup(page.content, 'html.parser')
soup.prettify()
sales = soup.find_all('div', class_="responsive_search_name_combined")
final = []
for sale in sales:
    p = soup.find('div', class_="col search_price responsive_secondrow")
    d = soup.find_all('div', class_="col search_discount responsive_secondrow")
    n = soup.find('span', class_="title")
    try:
        for element in d:
            span = element.span
            if span:
                final.append(span.text)
    except:
        pass
print(final)
Output:
what would you like to search up?>? among us
['-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%']
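Note that the list above repeats the same two values because every pass of the outer loop searches the whole soup rather than the current sale. Also, find_all returns an empty list rather than None when nothing matches, so the None check in the question never filters on d. Below is a minimal sketch of a per-listing version. It assumes each result's title and discount div sit inside the same responsive_search_name_combined block. The sample markup and the helper name discounted_titles are made up for illustration and are not taken from the live Steam page, so the class names may need adjusting:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption, not the real Steam page):
# three listings, only the first and third carry a discount span.
SAMPLE_HTML = """
<div class="responsive_search_name_combined">
  <span class="title">Game A</span>
  <div class="col search_discount responsive_secondrow">
    <span>-16%</span>
  </div>
</div>
<div class="responsive_search_name_combined">
  <span class="title">Game B</span>
  <div class="col search_discount responsive_secondrow">
  </div>
</div>
<div class="responsive_search_name_combined">
  <span class="title">Game C</span>
  <div class="col search_discount responsive_secondrow">
    <span>-19%</span>
  </div>
</div>
"""


def discounted_titles(html):
    """Return (title, discount) pairs for listings that have a discount span."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for sale in soup.find_all("div", class_="responsive_search_name_combined"):
        # Search inside the current listing, not the whole document.
        title = sale.find("span", class_="title")
        discount_div = sale.find("div", class_="search_discount")
        span = discount_div.find("span") if discount_div else None
        if title is None or span is None:
            continue  # no title or no discount span: skip this listing
        results.append((title.get_text(strip=True), span.get_text(strip=True)))
    return results


if __name__ == "__main__":
    print(discounted_titles(SAMPLE_HTML))
```

To run it against a live search, pass page.content from the requests call above to discounted_titles instead of SAMPLE_HTML. Calling find on sale instead of soup is what ties each discount to its own title and avoids the duplicates.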
Upvotes: 1