Reputation: 171
I'm trying to scrape the following HTML using the Python bs4 script below, but I keep getting the error listed below and have no idea what's causing it. If someone could help me figure out how to get it working, that would be great!
<span id="prodInfoPriceVat" class="prodInfoPriceVat" data-price-vat="24.73">£24.73</span>
Python BS4 script:
prices = {
    "GLDAG_MAPLE": {"url": "https://www.gold.co.uk/silver-coins/candian-silver-maple-coins/1oz-canadian-maple-silver-coin-2020/",
                    "trader": "Gold.co.uk",
                    "metal": "Silver",
                    "type": "Maple"},
    "BBPAG_MAPLE": {"url": "https://www.bullionbypost.co.uk/silver-coins/canadian-maple-1oz-silver-coin/2019-1oz-canadian-maple-silver-coin/",
                    "trader": "Bullion By Post",
                    "metal": "Silver",
                    "type": "Maple"},
    "ATKAG_BRITANNIA": {"url": "https://atkinsonsbullion.com/silver/silver-coins/1oz-silver-coins/2020-uk-britannia-1oz-silver-coin",
                        "trader": "Atkinsons Bullion",
                        "metal": "Silver",
                        "type": "Britannia"},
}
response = requests.get(
    'https://www.bullionbypost.co.uk/silver-price/silver-price-per-gram/')
soup = BeautifulSoup(response.text, 'html.parser')
AG_GRAM_SPOT = soup.find(
    'span', {'name': 'current_price_field'}).get_text()
# Convert to float
AG_GRAM_SPOT = float(re.sub(r"[^0-9\.]", "", AG_GRAM_SPOT))
# No need for another lookup
AG_OUNCE_SPOT = AG_GRAM_SPOT * 31.1035
for coin in prices:
    response = requests.get(prices[coin]["url"])
    soup = BeautifulSoup(response.text, 'html.parser')
    try:
        text_price = soup.find(
            'td', {'id': 'price-inc-vat-per-unit-1'}).get_text()  # BullionByPost
    except:
        text_price = soup.find(
            'td', {'id': 'total-price-inc-vat-1'}).get_text()  # Gold.co.uk
    else:
        text_price = soup.find(
            'span', {'class': 'prodInfoPriceVat'}).get_text()  # Issues here! Line 70
    # Grab the number
    prices[coin]["price"] = float(re.sub(r"[^0-9\.]", "", text_price))
I keep getting this error:
Traceback (most recent call last):
File "scraper.py", line 70, in <module>
text_price = soup.find(
AttributeError: 'NoneType' object has no attribute 'get_text'
How can I get this working?
Upvotes: 1
Views: 75
Reputation: 195593
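There's no need for exceptions here. soup.find() returns None when nothing matches, and the else branch of a try statement runs whenever the try body succeeds, so your span lookup runs on a page that has no such span. Just use if...else and test whether the found element is None.
For example: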
import re
import requests
from bs4 import BeautifulSoup
prices = {
    "GLDAG_MAPLE": {"url": "https://www.gold.co.uk/silver-coins/candian-silver-maple-coins/1oz-canadian-maple-silver-coin-2020/",
                    "trader": "Gold.co.uk",
                    "metal": "Silver",
                    "type": "Maple"},
    "BBPAG_MAPLE": {"url": "https://www.bullionbypost.co.uk/silver-coins/canadian-maple-1oz-silver-coin/2019-1oz-canadian-maple-silver-coin/",
                    "trader": "Bullion By Post",
                    "metal": "Silver",
                    "type": "Maple"},
    "ATKAG_BRITANNIA": {"url": "https://atkinsonsbullion.com/silver/silver-coins/1oz-silver-coins/2020-uk-britannia-1oz-silver-coin",
                        "trader": "Atkinsons Bullion",
                        "metal": "Silver",
                        "type": "Britannia"},
}
response = requests.get(
    'https://www.bullionbypost.co.uk/silver-price/silver-price-per-gram/')
soup = BeautifulSoup(response.text, 'html.parser')
AG_GRAM_SPOT = soup.find(
    'span', {'name': 'current_price_field'}).get_text()
# Convert to float
AG_GRAM_SPOT = float(re.sub(r"[^0-9\.]", "", AG_GRAM_SPOT))
# No need for another lookup
AG_OUNCE_SPOT = AG_GRAM_SPOT * 31.1035
for coin in prices:
    print('url=', prices[coin]["url"])
    response = requests.get(prices[coin]["url"])
    soup = BeautifulSoup(response.text, 'html.parser')
    # Try each site's price element in turn; find() returns None on no match
    text_price = soup.find(
        'td', {'id': 'price-inc-vat-per-unit-1'})  # BullionByPost
    if not text_price:
        text_price = soup.find(
            'td', {'id': 'total-price-inc-vat-1'})  # Gold.co.uk
    if not text_price:
        text_price = soup.find(
            'span', {'class': 'prodInfoPriceVat'})  # atkinsonsbullion.com
    if not text_price:
        print('Error, unable to find price for url=', prices[coin]["url"])
        prices[coin]["price"] = float('nan')
        continue
    text_price = text_price.get_text(strip=True)
    # Grab the number
    prices[coin]["price"] = float(re.sub(r"[^0-9\.]", "", text_price))
    print('price=', prices[coin]["price"])
Prints:
url= https://www.gold.co.uk/silver-coins/candian-silver-maple-coins/1oz-canadian-maple-silver-coin-2020/
price= 31.32
url= https://www.bullionbypost.co.uk/silver-coins/canadian-maple-1oz-silver-coin/2019-1oz-canadian-maple-silver-coin/
price= 26.88
url= https://atkinsonsbullion.com/silver/silver-coins/1oz-silver-coins/2020-uk-britannia-1oz-silver-coin
price= 24.73
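The same fallback chain can also be written as a loop over a list of lookups, which may be easier to extend when more traders are added. This is only a sketch: the names PRICE_LOOKUPS and extract_price are made up for illustration, and it runs offline against the span quoted in the question rather than the live pages.

```python
import re

from bs4 import BeautifulSoup

# Lookups tried in order. The tag names and attributes come from the code
# above; the list itself (PRICE_LOOKUPS) is a hypothetical helper.
PRICE_LOOKUPS = [
    ("td", {"id": "price-inc-vat-per-unit-1"}),  # BullionByPost
    ("td", {"id": "total-price-inc-vat-1"}),     # Gold.co.uk
    ("span", {"class": "prodInfoPriceVat"}),     # atkinsonsbullion.com
]


def extract_price(html):
    """Return the first matching price as a float, or NaN if nothing matches."""
    soup = BeautifulSoup(html, "html.parser")
    for tag_name, attrs in PRICE_LOOKUPS:
        element = soup.find(tag_name, attrs)
        if element is None:
            continue  # this page does not use that element, try the next one
        digits = re.sub(r"[^0-9.]", "", element.get_text(strip=True))
        if digits:
            return float(digits)
    return float("nan")


# Offline check with the span from the question (no network access needed).
sample = ('<span id="prodInfoPriceVat" class="prodInfoPriceVat" '
          'data-price-vat="24.73">£24.73</span>')
print(extract_price(sample))  # 24.73
```

In the scraping loop this would be called as extract_price(response.text), assuming the pages still use the same element ids and classes.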
Upvotes: 1
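For anyone wondering why the try/except/else version in the question raised the error, here is a small standalone sketch of that control flow. The function name lookup and the found flag are hypothetical stand-ins for whether soup.find() matched; no scraping is involved.

```python
def lookup(found):
    """Record which branches of try/except/else run.

    `found` stands in for soup.find() returning a tag (True) or None (False).
    """
    steps = []
    try:
        steps.append("try")
        if not found:
            element = None
            element.get_text()  # AttributeError, as in the traceback
    except AttributeError:
        steps.append("except")
    else:
        # Runs only when the try body finished without raising
        steps.append("else")
    return steps


print(lookup(found=True))   # ['try', 'else']
print(lookup(found=False))  # ['try', 'except']
```

So when the first lookup succeeds, the else branch still runs, and a lookup placed there fails on any page that lacks the element it searches for.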