Reputation: 11
I'm new to Stack Overflow. I'm writing a script in Python and have a problem I can't resolve: I need to create a variable holding the price of a product. So far I've collected the decimal price in € via web scraping.
import bs4, requests
link = "https://hookpod.shop/products/hookpod-screw-adapter"
response = requests.get(link)
response.raise_for_status()
soup = bs4.BeatifulSoup(response.text, 'html.parster')
span_price = soup.find('span', class_='product__price')
The output I get is:
<span class="product__price" data-product-price=""> €10.00 </span>
I need to extract the amount (€10.00) and convert it to an int. Can anybody help me with this? I really need it.
Upvotes: 1
Views: 212
Reputation: 4975
The find method returns a Tag object, and you can access its string via the text attribute. Then remove the whitespace around it with strip, and the currency symbol with a slice, for example. Then cast to float and finally to int.
from bs4 import BeautifulSoup
html = '<span class="product__price" data-product-price=""> €10.00 </span>'
span_price = BeautifulSoup(html,'lxml') # you can change parser
span_price_value = int(float(span_price.text.strip()[1:]))
print(span_price_value)
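Spelling out the chained call step by step, and showing what changes if strip is skipped. This is only a sketch on a plain string that mirrors the span text from the sample HTML; the variable names are made up for illustration.

```python
# Text of the span, as in the sample HTML (note the surrounding spaces).
raw = " €10.00 "

stripped = raw.strip()          # "€10.00"
number_text = stripped[1:]      # drop the currency symbol at index 0
as_float = float(number_text)   # 10.0
as_int = int(as_float)          # 10 (int() truncates any cents)
print(as_int)

# Without strip, index 0 is a space and the symbol sits at index 1,
# so the slice has to start at 2 instead of 1.
unstripped_number = raw[2:]     # "10.00 "
as_int_unstripped = int(float(unstripped_number))
print(as_int_unstripped)
```

float tolerates the trailing space in the second case, but raw[1:] would still contain the € and raise a ValueError.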
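Remark:
lxml is just one possible parser; you can use another one.
If you skip strip, then you should be careful with the slice of the string: the currency symbol is no longer at the start.
Upvotes: 1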
Reputation: 51
There were a couple of typos, so I am writing out the full code. Use a regex to pull the digits out of the euro price you already have.
import re
import bs4, requests
link = "https://hookpod.shop/products/hookpod-screw-adapter"
response = requests.get(link)
response.raise_for_status()
soup = bs4.BeautifulSoup(response.text, 'html.parser')
span_price = soup.find('span', class_='product__price')
# grab the first run of digits, i.e. the whole-euro part of the price
result = re.search(r'\d+', span_price.text)
result_int = int(result.group())
print(result_int)
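If the cents matter as well, the pattern can be widened to take in the decimal part. This is a minimal sketch, assuming the price always uses a dot as the decimal separator and has no thousands separator; the sample string and variable names are illustrative and not taken from the live page.

```python
import re
from decimal import Decimal

# Stand-in for span_price.text from the scraped page.
price_text = " €10.00 "

# Digits, optionally followed by a dot and more digits.
match = re.search(r"\d+(?:\.\d+)?", price_text)
if match is None:
    raise ValueError("no price found in: %r" % price_text)

amount = Decimal(match.group())   # exact decimal value of the price
whole_euros = int(amount)         # whole euros, cents dropped
cents = int(amount * 100)         # the price as an integer number of cents
print(whole_euros, cents)
```

Keeping the price as integer cents is one way to get an int without losing the fractional part.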
Upvotes: 0
Reputation: 34
Converting the span_price text to an int will solve it.
Something like:
int_span_price = int(float(span_price.text.replace('€', '')))
Upvotes: 1
Reputation: 11
Use Beautiful Soup's tag system to lock onto that data and soup.getText() to pull it out. You could also parse the result of the soup.find call you already made.
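A rough sketch of that idea, using the sample span from the question as a literal string instead of fetching the live page. get_text is the snake_case spelling of getText; the html.parser backend and the lstrip step are choices made here for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question, so no network request is needed.
html = '<span class="product__price" data-product-price=""> €10.00 </span>'

soup = BeautifulSoup(html, "html.parser")
span_price = soup.find("span", class_="product__price")

# strip=True removes the whitespace around the text inside the tag.
price_text = span_price.get_text(strip=True)

# Drop the leading currency symbol, then go through float to reach int.
amount = int(float(price_text.lstrip("€")))
print(amount)
```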
Upvotes: 0
Reputation: 9
I recommend using price-parser: https://pypi.org/project/price-parser/
To install it, run pip install price-parser
>>> from price_parser import Price
>>> price = Price.fromstring("22,90 €")
>>> price
Price(amount=Decimal('22.90'), currency='€')
>>> price.amount # numeric price amount
Decimal('22.90')
>>> price.currency # currency symbol, as appears in the string
'€'
>>> price.amount_text # price amount, as appears in the string
'22,90'
>>> price.amount_float # price amount as float, not Decimal
22.9
Upvotes: 0