Reputation: 27
I am displaying prices of graphics cards from Newegg using web scraping. Some of the text I scrape includes unwanted text after the price. What is the most efficient way to display only the price and nothing more?
price_container = container.findAll("li", {"class": "price-current"})
price = price_container[0].text
if len(price) > 7:
The prices (the bit I want to keep) are never more than 7 characters long, so I thought I could remove the unwanted text using this if statement, but I'm not sure how, because each price is followed by a different amount of unwanted text.
Upvotes: 0
Views: 69
Reputation: 14
if len(price) > 7:
    price = price[:7]  # Reassign price to its first 7 characters, dropping whatever follows (assumes the price starts the string).
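Slicing by a fixed count only works if the price sits at the very start of the string and always has the same width. Here is a minimal sketch that instead takes the first whitespace-separated token starting with `$`. The helper name `first_price_token` and the sample text are made up for illustration; the real scraped text may look different.

```python
def first_price_token(text):
    """Return the first whitespace-separated token that starts with '$', or None."""
    for token in text.split():
        if token.startswith("$"):
            return token
    return None


# Made-up sample text, not actual Newegg output.
sample = "\n$299.99 (4 Offers)\n"
print(first_price_token(sample))
```

This does not depend on the price length, but it does assume there is whitespace between the price and whatever follows it.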
Upvotes: 0
Reputation: 44248
Use a regular expression:
import re
m = re.search(r'\$([\d.]+)', price)
if m:
    print(m.group(0))  # to include the dollar sign
    print(m.group(1))  # the amount without the dollar sign
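If you also need the value as a number, the same idea can be wrapped in a small function. This is only a sketch: `extract_price` and `PRICE_RE` are hypothetical names, and the pattern assumes US-style prices that may contain thousands separators such as `$1,299.99`, which the simpler pattern would cut off at the comma.

```python
import re

# Hypothetical pattern: a dollar sign, digits with optional commas,
# and an optional decimal part.
PRICE_RE = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)")


def extract_price(text):
    """Return the first dollar amount in text as a float, or None if absent."""
    m = PRICE_RE.search(text)
    if m is None:
        return None
    # Strip thousands separators before converting to float.
    return float(m.group(1).replace(",", ""))


# Made-up sample text, not actual Newegg output.
print(extract_price("$1,299.99 (2 Offers)"))
```

Returning `None` when nothing matches lets the caller decide how to handle listings without a price.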
Upvotes: 1
Reputation: 106
You can either use a regular expression, or split the string and extract the numbers from it. Example:
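[float(p.lstrip("$")) for p in price.split() if p.lstrip("$").replace(".", "", 1).isdigit()]  # Gives a list of the numbers in the string. Plain isdigit() would reject tokens such as "$299.99", so the dollar sign and one decimal point are ignored for the check.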
Perhaps not exactly what you are looking for, but hopefully it helps :)
Upvotes: 1