ch11nV11n

Reputation: 43

Converting string item in list to float

I am trying to convert the 'price' value in each record of my list from a string to an actual float in my output. Is this possible?

OUTPUT

{'name': 'ADA Hi-Lo Power Plinth Table', 'product_ID': '55984', 'price': '$2,849.00'}
{'name': 'Adjustable Headrest Couch - Chrome-Plated Steel Legs', 'product_ID': '31350', 'price': '$729.00'}
{'name': 'Adjustable Headrest Couch - Chrome-Plated Steel Legs (X-Large)', 'product_ID': '31351', 'price': '$769.00'}
{'name': 'Adjustable Headrest Couch - Hardwood Base (No Drawers)', 'product_ID': '65446', 'price': '$1,059.00'}
{'name': 'Adjustable Headrest Couch - Hardwood Base 2 Drawers', 'product_ID': '65448', 'price': '$1,195.00'}
{'name': 'Adjustable Headrest Couch - Hardwood Tapered Legs', 'product_ID': '31355', 'price': '$735.00'}
{'name': 'Adjustable Headrest Couch - Hardwood Tapered Legs (X-Large)', 'product_ID': '31356', 'price': '$775.00'}
{'name': 'Angeles Rest Standard Cot Sheets - ABC Print', 'product_ID': 'A31125', 'price': '$11.19'}

START OF PYTHON SCRIPT

import requests
from bs4 import BeautifulSoup
import sys

with open('recoveryCouches','r') as html_file:
    content= html_file.read()
    soup = BeautifulSoup(content,'lxml')
    allProductDivs = soup.find('div', class_='product-items product-items-4')
    nameDiv = soup.find_all('div',class_='name')
    prodID = soup.find_all('span', id='product_id')
    prodCost = soup.find_all('span', class_='regular-price')

    records=[]
     
    for i in range(len(nameDiv)):
        records.append({
            "name": nameDiv[i].find('a').text.strip(),
            "product_ID": prodID[i].text.strip(),
            "price": prodCost[i].text.strip()
            })

    for x in records:
        print(x)

Upvotes: 0

Views: 247

Answers (2)

Reinderien

Reputation: 15221

Naively stripping the currency symbol makes your code fragile and incompatible with internationalization (i18n). The general solution is a little complicated, but if you assume that the currency symbol is always a prefix and that it is the Canadian dollar symbol, then:

from locale import setlocale, LC_ALL, localeconv, atof
from decimal import Decimal
import re

setlocale(LC_ALL, ('en_CA', 'UTF-8'))

# ...

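# Drop all whitespace, then the locale's currency symbol ('$' for en_CA).
price_str = re.sub(r'\s', '', prodCost[i].text)
stripped = price_str.removeprefix(localeconv()['currency_symbol'])  # Python 3.9+
# atof() handles the locale's thousands separator and decimal point.
price = atof(stripped, Decimal)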

Also note that Decimal is a better representation of a currency than a float for most purposes.
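To illustrate the point about Decimal, here is a minimal sketch (the ten-cent prices are made-up values, not taken from the question) comparing a float sum with a Decimal sum:

```python
from decimal import Decimal

# Hypothetical example: three items priced at $0.10 each.
float_total = sum([0.10, 0.10, 0.10])
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0"))

# Binary floats cannot represent 0.10 exactly, so the float sum drifts.
print(float_total == 0.30)               # False
# Decimal keeps exact decimal digits, so the sum is exactly 0.30.
print(decimal_total == Decimal("0.30"))  # True
```

Note that the Decimal values are built from strings; constructing them from floats would carry the float's rounding error along.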

Upvotes: 0

imxitiz

Reputation: 3987

float() cannot parse a string that contains $ or , so remove both characters first and then convert.

You can use the re module to remove them in one step:

import re

for i in range(len(nameDiv)):
    records.append({
        "name": nameDiv[i].find('a').text.strip(),
        "product_ID": prodID[i].text.strip(),
        "price": float(re.sub(r"[$,]","",prodCost[i].text.strip()))
    })

Or, if every string starts with $, you can follow @Forest's comment:

float(price[1:].replace(',', ''))

Like this:

float(prodCost[i].text.strip()[1:].replace(",",""))
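As a quick self-contained check, the regex approach could be wrapped in a small helper and run on a few of the price strings from the question's output. The name parse_price is only an illustrative choice, and the sketch assumes prices always use $ and , as shown:

```python
import re

def parse_price(text):
    """Hypothetical helper: turn a string such as '$2,849.00' into a float."""
    # Remove the dollar sign and thousands separators, then convert.
    cleaned = re.sub(r"[$,]", "", text.strip())
    return float(cleaned)

# Sample price strings copied from the question's output.
samples = ['$2,849.00', '$729.00', '$11.19']
prices = [parse_price(s) for s in samples]
print(prices)  # [2849.0, 729.0, 11.19]
```

In the original loop, this would be used as "price": parse_price(prodCost[i].text).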

Upvotes: 1
