Reputation: 223
I am experimenting with lxml, requests and XPath in Python. I've searched quite a bit, but I can't find a solution to my problem.
I have the following simple piece of code:
from lxml import html
import requests
page = requests.get('http://url.tld/property-for-sale/')
tree = html.fromstring(page.content)
property_link = tree.xpath('//div[@class="search_result_title_box"]/h2/a/@href')
property_title = tree.xpath('//div[@class="search_result_title_box"]/h2/a/text()')
property_price = tree.xpath('//div[@class="info-box"]/strong[@class="price"]/text()')
print property_title
print property_price
I retrieve property_title and property_price and print them. However, property_price holds a currency field, and when I print it the result looks something like this:
[u'\u20ac225,000', u'\u20ac1,000,000', u'\u20ac245,000',.... etc.... ]
How can I resolve this so that I am either pulling numbers or formatted currency?
Thanks
Upvotes: 3
Views: 75
Reputation: 369274
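That is just how unicode strings inside a list are represented: printing a list shows the repr() of each item, so non-ASCII characters such as the euro sign appear as escape sequences. If you print each item individually, it will be printed as you want: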
>>> prices = [u'\u20ac225,000', u'\u20ac1,000,000', u'\u20ac245,000']
>>> print prices
[u'\u20ac225,000', u'\u20ac1,000,000', u'\u20ac245,000']
>>> for price in prices:
...     print price
...
€225,000
€1,000,000
€245,000
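If you want numbers rather than formatted strings, one option is to strip everything except the digits before converting. The following is only a rough sketch: parse_price is a hypothetical helper name, not something from lxml or requests, and it assumes every price is a whole amount written with a currency symbol and comma thousands separators, as in the list above. It would give wrong results for prices with decimal parts.

```python
# -*- coding: utf-8 -*-
import re


def parse_price(text):
    """Hypothetical helper: return the integer amount in a string like u'\u20ac225,000'."""
    # Remove every character that is not a digit (currency symbol, commas, spaces).
    digits = re.sub(r'[^0-9]', '', text)
    if not digits:
        raise ValueError('no digits found in %r' % (text,))
    return int(digits)


# Sample data copied from the question's output.
prices = [u'\u20ac225,000', u'\u20ac1,000,000', u'\u20ac245,000']
amounts = [parse_price(p) for p in prices]
print(amounts)  # [225000, 1000000, 245000]
```

In the question's code, the same list comprehension could be applied to property_price instead of the sample list.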
Upvotes: 1