andy

Reputation: 41

Can't decode Unicode

I have the following code:

import requests

results = requests.get("https://www.kimonolabs.com/api/ano64pm6?apikey=9ummN7C6KMHu9aErm49ixoy2ZySmaKCm").json()
mmoga = [x["price"] for x in results["results"]["collection1"]]
print mmoga

This outputs the following:

[u'\xa3\xa04.03', u'\xa3\xa06.02', u'\xa3\xa07.99', u'\xa3\xa09.96',
 u'\xa3\xa011.91', u'\xa3\xa013.84', u'\xa3\xa015.76', u'\xa3\xa017.67',
 u'\xa3\xa019.56', u'\xa3\xa029.24', u'\xa3\xa038.84', u'\xa3\xa048.38',
 u'\xa3\xa057.84', u'\xa3\xa067.23', u'\xa3\xa076.56', u'\xa3\xa085.81',
 u'\xa3\xa094.99', u'\xa3\xa0113.57', u'\xa3\xa0132.00', u'\xa3\xa0150.29',
 u'\xa3\xa0168.45', u'\xa3\xa0186.46', u'\xa3\xa0204.33', u'\xa3\xa0222.06',
 u'\xa3\xa0239.65', u'\xa3\xa0257.10', u'\xa3\xa0274.43']

I then try to get rid of all the letters using the following code:

while i < len(mmoga):
    mmoga[i] = mmoga[i].translate(None, 'absdefghijklmnopqrstuvwxyz;&£$')
    i += 1

This gives the error message

 translate() takes exactly one argument (2 given)

From some searching I think that this is due to the Unicode not being decoded, but I am very new to Python and all the solutions I have found are for Python 3.

Upvotes: 0

Views: 488

Answers (1)

Martijn Pieters

Reputation: 1121158

You successfully decoded the contents, but the \xa3 and \xa0 characters may be confusing you. These are simply the U+00A3 POUND SIGN and U+00A0 NO-BREAK SPACE characters. Printing a list shows the repr() of each element, and in Python 2 the repr() of a unicode string includes only printable ASCII characters literally; everything else is shown as an escape sequence.

Print one of those values individually:

>>> print u'\xa3\xa04.03'
£ 4.03
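
If you want to confirm which characters those escapes stand for without relying on your terminal's encoding, the standard unicodedata module can look up their names. This is only a small sketch; the variable names `sample` and `names` are made up for illustration:

```python
import unicodedata

# One of the values from the question's output.
sample = u'\xa3\xa04.03'

# Look up the official Unicode names of the first two characters.
names = [unicodedata.name(ch) for ch in sample[:2]]
# names is ['POUND SIGN', 'NO-BREAK SPACE']
```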

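The unicode.translate() method works differently from str.translate(): it takes a single argument, a mapping from Unicode ordinals (integers) to replacement ordinals, unicode strings, or None. There is no separate deletechars argument, which is why your two-argument call fails. To delete characters, map their ordinals to None in that dictionary.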
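
A minimal sketch of the mapping form of translate(), assuming you want to delete lowercase letters plus a few symbols. The names `unwanted` and `delete_map` and the exact character set are illustrative choices, not taken from the question:

```python
# Characters to delete; \xa3 is the pound sign, \xa0 the no-break space.
unwanted = u'abcdefghijklmnopqrstuvwxyz;&$\xa3\xa0'

# dict.fromkeys() maps every ordinal to None, which tells translate() to delete it.
delete_map = dict.fromkeys(map(ord, unwanted))

prices = [u'\xa3\xa04.03', u'\xa3\xa06.02']
cleaned = [p.translate(delete_map) for p in prices]
# cleaned is [u'4.03', u'6.02']
```

Building the mapping once and reusing it for every string avoids rebuilding the dictionary inside the loop.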

I'd use the strip() method here instead, which removes the given characters from both ends of the string:

>>> u'\xa3\xa04.03'.strip(u'\xa3\xa0')
u'4.03'
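
Applied to the whole list, a list comprehension can replace the while loop from the question. The sketch below uses a few hard-coded sample values instead of the live API response, and the final float conversion is an assumption about what you want to do with the prices next:

```python
# Sample values copied from the question's output (not fetched from the API).
prices = [u'\xa3\xa04.03', u'\xa3\xa06.02', u'\xa3\xa0274.43']

# Strip the pound sign and no-break space from both ends of each value.
stripped = [p.strip(u'\xa3\xa0') for p in prices]

# Optional: convert to numbers (use decimal.Decimal if exact arithmetic matters).
amounts = [float(p) for p in stripped]
```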

Upvotes: 2
