Filip Nilsson

Reputation: 115

UTF-8 string as key in dictionary causes KeyError

I have a dictionary with unicode strings as keys. When I try to access a value I get a KeyError, even though the key in the dictionary and my key print identically:

>>> test = "Byggår"
>>> key = raw_dict.keys()[7]
>>> print(test)
Byggår
>>> print(key)
Byggår
>>> test
'Bygg\xc3\xa5r'
>>> key
u'Bygg\xe5r'
>>> raw_dict[test]
Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd_exec.py", line 3, in Exec
  File "<input>", line 1, in <module>
KeyError: 'Bygg\xc3\xa5r'

They seem to be encoded differently somehow. From experimenting, it seems the key in the dictionary is encoded as octal bytes (?) http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=xc3+xa5&mode=obytes, whilst the key I use to access the value is encoded as hex (?) http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=xe5&mode=hex .

The keys in the dictionary are fetched from a web source, so I guess something gets messed up along the way.

Upvotes: 1

Views: 5735

Answers (1)

zmbq

Reputation: 39023

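Your test is a byte string (str) while key is a unicode string. See the u in front of it? The bytes \xc3\xa5 in test are the UTF-8 encoding of å, whereas \xe5 in key is the code point of å itself, so the two objects are not equal even though they print the same.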

You should either use Python 3, where all strings are unicode strings, or make sure to convert test to unicode before looking for it in the dictionary.
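A minimal sketch of the second option, assuming the bytes in test are UTF-8 (which the \xc3\xa5 sequence suggests). The contents of raw_dict and the value 1975 are made up for illustration; only the key comes from the question. The snippet is written so that it should run on both Python 2.7 and Python 3:

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for the dictionary fetched from the web source;
# the value 1975 is invented for this example.
raw_dict = {u"Bygg\xe5r": 1975}

# The same bytes that the Python 2 str literal "Byggår" held in the question.
test = b"Bygg\xc3\xa5r"

# Decode the UTF-8 bytes into a unicode string before using it as a key.
key = test.decode("utf-8")

# The decoded key now matches the dictionary key exactly.
value = raw_dict[key]
print(value)
```

If the bytes turn out not to be UTF-8, decode() raises UnicodeDecodeError, which at least makes the mismatch visible instead of surfacing later as a KeyError.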

Upvotes: 5
