Reputation: 33
I'm building a program that is able to replace characters in a message with characters the user has entered into a dictionary. Some of the characters are given in a text file. So, to import them, I used this code:
d = {}
with open("dictionary.txt") as f:
    for line in f:
        (key, val) = line.split()
        d[str(key)] = val
It works well, except it adds "ï»¿" to the start of the dictionary. The array of to-be-replaced text is called 'words'. This is the code I have for that:
for each in d:
    words = ";".join(words)
    words = words.replace(d[each], each)
    words = words.split(";")
print words
When I hit F5, however, I get a load of gobbledygook. Here's an example: \xef\xbb\xbf\xef\xbb\xbfA+/084&
I'm just a newbie at Python, so any help would be appreciated.
Upvotes: 0
Views: 143
Reputation: 11152
Make sure your dictionary file is saved as UTF-8. Notepad++ (Windows) has conversion functions if the file is not already UTF-8.
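The "ï»¿" pattern is what the bytes \xef\xbb\xbf (the UTF-8 byte order mark at the start of the file) look like when they are read as latin-1. You won't see it if the file is decoded as UTF-8 and the mark is stripped.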
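To make the point about the three leading bytes concrete, here is a small sketch that decodes them in different ways. It only uses the standard `codecs` module; the variable names are illustrative and not from the original post.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
bom_bytes = codecs.BOM_UTF8

# Misread as latin-1, each byte becomes one visible character.
as_latin1 = bom_bytes.decode("latin-1")

# Decoded as plain UTF-8, the same bytes form the single code point U+FEFF.
as_utf8 = bom_bytes.decode("utf-8")

# The "utf-8-sig" codec drops a leading byte order mark while decoding.
stripped = (bom_bytes + b"A").decode("utf-8-sig")

print(repr(as_latin1), repr(as_utf8), repr(stripped))
```

Note that decoding with plain "utf-8" still leaves an invisible U+FEFF at the front of the text; it is "utf-8-sig" that removes it.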
Then, instead of str(key), use key.encode("utf-8") to avoid possible other errors in the future.
If you want to know more, take a look at the Python Unicode HOWTO: http://docs.python.org/2/howto/unicode.html
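If re-saving the file is not an option, another possibility is to strip the mark while reading. The following is a minimal sketch, assuming each line of dictionary.txt holds one key and one value separated by whitespace, as in the question. The function name `load_dictionary` is hypothetical, and skipping blank or malformed lines is an added assumption, not something the question asked for.

```python
import io


def load_dictionary(path):
    """Read whitespace-separated key/value pairs into a dict.

    The "utf-8-sig" codec decodes UTF-8 and silently drops a leading
    byte order mark, so the first key is not polluted by it.
    """
    mapping = {}
    with io.open(path, encoding="utf-8-sig") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue  # assumption: ignore blank or malformed lines
            key, val = parts
            mapping[key] = val
    return mapping
```

Calling `d = load_dictionary("dictionary.txt")` would then replace the loading loop from the question. `io.open` is available in Python 2.6+ and Python 3; in Python 2 the keys and values come back as unicode objects rather than byte strings.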
Upvotes: 1