user1975126

Reputation: 51

How do I solve the UnicodeWarning issue?

I spent about four hours researching the "UnicodeWarning: Unicode unequal comparison" issue. Usually, after a few hours, I'm able to answer my trickiest questions by myself, but that wasn't the case here. And I mean "tricky" for myself, of course. ;-)

I know that similar questions are answered online and also on this site, but being too noob to understand the answer well doesn't help me at all. Maybe the best way for me to get it is just having someone point out what needs to be changed in my code.

I use Python 2.5 on Windows XP.

What I was able to figure out

I understand that my problem comes from trying to compare apples and oranges (Unicode and ASCII, or maybe Unicode and bytes, something like that). What I don't know is a practical way to solve this.

Here is my code:

# coding: iso-8859-1
import sys
from easygui import *

actual_answer = "pureté"
answer_given = enterbox("Type your answer!\n\nHint: 'pureté'")

if answer_given == actual_answer:
    msgbox("Correct! The answer is 'pureté'")
else:
    msgbox("Bug!")

Here is the error message I get:

UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

Upvotes: 2

Views: 3848

Answers (2)

Dayan

Reputation: 8031

Here's a function that re-encodes a Latin-1 byte string as UTF-8:

  def utf8(text):
      # Decode the Latin-1 bytes to unicode, then encode the result as UTF-8 bytes.
      return unicode(text, 'latin1').encode('utf-8')

Also, have you tried using unicode escapes?

print "puret\u00E9".decode("unicode_escape")

For example, you can apply this to your code like so:

# coding: iso-8859-1
import sys
from easygui import *

actual_answer = "puret\u00E9".decode("unicode_escape")
answer_given = enterbox("Type your answer!\n\nHint: " + actual_answer)

if answer_given == actual_answer:
    msgbox("Correct! The answer is " + actual_answer)
else:
    msgbox("Bug!")

Refer to the Python docs for more detailed information on Unicode escapes: http://docs.python.org/2/howto/unicode.html
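To see what the unicode_escape decode used in this answer produces, here is a small sketch. It is not part of the original answer. The backslash is doubled and the literal carries a b prefix only so that the snippet behaves the same on Python 2.6+ and Python 3. Both are my assumptions; on Python 2.5 you would drop the b prefix.

```python
# Illustrative sketch only; names are hypothetical.
# Doubled backslash + b prefix: the literal holds the ASCII characters
# p u r e t \ u 0 0 E 9, on both Python 2.6+ and Python 3.
escaped = b"puret\\u00E9"

# unicode_escape interprets the \uXXXX sequence and returns a unicode string.
decoded = escaped.decode("unicode_escape")

# The six characters \u00E9 collapse into the single code point U+00E9,
# so the result equals the unicode literal written with an escape.
same_as_literal = (decoded == u"puret\xe9")

print(len(escaped), len(decoded), same_as_literal)
```

Because the source text stays pure ASCII, this approach does not depend on the coding line at the top of the file.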

Upvotes: 0

jsbueno

Reputation: 110311

First, read this: http://www.joelonsoftware.com/articles/Unicode.html

Second, you should not really use the iso-8859-1 encoding for Python source files on any system; use utf-8 instead.

Third, your easygui component is returning a unicode object instead of a byte string. The easiest way to fix that in the above code is to make the actual_answer variable a unicode object too, by prefixing a "u" to the quotes, as in:

actual_answer = u"pureté"
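A minimal sketch of why the original comparison fails and two ways to make it succeed, assuming (as described above) that easygui hands back a unicode object. The variable typed is a hypothetical stand-in for the enterbox() result so the snippet runs without a GUI. The b prefix needs Python 2.6 or later; on 2.5 a plain string literal plays the same role.

```python
# Illustrative sketch only; names are hypothetical.
# Stand-in for what enterbox() returns: a unicode object (assumption).
typed = u"puret\xe9"

# The same word as ISO-8859-1 bytes, which is what a plain "..." literal
# holds in a Python 2 source file declared as iso-8859-1.
raw = b"puret\xe9"

# Comparing bytes with unicode directly never succeeds here:
# Python 2 attempts an implicit ASCII decode, fails on \xe9, warns and
# reports "unequal"; Python 3 treats the two types as unequal outright.
mismatch = (raw == typed)

# Fix 1: decode the bytes explicitly with the encoding they were written in.
decoded_match = (raw.decode("iso-8859-1") == typed)

# Fix 2: make the expected answer a unicode literal from the start.
literal_match = (u"puret\xe9" == typed)

print(mismatch, decoded_match, literal_match)
```

Either fix works; the u prefix is the smaller change when the expected answer is a literal in your own source file, while the explicit decode is the one to reach for when the bytes come from somewhere else, such as a file.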

Upvotes: 1
