raju

Reputation: 4978

How to compare two non-ASCII strings

I want to compare the string Technical Diploma (±12 years) with the same string as displayed in the browser. I am running a WebDriver test in Python that fetches Technical Diploma (±12 years) from a database and compares it with the text shown in the browser. When I compare the two, I get this warning:

UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

How do I compare these non-ASCII strings in Python?

Upvotes: 1

Views: 2667

Answers (2)

the wolf

Reputation: 35512

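Python is telling you the problem: one of the two values is a byte string that could not be converted to Unicode implicitly. Decode it to Unicode first (here, from UTF-8), then compare.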

Example:

>>> u1='Technical Diploma (±12 years)'
>>> u2=u'Technical Diploma (±12 years)'
>>> u1==u2
__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
>>> u1.decode('utf-8')==u2
True
>>> 
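The same idea can be wrapped in a small helper for the test itself. This is only a sketch: `to_text`, `db_value`, and `browser_text` are hypothetical names, and it assumes the database returns UTF-8 encoded bytes, which you would need to confirm for your own driver.

```python
def to_text(value, encoding='utf-8'):
    """Return value as Unicode text, decoding byte strings with the given encoding."""
    # On Python 2, bytes is an alias for str, so this catches plain byte strings.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Hypothetical values: UTF-8 bytes from the database, Unicode text from the browser.
db_value = b'Technical Diploma (\xc2\xb112 years)'
browser_text = u'Technical Diploma (\xb112 years)'

# Both sides are Unicode at this point, so the comparison is well defined.
same = to_text(db_value) == browser_text
```

Passing both values through the helper before comparing avoids the implicit ASCII decode that triggers the warning.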

Upvotes: 1

Martijn Pieters

Reputation: 1121176

One of your strings is not a unicode value, but a bytestring. You want to convert that to unicode by decoding it first:

'Non-ASCII value containing UTF8: \xc2\xb1'.decode('utf8')

but you will first have to figure out which encoding the bytestring actually uses.
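To illustrate why the encoding matters, here is a short sketch that decodes the same bytes two ways. The variable names are only for illustration; the bytes are the UTF-8 encoding of the ± sign.

```python
raw = b'\xc2\xb1'  # UTF-8 encoding of the plus-minus sign

as_utf8 = raw.decode('utf-8')      # one character: U+00B1
as_latin1 = raw.decode('latin-1')  # two characters: U+00C2 and U+00B1 (mojibake)

# The reverse mistake fails outright: the Latin-1 byte for the
# plus-minus sign is not a valid UTF-8 sequence on its own.
try:
    b'\xb1'.decode('utf-8')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

Decoding with the wrong codec either raises an error or silently produces different text, and in both cases the comparison with the browser string would still fail.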

If you have declared a source file encoding and you are defining the string as a literal in your code, make sure it is a Unicode literal by prefixing the string with u:

u'Technical Diploma (±12 years)'

I strongly recommend you read up on the Python Unicode HOWTO before you proceed, however.

Upvotes: 6
