Python2: Testing output of functions returning unicode strings

Question

I have a function which works with unicode internally, and I would like to test it using py.test. Currently, I have the following code:

def test_num2word():
    assert num2word(2320) == u"dva tisíce tři sta dvacet"

However, the assertion fails with:

E       assert u'dva tis\xed...i sta dvacet ' == u'dva tis\xc3\...9i sta dvacet'
E         - dva tis\xedce t\u0159i sta dvacet 
E         ?        ^    ^            -
E         + dva tis\xc3\xadce t\xc5\x99i sta dvacet
E         ?

As I understand it, my function correctly returns unicode, which is then compared to a UTF-8 encoded string, which (obviously) fails. Yet I thought using u"..." in my source would also convert the string to the same encoding used internally by Python.

My question is: is there a sane way of comparing these, or do I need to pepper each test statement with a decode('utf-8') (on the right-hand side) or an encode('utf-8') (on the left side)? Even if I write a wrapper function, this doesn't strike me as ideal -- there must be a way to compare this sanely! No, using Python 3 is not an option.
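
To make the mismatch in the assertion output concrete, here is a minimal sketch that reproduces the right-hand side of the failing comparison. It assumes (this is a guess, not something the traceback confirms) that the UTF-8 bytes of the u"..." literal in the test file were decoded as Latin-1, for example because the file has no PEP 263 declaration such as `# -*- coding: utf-8 -*-` or declares a different encoding. The names `intended`, `misread` and `repaired` are made up for illustration.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Intended text, written with escapes so this sketch does not depend on
# how the source file itself is decoded.
intended = u"dva tis\u00edce t\u0159i sta dvacet"

# Assumption: the literal's UTF-8 bytes were decoded as Latin-1, which
# turns every non-ASCII character into two separate code points.
misread = intended.encode("utf-8").decode("latin-1")

# Show the escaped form of the misread string for comparison with the
# right-hand side of the pytest failure.
print(repr(misread))
assert misread != intended

# Reversing the wrong decoding step recovers the intended text.
repaired = misread.encode("latin-1").decode("utf-8")
assert repaired == intended
```

If this is indeed the cause, both sides of the test are already unicode objects and the expected literal is the one that was decoded incorrectly, so no per-assertion decode() or encode() should be needed once the source encoding is declared. Note that the diff also appears to mark a trailing space in the value returned by num2word, so the comparison may still fail on that after the encoding issue is resolved.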
