frlan

Reputation: 7270

How to correctly test Unicode strings with Django and Python 2

I need to test whether the string representation of my Django models works with Unicode, since users might insert things like ü or ¼. To do so, I have this Django tests.py:

# -*- coding: utf-8 -*-
from django.conf import settings
from django.contrib.auth.models import User
from django.core.urlresolvers import reverse
from django.test import TestCase
from django.utils import timezone

from .models import *
from .views import *

class CategoryTestCase(TestCase):
    """ Test to check whether category name is printed correctly.
        If there is a parent, it should also be printed, separated by the delimiter. """

    def setUp(self):
        self.cat1 = Category.objects.create(name=u'Category 1')
        self.cat2 = Category.objects.create(name=u'Category ü', parent=self.cat1)
        self.cat3 = Category.objects.create(name=u'Category 3', parent=self.cat2)

    def test_category_name(self):
        cat_result1 = u'Category 1'
        cat_result2 = u'Category 1' + settings.PARENT_DELIMITER + u'Category ü'
        cat_result3 = u'Category 1' + settings.PARENT_DELIMITER + u'Category ü' + settings.PARENT_DELIMITER + u'Category 3'
        self.assertEqual(self.cat1.__str__(), cat_result1)
        self.assertEqual(self.cat2.__str__(), cat_result2)
        self.assertEqual(self.cat3.__str__(), cat_result3)

This is intended to test this little model:

#...
from django.utils.encoding import python_2_unicode_compatible
#....
@python_2_unicode_compatible
class Category(models.Model):
    """ Represents a category a part might belong to.
    E.g. resistor """

    name = models.CharField(
        max_length=50,
        help_text=_("Name of the category.")
    )
    parent = models.ForeignKey(
        "self",
        null=True,
        blank=True,
        help_text=_("If this is a subcategory, its parent.")
    )
    description = models.TextField(
        _("Description"),
        blank=True,
        null=True,
        help_text=_("A chance to summarize usage of category.")
    )

    def __str__(self):
        if self.parent is None:
            return ('{}'.format(self.name))
        else:
            return ('%s%s%s' % (
                self.parent.__str__(),
                settings.PARENT_DELIMITER,
                self.name)
            )

    def get_parents(self):
        """ Returns a list with the IDs of this category's parents, including itself. """
        result = []
        next = self
        while True:
            if next.id in result:
                # Interpolate after translating, so the message ID stays constant.
                raise CircleDetectedException(
                    _('There seems to be a circle inside ancestors of %s.') % self.id)
            else:
                result.append(next.id)
                if next.parent is not None:
                    next = next.parent
                else:
                    break
        return result

    def clean(self):
        pass

(I stripped it down a little.)

When I run the tests with Python 3, or run the application itself with Python 2 or 3, everything works. Only the test under Python 2 fails, so I assume something is wrong with how I am testing this. Judging by the error message, the Unicode string is not being encoded or decoded properly somewhere.

======================================================================
FAIL: test_category_name (partsmanagement.tests.CategoryTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/frlan/quellen/partuniverse/partuniverse/partsmanagement/tests.py", line 31, in test_category_name
    self.assertEqual(self.cat2.__str__(), cat_result2)
AssertionError: 'Category 1->Category \xc3\xbc' != u'Category 1->Category \xfc'

So my question is: how do I properly test Unicode representations with Django?

Upvotes: 3

Views: 2949

Answers (3)

hynekcer

Reputation: 15558

tl;dr: Use the same type on both sides of assertEqual.

The best code, portable across Python 2 and 3, needs no additions such as u'', unicode or foo.__str__(). Write less code and think more about which type is expected where in Python 2 and 3.

part A) fix the test

A short (ugly) solution is to convert the left side to text_type, since the right side already uses the explicit type u'...'. Avoid calling double-underscore methods such as __str__() directly; the code reads better without them. You can convert a value to text_type in several ways:

Replace the line in the test with

self.assertEqual(u'%s' % self.cat1, cat_result1)

or

self.assertEqual(u'{}'.format(self.cat1), cat_result1)

or

from django.utils.six import text_type
self.assertEqual(text_type(self.cat1), cat_result1)

A nicer solution is to unify the types in your module by putting from __future__ import unicode_literals at the beginning of it, because you work mostly with text, not binary data. Then you can remove every u'' prefix, although the prefixes are still useful until everything works in both versions.

part B) fix the __str__ method

Your code would fail if both the parent category name and the current name contain non-ASCII characters. Fix:

from __future__ import unicode_literals
# ... many lines

def __str__(self):
    if self.parent is None:
        return ('{}'.format(self.name))
    else:
        return ('%s%s%s' % (
            self.parent,
            settings.PARENT_DELIMITER,
            self.name)
        )

I only removed the __str__() call and added the future import, because models.py is the first module where it is especially useful. Without that import you would have to add the u prefix to both format strings here.

It helps to know what the python_2_unicode_compatible decorator does. The __str__ method you write should return text_type (unicode in Python 2). On Python 2 the decorator moves that method to __unicode__ and installs a __str__ that returns UTF-8 encoded bytes, so calling __str__() directly there gives you the bytes type. The formatting operator picks the method that matches the type of the format string, whereas any explicitly named method is wrong on either Python 3 or Python 2. Non-ASCII values of different types cannot be combined.
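To illustrate the decorator's effect without Django, here is a minimal sketch. The names py2_unicode_compatible and the plain Category class are stand-ins I made up for illustration, not Django's code, and the '->' delimiter is assumed from the traceback in the question. It shows that formatting with a text template yields the text representation even when both names are non-ASCII:

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2
# `unicode` is only evaluated on Python 2, where it exists.
text_type = unicode if PY2 else str  # noqa: F821

PARENT_DELIMITER = u'->'  # assumed value, taken from the traceback


def py2_unicode_compatible(klass):
    """Simplified stand-in for Django's python_2_unicode_compatible."""
    if PY2:
        # Keep the text-returning method available as __unicode__ ...
        klass.__unicode__ = klass.__str__
        # ... and make __str__ return UTF-8 encoded bytes instead.
        klass.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return klass


@py2_unicode_compatible
class Category(object):
    """Plain-Python stand-in for the model; no database involved."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def __str__(self):
        if self.parent is None:
            return u'{}'.format(self.name)
        # A text template asks self.parent for its text representation
        # on both Python 2 and Python 3; no explicit __str__() call.
        return u'%s%s%s' % (self.parent, PARENT_DELIMITER, self.name)


parent = Category(u'Kategorie \xfc')          # name ends in ü
child = Category(u'Kategorie \xbc', parent)   # name ends in ¼
label = text_type(child)
```

On Python 3 the decorator in this sketch does nothing, which is why the same class body works unchanged on both versions.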

Upvotes: 4

zsepi

Reputation: 1662

Have you tried using six.text_type?

six.text_type

Type for representing (Unicode) textual data. This is unicode() in Python 2 and str in Python 3.

Edit: you don't have to install six, since the needed utilities are available in django.utils.six. Thanks @hynekcer for pointing it out.

Six provides simple utilities for wrapping over differences between Python 2 and Python 3. It is intended to support codebases that work on both Python 2 and 3 without modification. six consists of only one Python file, so it is painless to copy into a project.

Upvotes: 1

Eugene Lisitsky

Reputation: 12855

According to the error message, the test compares a str object with a unicode object. That is generally not a good idea.

AssertionError: 'Category 1->Category \xc3\xbc' != u'Category 1->Category \xfc'
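The two values in that message hold the same text in different types: the left one is the UTF-8 encoded byte string, the right one is the Unicode text. A small sketch, independent of Django, showing that they match once both sides are converted to the same type (the variable names are only for illustration):

```python
# -*- coding: utf-8 -*-
expected = u'Category 1->Category \xfc'    # text, like cat_result2 in the test
actual = b'Category 1->Category \xc3\xbc'  # bytes, like __str__() on Python 2

decoded = actual.decode('utf-8')    # bytes -> text
encoded = expected.encode('utf-8')  # text -> bytes

same_text = decoded == expected
same_bytes = encoded == actual
```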

Try to:

  • compare unicode objects on both sides: self.assertEqual(u'%s' % self.cat1, cat_result1)
  • always work with Unicode, even if at the moment you are using only Latin-1 characters

Upvotes: 0
