Reputation: 16402
I have a web application that uses Flask, SQLAlchemy and WTForms, along with the necessary Flask extensions to make it all work. MySQL is using utf8_bin for all tables and columns.
I inserted some Chinese characters and phpMyAdmin displays them correctly but whenever I try to open a page I get the following exception:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)
I understand I should decode('utf8') the fields I want to display, but shouldn't this be handled by SQLAlchemy for me?
The only way I managed to make this work was by iterating through the list of results and doing something similar to:
object.property = object.property.decode('utf8')
But obviously this shouldn't have to be done by hand. What am I missing?
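The error message itself points at raw UTF-8 bytes being decoded with the ASCII codec somewhere along the way. A minimal sketch of that failure, assuming the driver hands back undecoded UTF-8 bytes (the sample character is an assumption, picked because its first UTF-8 byte is 0xe6):

```python
# UTF-8 bytes for one Chinese character, as an undecoded driver result might look.
raw = u"\u65e5".encode("utf-8")  # b'\xe6\x97\xa5'

try:
    # Decoding with the ASCII codec fails on the very first byte.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the codec the bytes were written in succeeds.
text = raw.decode("utf-8")
```

In Python 2 the ASCII decode happens implicitly whenever a byte string is combined with a unicode string, which is why the exception can surface far from the database code.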
Update: SQLAlchemy mapping
class Thread(db.Model):
    __tablename__ = 'Thread'
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.Unicode(255), nullable=False)
    body = db.Column(db.Text, nullable=True)
    date_created = db.Column(db.DateTime, nullable=False, default=datetime.now())
    created_by = db.Column(db.Integer, ForeignKey(User.id))
    user = relationship(User, backref='threads')
    display_hash = db.Column(db.Unicode(255), nullable=False, unique=True)
    display_name = db.Column(db.Unicode(255), nullable=True)
    nsfw = db.Column(db.Boolean, nullable=False, default=False)
    last_updated = db.Column(db.DateTime, nullable=False)

    def __init__(self, title=None, body=None, category_id=None, display_name=None):
        self.title = title
        self.body = body
        self.category_id = category_id
        self.display_name = display_name
        self.display_hash = custom_uuid()
        self.last_updated = self.date_created

    def __repr__(self):
        return u'<Thread %r>' % (self.title)

    def url_title(self):
        """ Generates an ASCII-only slug. """
        result = []
        for word in _punct_re.split(self.title.lower()):
            result.extend(unidecode(word).split())
        return unicode(u'-'.join(result))
Update: stack trace
127.0.0.1 - - [06/Oct/2013 02:37:15] "GET /index HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/homedirectory/Projects/Assorted/Fruit Show/app/views.py", line 90, in index
    return render_template('index.html', threads=threads, pagination=pagination)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/templating.py", line 128, in render_template
    context, ctx.app)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/flask/templating.py", line 110, in _render
    rv = template.render(context)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/jinja2/environment.py", line 969, in render
    return self.environment.handle_exception(exc_info, True)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/jinja2/environment.py", line 742, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/homedirectory/Projects/Assorted/Fruit Show/app/templates/index.html", line 1, in top-level template code
    {% extends 'base.html' %}
  File "/Users/homedirectory/Projects/Assorted/Fruit Show/app/templates/base.html", line 50, in top-level template code
    {% block content %}
  File "/Users/homedirectory/Projects/Assorted/Fruit Show/app/templates/index.html", line 14, in block "content"
    <a href="{{ url_for('new_thread') }}/{{ thread.display_hash|safe }}/{{ thread.url_title()|safe }}">{{ thread.title|safe }}</a>
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/jinja2/filters.py", line 747, in do_mark_safe
    return Markup(value)
  File "/Users/homedirectory/.virtualenvs/fruitshow/lib/python2.7/site-packages/markupsafe/__init__.py", line 72, in __new__
    return text_type.__new__(cls, base)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)
Update: URL for project repo:
https://github.com/ruipacheco/fruitshow
Upvotes: 6
Views: 2509
Reputation: 20500
Not quite an answer, but let me recommend ftfy (Fix Text For You), which fixes a bunch of small Unicode and HTML-escaping issues. One truly annoying religious war in Unicode encoding is the inability of UTF-8 to deal with the various one-byte character encodings such as Latin-1. Instead of just going "oh, that must be a simple Latin character", the decoder gets flustered. When your database driver makes the observation of "oh, this fits", it creates a fatwa.
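For a concrete picture of the kind of damage meant here, this is a rough stdlib-only sketch of UTF-8 text misread as Latin-1 and then repaired by reversing the wrong step. It is not how ftfy works internally, and the sample string is an assumption chosen for illustration:

```python
original = u"\u65e5\u672c"  # two CJK characters (sample data, assumed)
utf8_bytes = original.encode("utf-8")

# Misreading: every byte is a valid Latin-1 character, so this never raises,
# it just silently produces the wrong text.
garbled = utf8_bytes.decode("latin-1")

# Repair: undo the wrong decode, then decode with the right codec.
repaired = garbled.encode("latin-1").decode("utf-8")
```

The round trip only works when the wrong codec is known and lossless, which is the guesswork a tool like ftfy tries to automate.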
Upvotes: 0
Reputation: 16402
The problem is with the MySQL driver I'm using.
I followed this answer, and switching the column collation from utf8_bin to utf8_general_ci did the trick.
Upvotes: 4
Reputation: 6450
A little suggestion for the slug field in your models.
There is a library called WebHelpers (https://pypi.python.org/pypi/WebHelpers); its urlify function converts your title into a slug.
Install WebHelpers and then import urlify:
from webhelpers.text import urlify
.
.
.
    @property
    def slug(self):
        return urlify(self.title)
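If adding a dependency is not an option, a rough stdlib-only sketch of the same idea might look like this. The slugify name and the regex are hypothetical, and this is not WebHelpers' implementation:

```python
import re
import unicodedata

# Anything that is not a lowercase ASCII letter or digit becomes a separator.
_non_alnum_re = re.compile(r"[^a-z0-9]+")


def slugify(title):
    """Build an ASCII-only slug from a unicode title (hypothetical helper)."""
    # Decompose accented characters, then drop everything that is not ASCII.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of other characters into single hyphens and trim the ends.
    return _non_alnum_re.sub("-", ascii_text.lower()).strip("-")
```

Note that this drops characters with no ASCII decomposition, so a title made up entirely of Chinese characters yields an empty slug; a transliteration library such as the unidecode already used in the question handles that case better.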
Upvotes: 2
Reputation: 156128
Setting the charset in the connection parameters only tells MySQL to transcode columns from however they are stored in the database to the requested encoding. The data is still passed between MySQL and the client as bytes. In short, you have to tell SQLAlchemy that "this particular" data is Unicode data (in the connection's encoding). For most of your columns you have used Unicode, which serves this purpose. A notable exception is body, which is of type Text. You probably want UnicodeText or Text(convert_unicode=True).
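To illustrate the idea only (this is not SQLAlchemy's code), here is a small sketch of what declaring a column as Unicode amounts to: the bytes from the driver are decoded for the declared columns and left alone for the rest. The decode_row helper and the sample row are hypothetical:

```python
def decode_row(row, text_columns, encoding="utf-8"):
    """Return a copy of row with the named columns decoded from bytes to text."""
    decoded = dict(row)
    for name in text_columns:
        value = decoded.get(name)
        # Only raw bytes need decoding; None and already-decoded text pass through.
        if isinstance(value, bytes):
            decoded[name] = value.decode(encoding)
    return decoded


# A row as a driver might return it: every value is still UTF-8 bytes.
sample = u"\u65e5\u672c"
row = {"title": sample.encode("utf-8"), "body": sample.encode("utf-8")}

only_title = decode_row(row, ["title"])       # body left as bytes, like a plain Text column
both = decode_row(row, ["title", "body"])     # body decoded, like UnicodeText
```

In the mapping from the question, the equivalent change would be declaring body as db.UnicodeText instead of db.Text.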
Upvotes: 0