Vincent

Reputation: 1157

GAE Python: Cannot figure out how to store UTF-8 in StringProperty

I know that this topic has been addressed by many, but for some reason I cannot get UTF-8 encoding to work in my GAE app. I am retrieving a German string from an online form and then trying to store it in a StringProperty. The code looks as follows:

from google.appengine.ext import db
import webapp2

class Item(db.Model):
    value = db.StringProperty()

class ItemAdd(webapp2.RequestHandler):
    def post(self):
        item = Item()
        value = str(self.request.get(u'value'))
        item.value = value.encode('utf-8')
        item.put()

The error I get from this is:

File "C:\xxx", line 276, in post
value = str(self.request.get('value'))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 12: ordinal not in range(128)

Does anybody see what I am doing wrong?

UPDATE

The string I am retrieving is the following: "Dit is een länge". If I change the property type to TextProperty, everything works. However, I need to be able to filter on the property, so this doesn't solve the problem.

Upvotes: 1

Views: 525

Answers (2)

voscausa

Reputation: 11706

webapp2 takes care of UTF-8 for you. In your post handler, webapp2 gives you a MultiDict whose values have already been decoded from UTF-8 into unicode, so you do not have to do any conversion yourself. With a debugger you can find the MultiDict in self.request.POST.

class ItemAdd(webapp2.RequestHandler):

    def post(self):
        Item(value=self.request.POST['value']).put()

To get UTF-8 right, read this blog post, and never use str() on text like this! Your str() call tries to turn the unicode string into a byte string using the ASCII codec, which fails on "ä": http://blog.notdot.net/2010/07/Getting-unicode-right-in-Python
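To make the failure concrete, here is a minimal sketch that does not depend on App Engine or webapp2. It assumes the form value arrives as the unicode string from the question. In Python 2, str() on a unicode object implicitly encodes with the ASCII codec; the sketch does that step explicitly so it behaves the same on Python 2 and Python 3. The variable names are only illustrative.

```python
# -*- coding: utf-8 -*-
# The string from the question, written with an escape so the source stays ASCII.
text = u'Dit is een l\xe4nge'

# What Python 2's str(text) does under the hood: encode with the ASCII codec.
try:
    text.encode('ascii')
    ascii_failed = False
    failing_position = None
except UnicodeEncodeError as exc:
    ascii_failed = True
    failing_position = exc.start  # index of the first non-ASCII character

# Encoding explicitly as UTF-8 works, and decoding gives the original text back.
utf8_bytes = text.encode('utf-8')
round_tripped = utf8_bytes.decode('utf-8')
```

The ASCII step fails at the "ä", which matches the traceback in the question. The text is already unicode when it reaches the handler, so it can be assigned to the property as-is, without str() and without a manual encode.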

And with the python27 runtime you can start your source file with:

#!/usr/bin/python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

Upvotes: 2

eLRuLL

Reputation: 18799

When your Python script receives string data, you have to be careful that the encoding of the file is the same as the encoding of the data it receives. Maybe you should add this to the top of the file:

#!/usr/bin/python
# -*- coding: utf-8 -*-

Upvotes: -1
