Alberto Megía

Reputation: 2255

PYTHON: Appending unicode to list changes its type to str

This is my scenario: I have a list of Unicode strings. I receive a UTF-8 encoded string named 'input' and want to log it together with the rest of the elements in the list. So I decode it and get a unicode object, but when I append it to the list, its type changes to "str". What is happening here?

a_list = [u"ááááá", u"eééééée"]
#'input' is a utf8 str
obj = input.decode("utf-8")
log.debug(type(obj))
log.debug(obj)
a_list.append(obj)
for elem in a_list:
    log.debug(type(elem))

Log:

DEBUG - <type 'unicode'> # obj
<obj printed with its accented (Unicode) characters; shortened here for simplicity>
DEBUG - <type 'unicode'>
DEBUG - <type 'unicode'>
DEBUG - <type 'str'>   # ------> obj's type changed!!!

EDIT: Python 2.7.3

'input' is "request.data" from a request object in the Flask microframework.

Upvotes: 1

Views: 4396

Answers (1)

Lennart Regebro

Reputation: 172249

What you describe cannot happen with the builtin types. Here is code that reproduces your scenario:

# -*- coding: UTF-8 -*-
a_list = [u"ááááá", u"eééééée"]
input = '\xc3\x85 i \xc3\xa5a \xc3\xa4 e \xc3\xb6'
obj = input.decode("utf-8")
print type(obj)
print obj
a_list.append(obj)
for elem in a_list:
    print type(elem)

And here is the output:

<type 'unicode'>
Å i åa ä e ö
<type 'unicode'>
<type 'unicode'>
<type 'unicode'>

So at least one of the objects involved must be something other than a builtin type, and that is what makes this conversion happen.

Unless you simply add input to the list where you meant to add obj. :-)
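
To tell the two explanations apart, something like the following might help. It is only a sketch under assumptions: `raw` stands in for `request.data`, and `EncodingList` is a made-up class that illustrates what a non-builtin container could do; nothing in the question says such a class is actually in use. The code is written so that it runs on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch only. `raw` stands in for Flask's request.data (a UTF-8 byte string).
raw = b'\xc3\x85 i \xc3\xa5a \xc3\xa4 e \xc3\xb6'
obj = raw.decode("utf-8")  # text: unicode on Python 2, str on Python 3
text_type = type(obj)

# 1) A builtin list stores the very same object; append() never converts it.
a_list = [u"ááááá", u"eééééée"]
a_list.append(obj)
same_object = a_list[-1] is obj
all_text = all(type(elem) is text_type for elem in a_list)

# 2) Appending the undecoded bytes by mistake gives the log from the question.
mixed_up = [u"ááááá", u"eééééée"]
mixed_up.append(raw)
mixed_up_types = [type(elem) for elem in mixed_up]


# 3) Hypothetical non-builtin list that encodes text on append (made up for
#    illustration; not something known to exist in the asker's code).
class EncodingList(list):
    def append(self, item):
        if isinstance(item, text_type):
            item = item.encode("utf-8")
        list.append(self, item)


custom = EncodingList([u"ááááá", u"eééééée"])
custom.append(obj)
custom_types = [type(elem) for elem in custom]

print(same_object, all_text)
print(mixed_up_types)
print(custom_types)
```

Checking `a_list[-1] is obj` right after the append, and `type(a_list)` itself, should quickly show whether the list is a plain builtin list and whether the object you logged is the one you appended.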

Upvotes: 2
