Reputation: 3742
I call the __repr__() method on an object x as follows:
val = x.__repr__()
and then I want to store the val string in an SQLite database. The problem is that val should be unicode.
I tried this with no success:
val = x.__repr__().encode("utf-8")
and
val = unicode(x.__repr__())
Do you know how to correct this?
I'm using Python 2.7.2
Upvotes: 7
Views: 12064
Reputation: 3782
In Python 2, you can define two methods, __unicode__ and __repr__:
#!/usr/bin/env python
# coding: utf-8

class Person(object):
    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return u"Person info <name={0}>".format(self.name)

    def __repr__(self):
        return self.__unicode__().encode('utf-8')

if __name__ == '__main__':
    A = Person(u"皮特")
    print A
In Python 3, defining __repr__ alone is enough:
#!/usr/bin/env python
# coding: utf-8

class Person(object):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return u"Person info <name={0}>".format(self.name)

if __name__ == '__main__':
    A = Person(u"皮特")
    print(A)
Upvotes: 1
Reputation: 356
I was having a similar problem, because I was pulling the text out of a list using repr.
b = ['text\xe2\x84\xa2', 'text2']  # \xe2\x84\xa2 is the TM symbol
a = repr(b[0])
c = unicode(a, "utf-8")
print c
>>>
'text\xe2\x84\xa2'
I finally tried join to get the text out of the list instead:
b = ['text\xe2\x84\xa2', 'text2']  # \xe2\x84\xa2 is the TM symbol
a = ''.join(b[0])
c = unicode(a, "utf-8")
print c
>>>
text™
Now it works!!!!
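I tried several different ways. Every time I combined repr with the unicode function it did not work, because repr returns the escaped representation of the string (quotes and backslash sequences included) rather than the string itself. I have to use join, or declare the text directly as in variable e below.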
b = ['text\xe2\x84\xa2', 'text2']  # \xe2\x84\xa2 is the TM symbol
a = ''.join(b[0])
c = unicode(repr(a), "utf-8")
d = repr(a).decode("utf-8")
e = "text\xe2\x84\xa2"
f = unicode(e, "utf-8")
g = unicode(repr(e), "utf-8")
h = repr(e).decode("utf-8")
i = unicode(a, "utf-8")
j = unicode(''.join(e), "utf-8")
print c
print d
print e
print f
print g
print h
print i
print j
>>>
'text\xe2\x84\xa2'
'text\xe2\x84\xa2'
textâ„¢
text™
'text\xe2\x84\xa2'
'text\xe2\x84\xa2'
text™
text™
>>>
Hope this helps.
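The difference between the two approaches above comes down to what repr() returns: a printable, escaped description of the byte string, not the bytes themselves. Here is a minimal sketch of that, written so that it should run on both Python 2.7 and Python 3. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# The UTF-8 bytes for "text" followed by the trademark sign (U+2122).
raw = b'text\xe2\x84\xa2'

# Decoding the bytes themselves yields the intended text.
decoded = raw.decode('utf-8')

# repr() yields an ASCII-only description of the bytes: the backslash
# escapes are now literal characters, so decoding it changes nothing.
escaped = repr(raw)

print(decoded == u'text\u2122')  # True
print('\\xe2' in escaped)        # True: a literal backslash, 'x', 'e', '2'
```

This is why the outputs for c, d, g and h above still show \xe2\x84\xa2, while f, i and j show the symbol.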
Upvotes: 1
Reputation: 799210
The representation of an object should not be Unicode. Define the __unicode__ method and pass the object to unicode().
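A rough sketch of that advice follows. The Person class and its fields are hypothetical, and the small version check is my own addition so that the same file also runs on Python 3, where str plays the role of unicode and str() looks for __str__ instead of __unicode__.

```python
# -*- coding: utf-8 -*-
import sys

PY3 = sys.version_info[0] >= 3
if PY3:
    unicode = str  # assumption: on Python 3, str is the text type


class Person(object):  # hypothetical example class
    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        # Human-readable text; may contain any character.
        return u"Person info <name={0}>".format(self.name)

    if PY3:
        # Python 3 never calls __unicode__, so expose it as __str__.
        __str__ = __unicode__

    def __repr__(self):
        # Debugging representation, kept separate from the text form.
        return "Person(name={0!r})".format(self.name)


p = Person(u"\u76ae\u7279")
val = unicode(p)  # text, ready to be bound as a SQLite parameter
print(isinstance(val, type(u"")))  # True
```

The point of the split is that __repr__ stays a plain debugging aid, while the value that goes to the database comes from unicode(x).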
Upvotes: 16
Reputation: 29737
Either repr(x).decode("utf-8") or unicode(repr(x), "utf-8") should work.
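To tie this back to the SQLite part of the question, here is a minimal sketch using the standard sqlite3 module and an in-memory database. The table name reprs, the column name val and the sample object x are made up for illustration. The decode step assumes that __repr__ returns UTF-8 encoded bytes on Python 2; on Python 3, repr() already returns text and the step is skipped.

```python
import sqlite3

x = ['text', 42]  # stand-in for the object from the question

val = repr(x)
if isinstance(val, bytes):
    # Python 2: repr() returned a byte string, so decode it to text.
    val = val.decode('utf-8')

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE reprs (val TEXT)')
# Bind the value as a parameter rather than formatting it into the SQL.
conn.execute('INSERT INTO reprs (val) VALUES (?)', (val,))
conn.commit()

stored = conn.execute('SELECT val FROM reprs').fetchone()[0]
conn.close()

print(stored == val)  # True
```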
Upvotes: 8