Reputation: 1452
Data stored as Unicode in a database has to be retrieved and converted into a different form.
The following snippet
def convert(content):
    content = content.replace("ஜௌ", "n\[s");
    return content;

mydatabase = "database.db"
connection = sqlite3.connect(mydatabase)
cursor = connection.cursor()
query = ''' select unicode_data from table1'''
cursor.execute(query)
for row in cursor.fetchone():
    print convert(row)
raises the following error inside the convert function:
exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)
If the database content is "ஜௌஜௌஜௌ", the output should be "n\[sn\[sn\[s"
The documentation suggests passing errors='ignore' or errors='replace' when creating the unicode string to avoid the error.
When the iteration is changed as follows:
for row in cursor.fetchone():
    print convert(unicode(row, errors='replace'))
it returns
exceptions.TypeError: decoding Unicode is not supported
which indicates that row is already a unicode object.
Any light on this to make it work is highly appreciated. Thanks in advance.
Upvotes: 1
Views: 1558
Reputation: 536379
content = content.replace("ஜௌ", "n\[s");
I suggest you mean:
content = content.replace(u'ஜௌ', ur'n\[s');
or, to be safe when the encoding of your source file is uncertain:
content = content.replace(u'\u0B9C\u0BCC', ur'n\[s');
The content you have is already Unicode, so you should do Unicode string replacements on it. "ஜௌ" without the u prefix is a string of bytes that represents those characters in whatever encoding your source file uses. (Byte strings mix safely with Unicode strings only in the unambiguous case of pure ASCII; otherwise Python 2 tries to decode the bytes as ASCII, which is what raises the UnicodeDecodeError.)
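To see where the byte 0xe0 in the error message comes from, here is a minimal sketch. It assumes the source file was saved as UTF-8 (the question does not say), and it reproduces the implicit ASCII decode by hand, so it runs on Python 2 and 3 alike. The names text, raw, first_byte and message are only for illustration.

```python
# -*- coding: utf-8 -*-
# Assumption: the source file was UTF-8, so the un-prefixed literal
# held the UTF-8 bytes of the two characters.
text = u'\u0B9C\u0BCC'        # the two Tamil characters as a Unicode string
raw = text.encode('utf-8')    # the bytes a bare "..." literal would contain

# bytearray indexing yields integers on both Python 2 and Python 3
first_byte = hex(bytearray(raw)[0])

try:
    # Python 2 attempts this decode implicitly when a byte string
    # is mixed with a Unicode string
    raw.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(first_byte)
print(message)
```

The first byte is 0xe0, and the exception text names the same byte and position 0, as in the traceback from the question.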
(The r prefix makes it a raw string, so the bare backslash does not need escaping.)
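Putting it together, a rough end-to-end sketch might look like the following. The in-memory database and the inserted row are assumptions standing in for database.db; the table and column names are taken from the question. It writes u'n\\[s' instead of ur'n\[s' (the same four characters) only because the ur prefix does not exist in Python 3, so this version runs on both.

```python
# -*- coding: utf-8 -*-
import sqlite3


def convert(content):
    # Both arguments are Unicode strings, so no implicit ASCII decode happens.
    # u'n\\[s' is the same text as ur'n\[s': n, backslash, [, s.
    return content.replace(u'\u0B9C\u0BCC', u'n\\[s')


# Assumption: an in-memory database stands in for database.db.
connection = sqlite3.connect(':memory:')
cursor = connection.cursor()
cursor.execute('create table table1 (unicode_data text)')
cursor.execute('insert into table1 values (?)', (u'\u0B9C\u0BCC' * 3,))

cursor.execute('select unicode_data from table1')
# Each fetched row is a tuple; the text column is its first element.
results = [convert(row[0]) for row in cursor.fetchall()]
connection.close()

print(results[0])
```

With the three repeated pairs stored in the table, this prints n\[sn\[sn\[s, which is the output the question asks for.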
Upvotes: 2