Dr Hydralisk
Dr Hydralisk

Reputation: 33

Inserting unicode into sqlite?

I am still learning Python and as a little Project I wrote a script that would take the values I have in a text file and insert them into a sqlite3 database. But some of the names have weird letter (I guess you would call them non-ASCII), and generate an error when they come up. Here is my little script (and please tell me if there is anyway it could be more Pythonic): import sqlite3

f = open('complete', 'r')
fList = f.readlines()
conn = sqlite3.connect('tpb')
cur = conn.cursor()

for i in fList:
    exploaded = i.split('|')
    eList = (
        (exploaded[1], exploaded[5])
    )
    cur.execute('INSERT INTO magnets VALUES(?, ?)', eList)
    conn.commit()
cur.close()

And it generates this error:

Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\sortinghat.py", line 13, in <module>
    cur.execute('INSERT INTO magnets VALUES(?, ?)', eList)
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a te
xt_factory that can interpret 8-bit bytestrings (like text_factory = str). It is
highly recommended that you instead just switch your application to Unicode str
ings.

Upvotes: 3

Views: 3111

Answers (1)

mechanical_meat
mechanical_meat

Reputation: 169264

To get the file contents into unicode you need to decode from whichever encoding it is in.
It looks like you're on Windows so a good bet is cp1252.
If you got the file from somewhere else all bets are off.

Once you have the encoding sorted, an easy way to decode is to use the codecs module, e.g.:

import codecs
# ...
with codecs.open('complete', encoding='cp1252') as fin: # or utf-8 or whatever
  for line in fin:
    to_insert = (line.split('|')[1], line.split('|')[5])
    cur.execute('INSERT INTO magnets VALUES (?,?)', to_insert)
    conn.commit()
# ...

Upvotes: 4

Related Questions