Reputation: 3
I'm trying to write a little script that fills a SQLite table from a list of files saved in a text file. The code so far is this:
    import os
    import _sqlite3
    import sys

    print sys.path[0]
    mydir = sys.path[0]
    print (mydir)

    def listdir(mydir):
        lis=[]
        for root, dirs, files in os.walk(mydir):
            for name in files:
                lis.append(os.path.join(root,name))
        return lis

    filename = "list.txt"
    print ("writting in %s" % filename)
    file = open(filename, 'w' )
    for i in listdir(mydir):
        file.write(i)
        file.write("\n")
    file.close()

    con = _sqlite3.connect("%s/conection"%mydir)
    c=con.cursor()

    c.execute(''' drop table files ''')
    c.execute('create table files (name text, other text)')

    file = open(filename,'r')
    for line in file :
        a = 1
        for t in [("%s"%line, "%i"%a)]:
            c.execute('insert into files values(?,?)',t)
            a=a+1
    c.execute('select * from files')
    print c.fetchall()
    con.commit()
    c.close()
When I run it, I get the following:
    Traceback (most recent call last):
      File "C:\Users\josh\FORGE.py", line 32, in <module>
        c.execute('insert into files values(?,?)',t)
    ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
I've tried the built-in unicode() function, but it still doesn't work: it says it can't decode the character 0xed or something like that.
I know the problem is the encoding of the strings in the list, but I can't find a way to get them right. Any ideas? Thanks in advance!
Upvotes: 0
Views: 225
Reputation: 7033
(0). Please reformat your code.
(1). Inside the loop that starts with for line in file:, decode each line first, with something like line = line.decode('encoding-of-the-file'), where the encoding is something like utf-8 or iso-8859-1. You have to know your input encoding.
If you don't know the encoding, or don't care about a clean decoding, you can guess the most probable encoding and call line.decode('utf-8', 'ignore'), which omits every byte that cannot be decoded. You can also use 'replace', which replaces those bytes with the Unicode replacement character (\ufffd).
(2). Internally, and when communicating with the database, use only unicode objects, e.g. u'this is unicode'.
(3). Don't use file as a variable name; it shadows the built-in of the same name.
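To make the decoding options concrete, here is a small sketch of what strict decoding, 'ignore' and 'replace' do to a byte string that contains 0xed. The sample file name is made up for illustration, and the assumption that the real data is iso-8859-1 is only a guess based on the byte mentioned in the question. It uses a bytes literal so it should run on Python 2.6+ as well as Python 3.

```python
# Hypothetical sample line: "musica" spelled with an accented i, stored as
# the single byte 0xed, which is how iso-8859-1 encodes that character.
raw = b"m\xedsica.txt\n"

# Decoding with the matching encoding recovers the original character.
as_latin1 = raw.decode("iso-8859-1")

# Strict utf-8 decoding fails, because 0xed starts a multi-byte sequence
# that the following byte does not continue.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'ignore' silently drops the undecodable byte.
ignored = raw.decode("utf-8", "ignore")

# 'replace' substitutes U+FFFD so the damage stays visible.
replaced = raw.decode("utf-8", "replace")
```

Both 'ignore' and 'replace' lose information, so a stored path may no longer match the file on disk; decoding with the correct encoding is the only lossless choice.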
Also look here: Best Practices for Python UnicodeDecodeError
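Putting the advice together, a minimal sketch of the read-and-insert step might look like the following. It is not the asker's script: the temporary directory, the sample paths, the in-memory database and the iso-8859-1 encoding are all assumptions made to keep the example self-contained. It also uses the public sqlite3 module instead of _sqlite3, numbers the rows with enumerate, and strips the trailing newline, which are choices of this sketch rather than anything stated above.

```python
import io
import os
import sqlite3
import tempfile

# Hypothetical setup: write a small list file as iso-8859-1 bytes, standing
# in for the list.txt produced by the question's script.
workdir = tempfile.mkdtemp()
list_path = os.path.join(workdir, "list.txt")
with open(list_path, "wb") as out:
    out.write(u"C:/m\xedsica/a.mp3\nC:/docs/b.txt\n".encode("iso-8859-1"))

# An in-memory database keeps the sketch free of side effects (assumption).
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("drop table if exists files")
cur.execute("create table files (name text, other text)")

# io.open decodes while reading, so every line is already unicode text
# by the time it is handed to sqlite.
with io.open(list_path, "r", encoding="iso-8859-1") as listing:
    for number, line in enumerate(listing, 1):
        cur.execute("insert into files values (?, ?)",
                    (line.rstrip(u"\n"), u"%d" % number))
con.commit()

rows = cur.execute("select name, other from files order by rowid").fetchall()
con.close()
```

Naming the handle listing rather than file avoids shadowing the built-in, and "drop table if exists" avoids an error on the first run, when the table does not exist yet.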
Upvotes: 1