Reputation:
I know that executemany
can be used to conveniently add new entries to a database; useful to reduce the Python method-call overheads compared to single execute
s in a for loop. However, I am wondering if this could work with SQLite's UPDATE
.
More concretely, consider the following setup:
cnx = sqlite3.connect(DATABASE)
c = cnx.cursor()
for path in paths:
for data in some_computation(path):
c.execute("UPDATE TABLENAME SET cont=? WHERE id=?", (data[1], data[0]))
cnx.commit()
cnx.close()
I am not even sure if the approach below would be any faster (would have to benchmark it), but the problem is that it doesn't work, because I am doing it incorrectly I assume. Any tips to use executemany
in the code snippet below to accomplish the task that I posted above?
cnx = sqlite3.connect(DATABASE)
c = cnx.cursor()
for path in paths:
data_ids, data_conts = [], []
for data in some_computation(path):
if len(data_ids) >= CHUNKSIZE:
c.executemany("UPDATE TABLENAME SET cont=? WHERE id=?", (data_conts, data_ids))
cnx.commit()
data_ids, data_conts = [], []
data_ids.append(data[0])
data_conts.append(data[1])
c.executemany("UPDATE TABLENAME SET cont=? WHERE id=?", (data_conts, data_ids))
cnx.commit()
cnx.commit()
cnx.close()
Many thanks for tips and insights!
EDIT 1:
The problem with the bottom example:
ProgrammingError: Incorrect number of bindings supplied. The current statement uses 2, and there are 50000 supplied.
(where CHUNKSIZE=50000)
EDIT 2:
The same error occurs
cnx = sqlite3.connect(DATABASE)
c = cnx.cursor()
for path in paths:
data_conts = []
for data in some_computation(path):
if len(data_ids) >= CHUNKSIZE:
c.executemany("UPDATE TABLENAME SET cont=? WHERE id=?", (data_conts,))
cnx.commit()
data_conts = []
data_conts.append([data[1], data[0]])
c.executemany("UPDATE TABLENAME SET cont=? WHERE id=?", (data_conts,))
cnx.commit()
cnx.commit()
cnx.close()
but thanks to @falsetru I then noticed my error, It should be
... WHERE id=?", data_conts)
and not
... WHERE id=?", (data_conts,))
Upvotes: 14
Views: 21133
Reputation: 2072
what you have is perfect except that you should use zip(conts,ids) where conts and ids are lists. This does the rearrangement automatically for you.
Upvotes: 0
Reputation: 369494
You need to pass a sequence of sequences ([[cont,id], [cont,id], [cont,id], ...]
, not [cont, cont, cont, ...], [id, id, id, ..]
):
for path in paths:
params = []
for data in some_computation(path):
if len(data_ids) >= CHUNKSIZE:
c.executemany("UPDATE TABLENAME SET cont=? WHERE id=?", params)
cnx.commit()
params = []
params.append([data[1], data[0]])
if params:
c.executemany("UPDATE TABLENAME SET cont=? WHERE id=?", params)
cnx.commit()
Upvotes: 25