Reputation: 121
I can do very efficient bulk inserts in SQLite3 with Python (2.7) using this code:
cur.executemany("INSERT INTO " + tableName + " VALUES (?, ?, ?, ?);", data)
But I can't get updates to work efficiently. I thought it might be a problem with the database structure or indexing, but even on a test database with only one table of 100 rows, the update still takes about 2-3 seconds.
I've tried different code variations. The latest code I have is from this answer to a previous question about update and executemany, but it's just as slow for me as any other attempt I've made:
data = []
for s in sources:
    source_id = s['source_id']
    val = get_value(s['source_attr'])
    x = [val, source_id]
    data.append(x)
cur.executemany("UPDATE sources SET source_attr = ? WHERE source_id = ?", data)
con.commit()
How could I improve this code to do a big bulk update efficiently?
Upvotes: 7
Views: 3212
Reputation: 180020
When inserting a record, the database just appends the new row at the end of the table (unless you have something like UNIQUE constraints to check).
When updating a record, the database must first find the row. Unless you have an index on the search column, this means scanning through the entire table for each command:
CREATE INDEX whatever ON sources(source_id);
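For example, here is a minimal sketch of creating that index from Python's sqlite3 module and then rerunning the question's bulk update (the file name, index name, and sample data are hypothetical):

import sqlite3

con = sqlite3.connect("sources.db")  # hypothetical database file
cur = con.cursor()

# One-time setup: with the index, each WHERE source_id = ? lookup is a
# B-tree search instead of a full table scan per updated row.
cur.execute("CREATE INDEX IF NOT EXISTS idx_sources_source_id"
            " ON sources(source_id)")

data = [("new value", 1), ("other value", 2)]  # (source_attr, source_id) pairs
cur.executemany("UPDATE sources SET source_attr = ? WHERE source_id = ?", data)
con.commit()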
But if source_id is the primary key, you should just declare it as such (which creates an implicit index):
CREATE TABLE sources(
    source_id INTEGER PRIMARY KEY,
    source_attr TEXT,
    [...]
);
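As a rough illustration, here is a self-contained sketch (in-memory database, made-up sample data) of the same executemany update against a table whose source_id is declared as INTEGER PRIMARY KEY; in SQLite such a column is an alias for the rowid, so each row is located by a direct key seek rather than a scan:

import sqlite3

con = sqlite3.connect(":memory:")  # in-memory database, for illustration only
cur = con.cursor()
cur.execute("""CREATE TABLE sources(
    source_id INTEGER PRIMARY KEY,
    source_attr TEXT
)""")

# Made-up sample rows.
cur.executemany("INSERT INTO sources VALUES (?, ?)",
                [(i, "old") for i in range(100)])

# Same bulk update as in the question; each row is now found via the key.
data = [("new", i) for i in range(100)]
cur.executemany("UPDATE sources SET source_attr = ? WHERE source_id = ?", data)
con.commit()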
Upvotes: 1