Reputation: 155
Lets say I have a database table which consists of three columns: id
, field1
and field2
. This table may have anywhere between 100 and 100,000 rows in it. I have a python script that should insert 10-1,000 new rows into this table. However, if the new field1
already exists in the table, it should do an UPDATE
, not an INSERT
.
Which of the following approaches is more efficient?
SELECT field1 FROM table
(field1
is unique) and store that in a list. Then, for each new row, use list.count()
to determine whether to INSERT
or UPDATE
SELECT count(*) FROM table WHERE field1="foo"
then either the INSERT
or UPDATE
.In other words, is it more efficient to perform n+1 queries and search a list, or 2n queries and get sqlite to search?
Upvotes: 2
Views: 3031
Reputation: 63616
You appear to be comparing apples with oranges.
A python list is only useful if your data fit into the address-space of the process. Once the data get big, this won't work any more.
Moreover, a python list is not indexed - for that you should use a dictionary.
Finally, a python list is non-persistent - it is forgotten when the process quits.
How can you possibly compare these?
Upvotes: 0
Reputation: 702
I'm not familiar with sqlite but a general approach like this should work:
If there's a unique index on field1
and you're trying to insert a value that's already there you should get an error. If insert fails, you go with the update.
Pseudocode:
try
{
insert into table (value1, value2)
}
catch(insert fails)
{
update table set field2=value2 where field1=value1
}
Upvotes: 1
Reputation: 1825
If I understand your question correctly, it seems like you could simply use SQLite's built in conflict handling mechanism.
Assuming you have a UNIQUE constraint on field1, you could simple use:
INSERT OR REPLACE INTO table VALUES (...)
The following syntax is also supported (identical semantics):
REPLACE INTO table VALUES (...)
EDIT: I realise that I am not really answering your question, just providing an alternative solution which should be faster.
Upvotes: 9
Reputation: 104090
I imagine using a python dictionary would allow for much faster searching than using a python list. (Just set the values to 0, you won't need them, and hopefully a '0' stores compactly.)
As for the larger question, I'm curious too. :)
Upvotes: 0