I have created a very basic script that periodically writes some data into a database:
test.py
import sqlite3
import sys
import time
DB_CREATE_TABLE = 'CREATE TABLE IF NOT EXISTS items (item TEXT)'
DB_INSERT = 'INSERT INTO items VALUES (?)'
FILENAME = 'test.db'
def main():
    index = 0
    c = sqlite3.connect(FILENAME)
    c.execute(DB_CREATE_TABLE)
    c.commit()
    while True:
        item = '{name}_{index}'.format(name=sys.argv[1], index=index)
        c.execute(DB_INSERT, (item,))
        c.commit()
        time.sleep(1)
        index += 1
    c.close()

if __name__ == '__main__':
    main()
Now I can achieve simple concurrency by running the script several times:
python3 test.py foo &
python3 test.py bar &
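For reference, the rows the two instances write can be read back with a short query script (a minimal sketch, assuming the table layout above):

import sqlite3

# Read back everything the running instances have inserted so far.
c = sqlite3.connect('test.db')
for (item,) in c.execute('SELECT item FROM items'):
    print(item)
c.close()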
I have tried to read some articles about scripts writing into the same database file at the same time, but I'm still not sure how my script will handle such an event, and I haven't found a way to test it.
My expectation is that in the unlikely event that the two instances of my script try to write to the database in the same millisecond, the later one will simply wait silently until the earlier one finishes its job.
Does my current implementation meet this expectation? If it does not, how does it behave in such a case, and how can I fix it?
This script meets the expectations.
In the unlikely event that two script instances try to write at the same time, the first one locks the database and the second one silently waits until the first one finishes its transaction, so that the database is unlocked for writing again.
More precisely, the second instance waits for up to 5 seconds (by default) and then raises an OperationalError with the message database is locked. As @roganjosh commented, this behavior is actually specific to the Python SQLite wrapper. The documentation states:
When a database is accessed by multiple connections, and one of the processes modifies the database, the SQLite database is locked until that transaction is committed. The timeout parameter specifies how long the connection should wait for the lock to go away until raising an exception. The default for the timeout parameter is 5.0 (five seconds).
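If the default 5 seconds are too short for a given workload, the connection can be told to wait longer via the timeout parameter of sqlite3.connect. A minimal sketch (the 30-second value is an arbitrary choice for illustration):

import sqlite3

# Wait up to 30 seconds (instead of the default 5) for a competing
# write transaction to release the lock before raising an error.
c = sqlite3.connect('test.db', timeout=30.0)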
To demonstrate the collision of the two instances, I modified the main function:
def main():
    c = sqlite3.connect(FILENAME)
    c.execute(DB_CREATE_TABLE)
    c.commit()
    print('{} {}: {}'.format(time.time(), sys.argv[1], 'trying to insert ...'))
    try:
        c.execute(DB_INSERT, (sys.argv[1],))
    except sqlite3.OperationalError as e:
        # Raised when the lock is still held after the timeout expires.
        print('{} {}: {}'.format(time.time(), sys.argv[1], e))
        return
    # Keep the transaction open to hold the write lock.
    time.sleep(int(sys.argv[2]))
    c.commit()
    print('{} {}: {}'.format(time.time(), sys.argv[1], 'done'))
    c.close()
The documentation states that the database is locked until the transaction is committed, so simply sleeping during the transaction should be enough to test it.
We run the following command:
python3 test.py first 10 & sleep 1 && python3 test.py second 0
The first instance starts, and after 1s the second instance starts. The first instance opens a 10s long transaction, during which the second one tries to write to the database, waits, and then raises the exception. The log demonstrates that:
1540307088.6203635 first: trying to insert ...
1540307089.6155508 second: trying to insert ...
1540307094.6333485 second: database is locked
1540307098.6353421 first: done
We run the following command:
python3 test.py first 3 & sleep 1 && python3 test.py second 0
The first instance starts, and after 1s the second instance starts. The first instance opens a 3s long transaction, during which the second one tries to write to the database and waits. Since it started 1s later, it only has to wait 3s - 1s = 2s, which is less than the default 5s, so both transactions finish successfully. The log demonstrates that:
1540307132.2834115 first: trying to insert ...
1540307133.2811155 second: trying to insert ...
1540307135.2912169 first: done
1540307135.3217440 second: done
The time needed for a transaction to finish is usually much smaller (milliseconds) than the lock timeout (5s), so in this scenario the script indeed meets the expectations. But as @HarlyH. commented, the transactions wait in a queue to be committed, so for a heavily used or very large database this is not a good solution, since communication with the database will become slow.
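If contention does become a problem, one common mitigation is to enable SQLite's write-ahead logging and to retry a failed write a few times instead of giving up on the first error. This is a sketch of that idea, not part of the original question; insert_with_retry and its retries/delay parameters are hypothetical names:

import sqlite3
import time

FILENAME = 'test.db'                       # same constants as the script above
DB_INSERT = 'INSERT INTO items VALUES (?)'

def insert_with_retry(item, retries=5, delay=0.1):
    # Hypothetical helper: retry the write a few times instead of
    # failing on the first "database is locked" error.
    c = sqlite3.connect(FILENAME)
    # WAL mode lets readers proceed while another connection writes;
    # writers still take turns, but lock contention is reduced.
    c.execute('PRAGMA journal_mode=WAL')
    try:
        for attempt in range(retries):
            try:
                c.execute(DB_INSERT, (item,))
                c.commit()
                return
            except sqlite3.OperationalError:
                time.sleep(delay)
        raise sqlite3.OperationalError('gave up after {} retries'.format(retries))
    finally:
        c.close()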