Reputation: 83
I'm attempting to parallelize a python script by using a "tasks" table in mariadb where each row describes a task and each running instance of my script does something like this simplified python + peewee code:
with db.transaction() as txn:
result = list(Table.raw("select id from tasks where started=0 limit 1"))
if len(result) == 0:
sys.exit("nothing left to do")
my_next_task = result[0]
# mark this task as started
Table.raw("update tasks set started=1 where id=%s" % my_next_task.id)
# process my_next_task
However, when I start 2 instances of my script simultaneously, they both start working on the same task. Am I misunderstanding how transactions are supposed to work in this case?
Upvotes: 1
Views: 261
Reputation: 893
Try this. The following code is guaranteed to have only 1 thread acquire the lock before executing the task
1.The fact that you are doing a transaction, makes sure only 1 thread can do an update at a time
2.Note that, we need to check for started = 0. This makes sure only 1 thread can do the update
with db.transaction() as txn:
result = list(Table.raw("select id from tasks where started=0 limit 1"))
// The update query should return number of rows updated. So if a thread updates the task, it should return 1, which means it acquired a lock. If not, it means the thread didn't acquire the lock
got_lock = Table.raw("update tasks set started=1 where id=%s and started = 0" % my_next_task.id) > 0
if len(result) == 0:
sys.exit("nothing left to do")
my_next_task = if(got_lock) result[0] else return
Upvotes: 1