Reputation: 1205
Imagine a quite complex Django application with both frontend and backend parts. Some users modify some data on the frontend part. Some scripts modify the same data periodically on the backend part.
Example:
instance = SomeModel.objects.get(...)
# (long-running part where various fields are changed, takes from 3 to 20 seconds)
instance.field = 123
instance.another_field = 'abc'
instance.save()
If somebody (or something) changes the instance while that part is still running, those changes will be lost: the instance is saved later, and save() writes back the stale data held in the Python (Django) object. In other words, if some code reads data, waits for a while, and then saves it back, only the last 'saver' keeps its data; all the earlier ones lose their changes.
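To make the failure mode concrete, here is a minimal simulation of it without Django. The dict `db` and the helpers `load` and `save` are hypothetical stand-ins for the table row, `objects.get()` and a plain `save()` that writes every field; the `title` field is invented for illustration.

```python
import copy

# Hypothetical stand-in for one database row.
db = {"field": 1, "another_field": "old", "title": "original"}


def load():
    """Return an in-memory copy of the row, like objects.get()."""
    return copy.copy(db)


def save(instance):
    """Write every field back, like a plain save() without update_fields."""
    db.update(instance)


# The backend script reads the row and starts its long-running work.
script_copy = load()

# Meanwhile a frontend user edits a different field and saves.
user_copy = load()
user_copy["title"] = "edited by user"
save(user_copy)

# The script finishes and saves its stale copy, including the old title.
script_copy["field"] = 123
script_copy["another_field"] = "abc"
save(script_copy)

print(db["title"])  # the user's edit is gone
```

The user touched only `title` and the script touched only the other two fields, yet the user's change is still lost because the script wrote back its whole stale copy.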
It's a "high-load" app, the database load (we use Postgres) is quite high and I'd like to avoid anything that would cause a significant increase of the DB activity or memory taken.
Another issue - we have many signals attached, and even the save() method overridden, so I'd like to avoid anything that might break the signals or might be incompatible with custom save() or update() methods.
What would you recommend in this situation? Any special app for that? Transactions? Anything else?
Thank you!
Upvotes: 0
Views: 519
Reputation: 9978
The correct way to protect against this is to use select_for_update to make sure that the data doesn't change between reading and writing. However, this keeps the row locked against updates for the whole time, so it might slow down your application significantly.
One solution might be to read the data and perform your long-running task first. Then, before saving, start a transaction, read the data again, this time with select_for_update, and verify that the original data hasn't changed. If the data is still the same, you save. If the data has changed, you abort and re-run the long-running task. That way you hold the lock for as short a time as possible.
Something like:
from django.db import transaction

success = False
while not success:
    instance1 = SomeModel.objects.get(...)
    # (long-running part)
    with transaction.atomic():
        # Re-read the row under a lock; the lock is held until the atomic block exits
        instance2 = SomeModel.objects.select_for_update().get(...)
        # Compare the relevant data from instance1 vs instance2, e.g.:
        unchanged = (instance1.field == instance2.field)
        if unchanged:
            # (make the changes on instance2)
            instance2.field = 123
            instance2.another_field = 'abc'
            instance2.save()
            success = True
Whether this is a viable approach depends on what exactly your long-running task is. Also note that a user might still overwrite the data you save here.
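The loop above retries forever if the row keeps changing, so you may want to cap the number of attempts. Here is a rough, Django-free sketch of the same re-check-then-save pattern with a retry limit. Everything in it is an assumption for illustration: `row` stands in for the database row, `row_lock` for the lock taken by select_for_update, and `update_with_recheck`, `slow_task` and `max_attempts` are made-up names.

```python
import threading

row = {"field": 1, "another_field": "old"}  # hypothetical stand-in for the row
row_lock = threading.Lock()                 # stand-in for the row lock


def read_row():
    """Return a snapshot of the current row."""
    return dict(row)


def update_with_recheck(compute, max_attempts=5):
    """Run compute() without the lock, then save only if the row is unchanged."""
    for attempt in range(1, max_attempts + 1):
        snapshot = read_row()
        changes = compute(snapshot)      # long-running part, no lock held
        with row_lock:                   # short critical section
            if read_row() == snapshot:   # nobody changed the row meanwhile
                row.update(changes)
                return attempt
    raise RuntimeError("row kept changing; giving up")


calls = {"n": 0}


def slow_task(snapshot):
    """Pretend to work; on the first call another writer changes the row."""
    calls["n"] += 1
    if calls["n"] == 1:
        with row_lock:
            row["another_field"] = "changed elsewhere"
    return {"field": snapshot["field"] + 1}


attempts = update_with_recheck(slow_task)
print(attempts, row)
```

The first attempt is discarded because the row changed during the long-running part; the second attempt sees the new data and saves on top of it, so the other writer's change survives.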
Upvotes: 2