Svan

Reputation: 111

Work on a Django database without modifying it

I'm developing optimization algorithms that operate on data stored in a Postgres database accessed through Django. My algorithms have to repeatedly modify the objects in the database and sometimes revert the changes (they are metaheuristic algorithms, for those who know them). The problem is that I don't want to save the modifications to the Postgres database during the process. I would like to save them only at the end, when I'm satisfied with the results of the optimization. I think the solution is to load all the objects concerned into memory, work on them, and save the in-memory objects to the database at the end.

However it seems to be more difficult than I thought...

Indeed, when I make a Django query (e.g. model1.objects.get or model.objects.filter), I fear that Django will sometimes fetch the objects from the database and sometimes from its cache, and I'm pretty sure that in some cases they will not be the same instances as the ones I manually loaded into memory (which are the ones I want to work on, because they may have changed since they were loaded from the database).

Is there a way to avoid such problems?

I implemented a kind of custom mini-database, which works, but it's becoming too difficult to maintain and, above all, I don't think it's the simplest or most elegant way to proceed. I also thought of dumping the relevant models from the Postgres database into an in-memory one (for performance), working on that in-memory database and, when my algorithm finishes, updating the original database from the in-memory data. That would require Django to keep a link between the original objects and those in the in-memory database, perhaps through the pk, to identify which ones are the same, and I don't know if that's possible.

Does anyone have any insight?

Thank you in advance.

Upvotes: 1

Views: 73

Answers (1)

e4c5

Reputation: 53734

What you are looking for is transactions, one of the most powerful features of an RDBMS. Simply issue START TRANSACTION before you start playing around with the data. At the end, if you are happy with the result, issue COMMIT. If you don't want your Django app to see the changes, issue ROLLBACK.
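As a rough illustration of that begin / commit / rollback cycle, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server; the `item` table, its `cost` column and the helpers `make_db`, `current_cost` and `run_trial` are made up for the example and are not part of the question.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with one row (hypothetical schema)."""
    # isolation_level=None means the module never opens transactions for us,
    # so BEGIN / COMMIT / ROLLBACK below are issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, cost INTEGER)")
    conn.execute("INSERT INTO item (id, cost) VALUES (1, 10)")
    return conn


def current_cost(conn):
    """Read the cost of item 1 as the connection currently sees it."""
    return conn.execute("SELECT cost FROM item WHERE id = 1").fetchall()[0][0]


def run_trial(conn, new_cost, keep):
    """Apply a tentative change inside a transaction, then keep or discard it."""
    conn.execute("BEGIN")
    conn.execute("UPDATE item SET cost = ? WHERE id = 1", (new_cost,))
    seen_inside = current_cost(conn)  # the uncommitted value is visible here
    conn.execute("COMMIT" if keep else "ROLLBACK")
    return seen_inside


conn = make_db()
run_trial(conn, 50, keep=False)   # rejected move: the row goes back to 10
run_trial(conn, 20, keep=True)    # accepted move: the row stays at 20
print(current_cost(conn))
```

In a Django project the same pattern would normally be written with `django.db.transaction.atomic()` around the ORM calls rather than raw SQL, so the optimization code can keep using `Model.objects` queries inside the transaction.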

Because of PostgreSQL's default transaction isolation level, your Django app will not see the changes you make elsewhere until they are committed. At the same time, whatever changes you make in your SQL console or in other code will be visible to that same session even though they are not yet committed. The PostgreSQL documentation describes it as follows:

Read Committed is the default isolation level in PostgreSQL. When a transaction uses this isolation level, a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions. In effect, a SELECT query sees a snapshot of the database as of the instant the query begins to run. However, SELECT does see the effects of previous updates executed within its own transaction, even though they are not yet committed.
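The visibility rule can be observed with two connections. The sketch below again uses sqlite3 as a stand-in (SQLite's isolation model is not PostgreSQL's Read Committed, but it shows the same two effects: a session sees its own uncommitted changes, and another session does not). The file name, the `item` table and the function `visibility_demo` are assumptions made for the example.

```python
import os
import sqlite3
import tempfile


def visibility_demo():
    """Return (writer_view, reader_before_commit, reader_after_commit)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")
        # Two independent connections play the roles of the optimization
        # code (writer) and the Django app (reader).
        writer = sqlite3.connect(path, isolation_level=None)
        reader = sqlite3.connect(path, isolation_level=None)
        try:
            writer.execute(
                "CREATE TABLE item (id INTEGER PRIMARY KEY, cost INTEGER)"
            )
            writer.execute("INSERT INTO item (id, cost) VALUES (1, 10)")

            query = "SELECT cost FROM item WHERE id = 1"

            # Open a transaction and make a change without committing it.
            writer.execute("BEGIN")
            writer.execute("UPDATE item SET cost = 99 WHERE id = 1")

            # The writer sees its own pending change; the reader does not.
            writer_view = writer.execute(query).fetchall()[0][0]
            reader_before = reader.execute(query).fetchall()[0][0]

            # After COMMIT the change becomes visible to the reader too.
            writer.execute("COMMIT")
            reader_after = reader.execute(query).fetchall()[0][0]
            return writer_view, reader_before, reader_after
        finally:
            writer.close()
            reader.close()


print(visibility_demo())
```

The first value is what the modifying session reads, the second is what the other session reads while the change is still pending, and the third is what it reads once the change has been committed.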

Upvotes: 1
