user3393089
user3393089

Reputation: 189

postgresql function or query performance

I have some general queries on executing Postgres function.I recently noticed that if I store the output of any arithmetic or business operation in a variable and then call it at the time of execution in the query instead of doing the operation at time of execution it saved lot of time.

But I am not aware of any practices to be followed in general to reduce the time taken and also improve performance as I am new to Postgres.

Upvotes: 0

Views: 254

Answers (2)

Craig Ringer
Craig Ringer

Reputation: 325061

Beware of read-modify-write cycles and transaction anomalies.

It's fine to cache values locally so long as you're careful about the scope with which you cache it and with cache invalidation. Be very careful about storing values past the lifetime of the transaction you read it in.

Read-modify-write

You must also be careful not to use that cached value as an input into a calculation that you write back to the database unless you SELECT ... FOR UPDATE the value in a transaction that stays open during the write, you use a SERIALIZABLE transaction, or you use some form of optimistic concurrency control.

If you aren't careful you can get yourself in real trouble, with classics like the banking concurrency example where account id=1 transfers $100 to accounts id=2 and id=3:

session1                                  session2

begin;                                    begin;

select balance 
from account
where id=1;  => 100

                                          select balance
                                          from account
                                          where id = 1;   => 100

update account
set balance = balance + 100
where id = 2;  -- this is safe


                                          update account
                                          set balance = balance + 100
                                          where id = 3;  -- this is safe

update account
set balance = 0 -- 100 - 100 = 0
where id = 1;

                                          update account
                                          set balance = 0 -- 100 - 100 = 0
                                          where id = 1;

commit;
                                          commit;

Whoops! You just added $100 to two people's accounts but only took $100 out of id=1's. The updates against id=2 and id=3 were OK because they did an in-place modification of the balance (balance = balance + 100). The updates to id=1 were not, because they read the value, modified it client side, and wrote a new value.

This is what I mean by a read-modify-write cycle.

It would've been safe if we'd used SELECT ... FOR UPDATE when reading the balance, because the second transaction would've got stuck until the 1st committed. But it would've been better still if we'd avoided the read-copy-write cycle and just done the updates in-place.

Caching

Caching is fine - but can introduce anomalies when the underlying data is updated but your cache doesn't get flushed and refreshed.

Cache invalidation is, in general, a hard problem, but Pg has some tools that help.

In particular, listen and notify, invoked from triggers, can be used to eagerly flush data from a cache stored in memcached/redis/whatever via a helper daemon. That means you're much less likely to have to flush large chunks of cache or drop the whole cache whenever something changes.

You also need to make decisions about how out of date it's acceptable for something to be. Sometimes you just don't care if a value is 5 seconds out of date. Or half an hour. Or a week. It depends on the application, the datum in question, etc.

Upvotes: 2

There's nothing particularly wrong with storing values in variables.

If you're storing values so you can write SQL in a procedural, step-by-step way instead of a set-oriented way, then you'd probably be better off not doing that. SQL is a set-oriented language; it usually performs better if you write set-oriented SQL.

The risk of storing values and using them later is that the data underlying those values might have changed since you stored them. Whether that's a real problem is application-specific, but it's usually a problem best avoided.

Upvotes: 0

Related Questions