user2880153

Reputation: 11

postgres 9.1 performance parameters

To improve performance, I tuned the shared_buffers and work_mem parameters. Although I tried different combinations of these parameters, my query takes the same time. I restarted the server whenever shared_buffers was changed. My environment is:

Postgres 9.1, OS: Windows 7, RAM: 8 GB

I tried shared_buffers at 1GB, 2GB and 3GB, but the query execution time changes only minimally (by milliseconds).
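The values actually in effect after each restart can be confirmed from any session with SHOW, for example:

-- confirm which settings the server is really running with
SHOW shared_buffers;
SHOW work_mem;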

Here is the explain plan of the query:

GroupAggregate  (cost=11100.99..12737.10 rows=50342 width=24) (actual time=181.789..268.733 rows=49829 loops=1)
  ->  Sort  (cost=11100.99..11226.84 rows=50342 width=24) (actual time=181.761..188.323 rows=50000 loops=1)    
        Sort Key: (date(t_proof_of_play.play_timestamp)), (date_trunc('hour'::text, t_proof_of_play.play_timestamp)), t_device_endpoint.venue_hierarchy_id, t_device_endpoint.device_config_id, t_proof_of_play.asset_id
        Sort Method: quicksort  Memory: 5443kB
        ->  Hash Join  (cost=2629.37..7169.41 rows=50342 width=24) (actual time=15.416..112.175 rows=50000 loops=1)
              Hash Cond: (t_proof_of_play.device_endpoint_id = t_device_endpoint.id)
              ->  Bitmap Heap Scan on t_proof_of_play  (cost=2628.28..6224.41 rows=50342 width=20) (actual time=15.332..29.004 rows=50000 loops=1)
                    Recheck Cond: ((is_processed = 0) AND (is_revenue = 1))
                    ->  Bitmap Index Scan on process_revenue  (cost=0.00..2615.69 rows=50342 width=0) (actual time=15.081..15.081 rows=50000 loops=1)
                          Index Cond: ((is_processed = 0) AND (is_revenue = 1))
              ->  Hash  (cost=1.04..1.04 rows=4 width=12) (actual time=0.026..0.026 rows=4 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 1kB
                    ->  Seq Scan on t_device_endpoint  (cost=0.00..1.04 rows=4 width=12) (actual time=0.009..0.015 rows=4 loops=1)
Total runtime: 276.027 ms

Tabulated version on explain.depesz.com

Suggestions are welcome. Thanks in advance.

Regards

Upvotes: 1

Views: 845

Answers (3)

regilero

Reputation: 30496

As a complement to @Richard Huxton's answer.

As stated in the documentation, on Windows you cannot effectively use a very large amount of shared_buffers:

Also, on Windows, large values for shared_buffers aren't as effective. You may find better results keeping the setting relatively low and using the operating system cache more instead. The useful range for shared_buffers on Windows systems is generally from 64MB to 512MB.

So you should set this parameter to 512MB; PostgreSQL will not be able to make better use of more shared memory on Windows. In my mind Windows is not the best OS for a PostgreSQL server. Anyway, you could also try tweaking effective_cache_size to 50% or 75% of available memory (so maybe something like 2 or 4GB). This is a hint for PostgreSQL, telling it that the OS will certainly use the available RAM to cache recently used files. This way the query optimizer may decide that accessing data on disk is not as costly as it usually thinks.
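As a rough sketch, assuming the 8GB Windows machine from the question, the relevant postgresql.conf lines could look like this (the exact values are only suggestions; shared_buffers needs a server restart to take effect, effective_cache_size only a reload):

# illustrative values only
shared_buffers = 512MB        # larger values are not effective on Windows
effective_cache_size = 4GB    # planner hint about the OS file cache, not an allocation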

Upvotes: 1

kgrittn

Reputation: 19471

As @RichardHuxton points out, you are retrieving results at about 5 microseconds per row. It's hard to complain too much about that; but that rate proves that you have a very high cache hit ratio. You might want to try adjusting the cost factors on a connection to reflect that (using the SET command), and try the query again to see if you get a better plan.

SET cpu_tuple_cost = 0.03;          -- raise the per-row CPU cost estimate
SET random_page_cost = 1;           -- with a hot cache, random reads cost about the same as sequential ones
SET effective_cache_size = '4GB';   -- tell the planner that most data is likely cached by the OS
-- (your query here)

Of course, @regilero is right about shared_buffers -- on Windows it must be kept small.

Even fairly simple queries rarely take less than 2 microseconds per result row, so on a query with a join, a sort, and an aggregate you don't have a whole lot of room for improvement.

Upvotes: 3

Richard Huxton

Reputation: 22893

It can be difficult to pick out the important figures from a raw EXPLAIN. Luckily, a long-time community member built a useful tool to tabulate it for us - I added a link to depesz.com.

Firstly, I can't see how changing shared_buffers would have any effect on any particular query and in any case you didn't provide values. It's a general value to change the overall balance of memory usage on the machine. You would tune it based on the overall workload rather than a particular query.

The three major parts of the query are the join, the sort and the aggregate. That seems reasonable.

I don't know what aggregate you're running (you didn't give the query). Presumably you're summarising over day/hour on play_timestamp (based on the sort). Do you need both levels?
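For reference, judging from the sort key and filter conditions visible in the plan, the query presumably has roughly this shape (a hypothetical reconstruction only, since the actual query was not posted; count(*) stands in for whatever aggregate is really used):

SELECT date(p.play_timestamp),
       date_trunc('hour', p.play_timestamp),
       d.venue_hierarchy_id,
       d.device_config_id,
       p.asset_id,
       count(*)                         -- the real aggregate is unknown
FROM t_proof_of_play p
JOIN t_device_endpoint d ON d.id = p.device_endpoint_id
WHERE p.is_processed = 0
  AND p.is_revenue = 1
GROUP BY 1, 2, 3, 4, 5;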

For your fairly detailed sort (5 columns, two based on function calls), as long as work_mem is big enough for the sort there's not much to do there.
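If you want to rule work_mem out, it can be raised for a single session and the query re-run; the 16MB below is just an arbitrary test value, and since the plan already shows the sort fitting in memory (quicksort, 5443kB) I wouldn't expect much from it:

SET work_mem = '16MB';   -- session-local, does not affect other connections
EXPLAIN ANALYZE
-- (your query here)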

So - we might be able to improve the join. It's possible that a partial index might help. Something like:

CREATE INDEX idx1 ON t_proof_of_play (device_endpoint_id) WHERE is_processed=0 AND is_revenue=1

Of course, nothing is free and you'll be paying the maintenance costs on this index whenever you update t_proof_of_play.

Even then, if you halve the join time that is only going to reduce the overall time from 276ms to about 235ms. You're processing 50,000 rows in about a quarter of a second - that's around 5 microseconds per row.

Upvotes: 2
