iamtheoracle

Reputation: 317

Postgres Explain Plans Are Different

I have two databases running on the same server. Both are development databases with the same structure but different amounts of data. I'm running Postgres 9.2 on Linux (CentOS/Red Hat).

I'm running the following SQL on each database, but getting very different performance.

Note: the write_history table structure is identical in each DB, with the same indexes and constraints.

Here is the SQL:

 EXPLAIN ANALYZE
 SELECT * FROM write_history
 WHERE fk_device_rw_id = 'd2c969b8-2609-11e3-80ca-0750cff1e96c'
   AND fk_write_history_status_id = 5
 ORDER BY update_time DESC
 LIMIT 1;

And the Explain Plans for each DB:

DB1 - PreProd

 Limit  (cost=57902.39..57902.39 rows=1 width=103) (actual time=0.056..0.056 rows=0 loops=1)
   ->  Sort  (cost=57902.39..57908.69 rows=2520 width=103) (actual time=0.053..0.053 rows=0 loops=1)
         Sort Key: update_time
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on write_history  (cost=554.04..57889.79 rows=2520 width=103) (actual time=0.033..0.033 rows=0 loops=1)
               Recheck Cond: (fk_device_rw_id = 'd2c969b8-2609-11e3-80ca-0750cff1e96c'::uuid)
               Filter: (fk_write_history_status_id = 5)
               ->  Bitmap Index Scan on idx_write_history_fk_device_rw_id  (cost=0.00..553.41 rows=24034 width=0) (actual time=0.028..0.028 rows=0 loops=1)
                     Index Cond: (fk_device_rw_id = 'd2c969b8-2609-11e3-80ca-0750cff1e96c'::uuid)
 Total runtime: 0.112 ms

DB2 - QA

 Limit  (cost=50865.41..50865.41 rows=1 width=108) (actual time=545.521..545.523 rows=1 loops=1)
   ->  Sort  (cost=50865.41..50916.62 rows=20484 width=108) (actual time=545.518..545.518 rows=1 loops=1)
         Sort Key: update_time
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on write_history  (cost=1431.31..50762.99 rows=20484 width=108) (actual time=21.931..524.034 rows=22034 loops=1)
               Recheck Cond: (fk_device_rw_id = 'd2cd81a6-2609-11e3-b574-47328bfa4c38'::uuid)
               Rows Removed by Index Recheck: 1401986
               Filter: (fk_write_history_status_id = 5)
               Rows Removed by Filter: 40161
               ->  Bitmap Index Scan on idx_write_history_fk_device_rw_id  (cost=0.00..1426.19 rows=62074 width=0) (actual time=19.167..19.167 rows=62195 loops=1)
                     Index Cond: (fk_device_rw_id = 'd2cd81a6-2609-11e3-b574-47328bfa4c38'::uuid)
 Total runtime: 545.588 ms

So a few questions:

  1. What is the difference between "Sort Method: quicksort  Memory: 25kB" and "Sort Method: top-N heapsort  Memory: 25kB"?

  2. What could cause the total runtimes to be so different?

Table Row Count:

DB1 : write_history row count: 5,863,565

DB2 : write_history row count: 2,670,888

Please let me know if further information is required. Thanks for the help.

Upvotes: 2

Views: 1212

Answers (2)

jjanes

Reputation: 44167

A top-N sort means the sort is feeding an ORDER BY ... LIMIT N, and that it will throw away a tuple as soon as it can show that the tuple cannot be in the top N. The decision to switch to a top-N sort is made dynamically during the sort process. Since the sort in DB1 had zero input tuples, it never made the decision to switch. So the difference in the reported methods is an effect, not a cause, and is of no importance in this case.
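
You can see both behaviours with a throwaway table (sort_demo below is hypothetical, not part of your schema):

 -- A bounded sort only reports "top-N heapsort" once enough rows
 -- actually flow into it to trigger the switch.
 CREATE TEMP TABLE sort_demo AS
 SELECT g AS id, random() AS val FROM generate_series(1, 100000) g;

 -- Many input rows: the sort switches and reports "top-N heapsort".
 EXPLAIN ANALYZE SELECT * FROM sort_demo ORDER BY val DESC LIMIT 1;

 -- Zero input rows: the switch never happens, so plain "quicksort"
 -- is reported, just as in your DB1 plan.
 EXPLAIN ANALYZE SELECT * FROM sort_demo WHERE id < 0 ORDER BY val DESC LIMIT 1;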

I think the key for you is the bitmap heap scans:

(actual time=0.033..0.033 rows=0 loops=1)

(actual time=21.931..524.034 rows=22034 loops=1)

The smaller database has far more rows that match your criteria, so it has far more work to do.

Also, the amount of rechecking work that needs to be done:

Rows Removed by Index Recheck: 1401986

suggests you have work_mem set to a very small value, so your bitmap is overflowing it. When that happens, the bitmap degrades to tracking whole pages rather than individual tuples, and every row on those pages has to be rechecked against the condition.
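
If you want to test that, work_mem can be raised per session without a restart (the 64MB below is just an illustrative value, not a tuning recommendation):

 -- Check the current setting; the Postgres 9.2 default is only 1MB.
 SHOW work_mem;

 -- Raise it for this session, re-run the query, and see whether the
 -- "Rows Removed by Index Recheck" line disappears from the plan.
 SET work_mem = '64MB';
 EXPLAIN ANALYZE SELECT * FROM write_history
 WHERE fk_device_rw_id = 'd2cd81a6-2609-11e3-b574-47328bfa4c38'
   AND fk_write_history_status_id = 5
 ORDER BY update_time DESC LIMIT 1;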

Upvotes: 1

Philip Couling

Reputation: 14893

1) Two different sorting algorithms.

http://en.wikipedia.org/wiki/Quicksort

http://en.wikipedia.org/wiki/Heapsort

2) The volume of data being sorted will change the decision.

From memory of database theory: quicksort performs badly on large volumes of data because of the order in which it scans through the data, and it is particularly bad if the sort has to spill to disk. Heapsort has a much better worst case than quicksort. The query planner knows this and will change the plan accordingly.

The number of rows being sorted in DB2 is about 8x larger: the estimate is roughly 2.5K rows in DB1 versus 20K in DB2. Be careful to look at the number of rows expected to be selected, not just the number of rows in the table. The query planner uses histograms of the data to get a better picture of what will be selected, not simply the table's row count.
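
You can inspect the statistics the planner holds for the filter column via the pg_stats view (column names here are standard; only the filter values come from your question):

 -- What the planner knows about the distribution of fk_device_rw_id.
 SELECT n_distinct, most_common_vals, most_common_freqs
 FROM pg_stats
 WHERE tablename = 'write_history'
   AND attname = 'fk_device_rw_id';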

You can check whether these estimates are correct by comparing the actual row counts reported in the plan against the estimated ones. If they differ badly, consider running ANALYZE, as sketched below.
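
For example (assuming you can run maintenance statements on these dev databases):

 -- Refresh the planner's statistics for this table.
 ANALYZE write_history;

 -- Then compare the planner's new estimate against the true count for
 -- the key from the question.
 SELECT count(*) FROM write_history
 WHERE fk_device_rw_id = 'd2cd81a6-2609-11e3-b574-47328bfa4c38'
   AND fk_write_history_status_id = 5;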

Upvotes: 1
