Stone Lee
Stone Lee

Reputation: 51

Why less than (<) has much better performance than greater than (>) in a query?

In the following queries, you'll see that the query using the < operator costs only 0.007s, while the one with the > operator costs 0.474s. That's a huge difference. What does it happen?

postgresql> SHOW server_version;
+------------------+
| server_version   |
|------------------|
| 9.4.17           |
+------------------+
SHOW
Time: 0.006s

postgresql> \d msg
+------------------+-----------------------------+-------------------------------------------------------------+
| Column           | Type                        | Modifiers                                                   |
|------------------+-----------------------------+-------------------------------------------------------------|
| id               | bigint                      |  not null default nextval('msg_id_seq'::regclass) |
| content          | text                        |  default ''::text                                           |
... ...
| created_at       | timestamp without time zone |  default timezone('UTC'::text, now())                       |
+------------------+-----------------------------+-------------------------------------------------------------+
Indexes:
    "msg_pkey" PRIMARY KEY, btree (id)
    "ix_msg_created_at" btree (created_at)

Time: 0.013s

postgresql> SELECT id FROM msg WHERE created_at < '2020-08-27' ORDER BY id desc LIMIT 1;
+--------+
| id     |
|--------|
| 973604 |
+--------+
SELECT 1
Time: 0.007s

postgresql> SELECT id FROM msg WHERE created_at > '2020-08-27' ORDER BY id LIMIT 1;
+--------+
| id     |
|--------|
| 973605 |
+--------+
SELECT 1
Time: 0.474s

postgresql> SELECT count(*) FROM msg WHERE created_at < '2020-08-27';
+---------+
| count   |
|---------|
| 967498  |
+---------+
SELECT 1
Time: 0.509s

postgresql> SELECT count(*) FROM msg WHERE created_at > '2020-08-27';
+---------+
| count   |
|---------|
| 1089    |
+---------+
SELECT 1
Time: 0.007s

postgresql> EXPLAIN ANALYZE SELECT id FROM msg WHERE created_at < '2020-08-27' ORDER BY id desc LIMIT 1;
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                                 |
|------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Limit  (cost=0.43..0.53 rows=1 width=8) (actual time=0.770..0.770 rows=1 loops=1)                                                                          |
|   ->  Index Scan Backward using msg_pkey on msg  (cost=0.43..114629.92 rows=1092141 width=8) (actual time=0.770..0.770 rows=1 loops=1) |
|         Filter: (created_at < '2020-08-27 00:00:00'::timestamp without time zone)                                                                          |
|         Rows Removed by Filter: 1103                                                                                                                       |
| Planning time: 0.110 ms                                                                                                                                    |
| Execution time: 0.782 ms                                                                                                                                   |
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.008s

postgresql> EXPLAIN ANALYZE SELECT id FROM msg WHERE created_at > '2020-08-27' ORDER BY id LIMIT 1;
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                         |
|----------------------------------------------------------------------------------------------------------------------------------------------------|
| Limit  (cost=0.43..44.58 rows=1 width=8) (actual time=481.168..481.169 rows=1 loops=1)                                                             |
|   ->  Index Scan using msg_pkey on msg  (cost=0.43..114629.92 rows=2596 width=8) (actual time=481.166..481.166 rows=1 loops=1) |
|         Filter: (created_at > '2020-08-27 00:00:00'::timestamp without time zone)                                                                  |
|         Rows Removed by Filter: 967498                                                                                                             |
| Planning time: 0.124 ms                                                                                                                            |
| Execution time: 481.184 ms                                                                                                                         |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.489s

Update to add rows counts (2020-09-01):

Update to add query plans (2020-09-01).

Upvotes: 2

Views: 1019

Answers (1)

Stone Lee
Stone Lee

Reputation: 51

As the outputs of EXPLAIN show, it seems that both queries scan index msg_pkey instead of ix_msg_created_at. As what Rows Removed by Filter shows, the query using the > operator scans 967498 rows and the one using the < operator scans only 1103 rows. And this should explain the big performance difference.

Thanks for all the helps and suggestions.

Upvotes: 3

Related Questions