KutaBeach
KutaBeach

Reputation: 1495

Two questions on PostgreSQL performance

1) What is the best way to implement paging in PostgreSQL?

Assume we need to implement paging. The simplest query is select * from MY_TABLE order by date_field DESC limit 10 offset 20. As far as I understand, we have 2 problems here: in case the dates may have duplicated values every run of this query may return different results and the more offset value is the longer the query runs. We have to provide additional column which is date_field_index:

--date_field--date_field_index--
  12-01-2012     1
  12-01-2012     2
  14-01-2012     1
  16-01-2012     1
--------------------------------

Now we can write something like

create index MY_INDEX on MY_TABLE (date_field, date_field_index);
select * from MY_TABLE where date_field=<last_page_date and not (date_field_index>=last_page_date_index and date_field=last+page_date) order by date_field DESC, date_field_index DESC limit 20;

..thus using the where clause and corresponding index instead of offset. OK, now the questions:

1) is this the best way to improve the initial query? 2) how can we populate that date_field_index field? we have to provide some trigger for this? 3) We should not use RowNumber() functions in Postgres because they are not using indexes and thus very slow. Is it correct?

2) Why column order in concatenated index is not affecting performance of the query?

My measurements show, that while searching using concatenated index (index consisting of 2 and more columns) there is no difference if we place the most selective column to the first place - or if we place it to the end. Why? If we place the most selective column to the first place - we run through a shorter range of the found rows which should have impact on performance. Am I right?

Upvotes: 1

Views: 111

Answers (1)

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125524

Use the primary key to untie in instead of the date_field_index column. Otherwise explain why that is not an option.

order by date_field DESC, "primary_key_column(s)" DESC

The combined index with the most unique column first is the best performer, but it will not be used if:

  • the distinct values are more than a few percent of the table
  • there aren't enough rows to make it worth
  • the range of dates is not small enough

What is the output of explain my_query?

Upvotes: 1

Related Questions