dot
dot

Reputation: 15660

how to ID which indexes need to be created

Background

I am trying to find tables that could do with indexing. Programmers are complaining that certain table queries are pretty slow.

I found this query here on stackoverflow:

SELECT   schemaname, relname, seq_scan, seq_tup_read,
           idx_scan, seq_tup_read / seq_scan AS avg
FROM    pg_stat_user_tables
WHERE   seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 20;

I tried running on my database and sure enough, one of the tables that our programmers are complaining about appears:

 schemaname |        relname         | seq_scan | seq_tup_read | idx_scan |  avg
------------+------------------------+----------+--------------+----------+-------
 public     | widgets                |      511 |     22834898 |        3 | 44686

Based on what I read, if you have a table that has more seq-scans than idx-scans, this is a good indicator of a table that could use more indexing.

Problem

My next remedial question is how do I know which column to index?

I queried to see which indexes currently exist:

testdb=# select * from pg_indexes where tablename='widgets';
 schemaname | tablename |   indexname   | tablespace |                            indexdef
------------+-----------+---------------+------------+----------------------------------------------------------------
 public     | widgets| widgets_pkey |            | CREATE UNIQUE INDEX widgets_pkey ON widgets USING btree (id)
(1 row)

And this is what the table looks like in general:

testdb=# \d widgets
                                    Table "public.widgets"
  Column  |            Type             |                       Modifiers
----------+-----------------------------+-------------------------------------------------------
 id       | integer                     | not null default nextval('widgets_id_seq'::regclass)
 id_ts    | timestamp without time zone | default now()
 status   | character varying(32)       |
 class    | character varying(16)       |
 from     | character varying(32)       |
 to       | character varying(32)       |
 start_ts | integer                     |
 end_ts   | integer                     |
 elapsed  | integer                     |
Indexes:
    "widgets_pkey" PRIMARY KEY, btree (id)

Any suggestions would be appreciated.

Upvotes: 1

Views: 60

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 246493

Use pg_stat_statements to find out which queries are causing the sequential scans.

You have to add it to shared_preload_libraries and restart PostgreSQL. Then you create the pg_stat_statements view with

CREATE EXTENSION pg_stat_statements;

Then wait until some workload has been collected and query the view. You can filter the statements using your table name (WHERE query ILIKE '%table_name%') and order to get the longest running statements first.

That should show you the queries that cause the sequential scans.

Upvotes: 1

Related Questions