Reputation: 15660
Background
I am trying to find tables that could benefit from indexing. Programmers are complaining that queries against certain tables are pretty slow.
I found this query here on stackoverflow:
SELECT schemaname, relname, seq_scan, seq_tup_read,
idx_scan, seq_tup_read / seq_scan AS avg
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 20;
I tried running it on my database and, sure enough, one of the tables that our programmers are complaining about appears:
 schemaname |        relname         | seq_scan | seq_tup_read | idx_scan |  avg
------------+------------------------+----------+--------------+----------+-------
 public     | widgets                |      511 |     22834898 |        3 | 44686
Based on what I read, a table with significantly more sequential scans than index scans is a good indicator of a table that could use more indexing.
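I understand that for an individual statement, EXPLAIN shows whether it falls back to a sequential scan; the filter below is only a hypothetical example, not one of our real queries:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM widgets WHERE status = 'queued';

A plan node that starts with "Seq Scan on widgets" confirms that particular query is not using any index.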
Problem
My next (admittedly remedial) question is: how do I know which column to index?
I queried to see which indexes currently exist:
testdb=# select * from pg_indexes where tablename='widgets';
 schemaname | tablename |   indexname   | tablespace |                            indexdef
------------+-----------+---------------+------------+----------------------------------------------------------------
 public     | widgets   | widgets_pkey  |            | CREATE UNIQUE INDEX widgets_pkey ON widgets USING btree (id)
(1 row)
And this is what the table looks like in general:
testdb=# \d widgets
Table "public.widgets"
Column | Type | Modifiers
----------+-----------------------------+-------------------------------------------------------
id | integer | not null default nextval('widgets_id_seq'::regclass)
id_ts | timestamp without time zone | default now()
status | character varying(32) |
class | character varying(16) |
from | character varying(32) |
to | character varying(32) |
start_ts | integer |
end_ts | integer |
elapsed | integer |
Indexes:
"widgets_pkey" PRIMARY KEY, btree (id)
Any suggestions would be appreciated.
Upvotes: 1
Views: 60
Reputation: 246493
Use pg_stat_statements to find out which queries are causing the sequential scans.

You have to add it to shared_preload_libraries and restart PostgreSQL. Then you create the pg_stat_statements view with
CREATE EXTENSION pg_stat_statements;
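A minimal sketch of the shared_preload_libraries change, assuming it is made directly in postgresql.conf and that nothing else is already being preloaded (otherwise append to the existing comma-separated list):

# postgresql.conf -- this parameter is only read at server start, so a restart is required
shared_preload_libraries = 'pg_stat_statements'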
Then wait until some workload has been collected and query the view. You can filter the statements using your table name (WHERE query ILIKE '%table_name%') and order them so the longest-running statements come first.
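For example, against the widgets table from the question (a sketch only, assuming PostgreSQL 13 or later, where the timing column is called total_exec_time; older versions call it total_time):

SELECT calls, total_exec_time, mean_exec_time, rows, query
FROM pg_stat_statements
WHERE query ILIKE '%widgets%'     -- filter on the table name
ORDER BY total_exec_time DESC     -- longest-running statements first
LIMIT 20;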
That should show you the queries that cause the sequential scans.
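The columns those queries filter or join on are then your indexing candidates. Purely as an illustration (the choice of status here is hypothetical, not taken from your workload):

CREATE INDEX widgets_status_idx ON widgets (status);   -- index whichever column the slow queries actually filter on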
Upvotes: 1