Reputation: 215
I have a slow Postgres query and I may need better indexes. But I lack experience with indexing, so I just indexed foreign keys. I use Postgres 14.3 and Django 3.0.
This is the slow query. It's only slow when the chat room has a lot of messages (like 10k+).
SELECT
chat_message.info_date_str,
auth_user.username,
chat_message.id,
chat_message.text_msg,
chat_message.type_message,
chat_message.url_file,
chat_message.url_file,
chat_message.was_removed,
chat_message.was_removed
FROM chat_message
INNER JOIN user_profile_userprofile
ON chat_message.owner_id = user_profile_userprofile.id
INNER JOIN auth_user
ON user_profile_userprofile.user_id = auth_user.id
WHERE chat_message.chat_room_id = 7204712
AND NOT chat_message.delete_by::text LIKE '%fattimita08%'
ORDER BY chat_message.id DESC
LIMIT 20
Duration: 11956.422
Definitions for involved tables:
CREATE TABLE public.chat_message (
id integer NOT NULL DEFAULT nextval('chat_message_id_seq'::regclass)
, chat_room_id integer NOT NULL
, owner_id integer NOT NULL
, info_date timestamp with time zone NOT NULL
, aud_msg character varying(500) NOT NULL
, vid_msg character varying(500) NOT NULL
, img_msg character varying(500) NOT NULL
, text_msg text NOT NULL
, type_message smallint NOT NULL
, delete_by character varying(50) NOT NULL
, url_file character varying(1000) NOT NULL
, receiver_id integer
, info_date_str character varying(200) NOT NULL DEFAULT '2019-11-28 23:43:44.578419'
, was_removed boolean NOT NULL DEFAULT false
, CONSTRAINT chat_message_pkey PRIMARY KEY (id)
, CONSTRAINT chat_messag_receiver_id_0eceddde_fk_user_profile_userprofile_id FOREIGN KEY (receiver_id) REFERENCES public.user_profile_userprofile (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY DEFERRED
, CONSTRAINT chat_message_chat_id_fkey FOREIGN KEY (chat_room_id) REFERENCES public.chat_chat (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
, CONSTRAINT chat_message_owner_id_fkey FOREIGN KEY (owner_id) REFERENCES public.user_profile_userprofile (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
);
CREATE INDEX chat_room_id_idx ON public.chat_message (chat_room_id);
CREATE INDEX owner_id_idx ON public.chat_message (owner_id);
CREATE INDEX receiver_id_idx ON public.chat_message (receiver_id);
CREATE TABLE public.user_profile_userprofile (
id integer NOT NULL DEFAULT nextval('user_profile_userprofile_id_seq'::regclass)
, user_id integer NOT NULL
, birthdate date
, profile_picture character varying(500) NOT NULL
, country character varying(500) NOT NULL
, phrase character varying(300) NOT NULL
, is_mod boolean NOT NULL
, ip inet, id_cel character varying(500) NOT NULL DEFAULT ''
, twitter_key character varying(500) DEFAULT ''
, twitter_secret character varying(500) DEFAULT ''
, gender character varying(500) NOT NULL DEFAULT ''
, lastseen character varying(500) NOT NULL DEFAULT ''
, profile_video character varying(500) NOT NULL DEFAULT ''
, status character varying(500) NOT NULL DEFAULT ''
, name character varying(500) NOT NULL
, "id_timeZone" character varying(500) NOT NULL
, profile_picture_thumb character varying(500) NOT NULL
, city character varying(500) NOT NULL
, CONSTRAINT user_profile_userprofile_pkey PRIMARY KEY (id)
, CONSTRAINT user_profile_userprofile_user_id_key UNIQUE (user_id)
, CONSTRAINT cascade_user FOREIGN KEY (user_id) REFERENCES public.auth_user (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE UNIQUE INDEX userprofile_user_idx ON public.user_profile_userprofile (user_id);
CREATE TABLE public.auth_user (
id integer NOT NULL DEFAULT nextval('auth_user_id_seq'::regclass)
, password character varying(128) NOT NULL
, last_login timestamp with time zone
, is_superuser boolean NOT NULL
, username character varying(150) NOT NULL
, first_name character varying(30) NOT NULL
, last_name character varying(30) NOT NULL
, email character varying(254) NOT NULL
, is_staff boolean NOT NULL
, is_active boolean NOT NULL
, date_joined timestamp with time zone NOT NULL
, CONSTRAINT auth_user_pkey PRIMARY KEY (id)
, CONSTRAINT auth_user_username_key UNIQUE (username)
);
CREATE INDEX auth_user_username_like ON public.auth_user (username varchar_pattern_ops);
CREATE INDEX username_idx ON public.auth_user (username);
I used this query to analyze:
EXPLAIN ANALYZE SELECT "chat_message"."info_date_str", "auth_user"."username", "chat_message"."id", "chat_message"."text_msg", "chat_message"."type_message", "chat_message"."url_file", "chat_message"."url_file", "chat_message"."was_removed", "chat_message"."was_removed" FROM "chat_message" INNER JOIN "user_profile_userprofile" ON ("chat_message"."owner_id" = "user_profile_userprofile"."id") INNER JOIN "auth_user" ON ("user_profile_userprofile"."user_id" = "auth_user"."id") WHERE ("chat_message"."chat_room_id" = 7204712 AND NOT ("chat_message"."delete_by"::text LIKE '%fattimita08%')) ORDER BY "chat_message"."id" DESC LIMIT 20
QUERY PLAN
Limit (cost=33503.90..33614.03 rows=20 width=87) (actual time=5753.937..5754.447 rows=20 loops=1)
-> Nested Loop (cost=33503.90..85612.86 rows=9463 width=87) (actual time=5753.935..5754.443 rows=20 loops=1)
-> Nested Loop (cost=33503.48..63940.73 rows=9463 width=80) (actual time=5753.923..5754.399 rows=20 loops=1)
-> Gather Merge (cost=33503.06..34605.18 rows=9463 width=80) (actual time=5753.893..5754.349 rows=20 loops=1)
Workers Planned: 2
Workers Launched: 0
-> Sort (cost=32503.03..32512.89 rows=3943 width=80) (actual time=5753.535..5753.540 rows=20 loops=1)
Sort Key: chat_message.id DESC
Sort Method: quicksort Memory: 2477kB
-> Parallel Bitmap Heap Scan on chat_message (cost=129.77..32267.53 rows=3943 width=80) (actual time=81.934..5741.308 rows=10447 loops=1)
Recheck Cond: (chat_room_id = 7204712)
Filter: ((delete_by)::text !~~ '%fattimita08%'::text)
Rows Removed by Filter: 1
Heap Blocks: exact=8278
-> Bitmap Index Scan on chat_room_id_idx (cost=0.00..127.41 rows=9463 width=0) (actual time=63.921..63.922 rows=10448 loops=1)
Index Cond: (chat_room_id = 7204712)
-> Memoize (cost=0.43..4.95 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=20)
Cache Key: chat_message.owner_id
Cache Mode: logical
Hits: 18 Misses: 2 Evictions: 0 Overflows: 0 Memory Usage: 1kB
-> Index Scan using user_profile_userprofile_pkey on user_profile_userprofile (cost=0.42..4.94 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=2)
Index Cond: (id = chat_message.owner_id)
-> Index Scan using auth_user_pkey on auth_user (cost=0.42..2.29 rows=1 width=13) (actual time=0.002..0.002 rows=1 loops=20)
Index Cond: (id = user_profile_userprofile.user_id)
Planning Time: 43.104 ms
Execution Time: 5754.630 ms
Upvotes: 0
Views: 404
Reputation: 658857
The main hardship for this query is that it filters by chat_room_id
, but sorts by id
(descending) - before a very small LIMIT
.
Postgres expects the filter chat_room_id = 7204712
to be selective and starts with a bitmap heap scan on chat_room_id_idx
, finding ~ 10k rows.
The additional filter AND NOT chat_message.delete_by::text LIKE '%fattimita08%'
seems hardly selective. (BTW, the cast (::text
) in that expression does nothing useful, drop it.)
This multicolumn index should give you faster results, by orders of magnitude:
CREATE INDEX chat_message_your_name_here_idx ON public.chat_message (chat_room_id, id DESC);
Now Postgres can just walk the index with an index scan, and filter the few rows that should not qualify.
Related:
That said, the Parallel Bitmap Heap Scan on chat_message
is responsible for the lion share of the cost - a cost that seems excessive for just rows=10447
. Is that table particularly big? Do you have decent hardware? Maybe the table is bloated? Are matching rows spread out over the big table physically? Depending on the actual diagnosis, VACUUM
, or even VACUUM FULL
, or better yet CLUSTER
on the new index might help (a lot). Like:
CLUSTER public.chat_message USING chat_message_your_name_here_idx;
Be aware that's expensive and blocking for a big table. There are non-blocking alternatives. See:
Upvotes: 4