Reputation: 1601
I am trying to optimize a query which has been destroying my DB.
https://explain.depesz.com/s/isM1
If you have any insights into how to make this better please let me know.
We are using RDS/Postgres 11.9
explain analyze SELECT "src_rowdifference"."key",
"src_rowdifference"."port_id",
"src_rowdifference"."shipping_line_id",
"src_rowdifference"."container_type_id",
"src_rowdifference"."shift_id",
"src_rowdifference"."prev_availability_id",
"src_rowdifference"."new_availability_id",
"src_rowdifference"."date",
"src_rowdifference"."prev_last_update",
"src_rowdifference"."new_last_update"
FROM "src_rowdifference"
INNER JOIN "src_containertype" ON ("src_rowdifference"."container_type_id" = "src_containertype"."key")
WHERE ("src_rowdifference"."container_type_id" IN
(SELECT U0."key"
FROM "src_containertype" U0
INNER JOIN "notification_tablenotification_container_types" U1 ON (U0."key" = U1."containertype_id")
WHERE U1."tablenotification_id" = '[email protected]')
AND "src_rowdifference"."new_last_update" >= '2020-01-15T03:11:06.291947+00:00'::timestamptz
AND "src_rowdifference"."port_id" IN
(SELECT U0."key"
FROM "src_port" U0
INNER JOIN "notification_tablenotification_ports" U1 ON (U0."key" = U1."port_id")
WHERE U1."tablenotification_id" = '[email protected]')
AND "src_rowdifference"."shipping_line_id" IN
(SELECT U0."key"
FROM "src_shippingline" U0
INNER JOIN "notification_tablenotification_shipping_lines" U1 ON (U0."key" = U1."shippingline_id")
WHERE U1."tablenotification_id" = '[email protected]')
AND "src_rowdifference"."prev_last_update" IS NOT NULL
AND NOT ("src_rowdifference"."prev_availability_id" = 'na'
AND "src_rowdifference"."prev_availability_id" IS NOT NULL)
AND NOT ("src_rowdifference"."key" IN
(SELECT V1."rowdifference_id"
FROM "notification_tablenotificationtrigger_row_differences" V1
WHERE V1."tablenotificationtrigger_id" IN
(SELECT U0."id"
FROM "notification_tablenotificationtrigger" U0
WHERE U0."notification_id" = '[email protected]'))));
All my indexes are btree + btree(varchar_pattern_ops) "src_rowdifference_port_id_shipping_line_id_9b3465fc_uniq" UNIQUE CONSTRAINT, btree (port_id, shipping_line_id, container_type_id, shift_id, date, new_last_update)
Edit: A little unrelated change that I made was added some more ssd disk space to my RDS instance. That made a huge difference to the CPU usage and in turn made a huge difference to the number of connections we have.
Upvotes: 0
Views: 56
Reputation: 44237
It is hard to think about the plan as a whole, as I don't understand what it is looking for. But looking at the individual pieces, there are two which together dominate the run time.
One is the index scan on src_rowdifference_port_id_shipping_line_id_9b3465fc, which seems pretty slow given the number of rows returned. Comparing the Index Condition to the index columns, I can see that the condition on new_last_update cannot be applied efficiently in the index because two columns in the index come before it and have no equality conditions in the node. So instead that >=
is applied as an "in-index filter" where it needs to test each row and reject it, rather than just skipping it in bulk. I don't know how many rows that removes as the "Rows Removed by Filter" does not count in-index filters, but it is potentially large. So one thing to try would be to make a new index on (port_id, shipping_line_id, container_type_id, new_last_update)
. Or maybe replace that index with a reordered version (port_id, shipping_line_id, container_type_id, new_last_update, shift_id, date)
but of course that might make some other query worse.
The other time consuming thing is kicking the materialized node 47 thousand times (each one looping over up to 22 thousand rows) to implement NOT (SubPlan 1)
. That should be using a hashed subplan, rather than a linear searched subplan. The only reason I can think of that it not doing the hashed subplan is that work_mem is not large enough to anticipate fitting it into memory. What is your setting for work_mem? What happens if you bump it up to "100MB" or so?
The NOT (SubPlan 1)
from the EXPLAIN corresponds to the part of your query AND NOT ("src_rowdifference"."key" IN (...))
. If bumping up work_mem doesn't work, you could try rewriting that into a NOT EXISTS clause instead.
Upvotes: 2