TheStoneFox
TheStoneFox

Reputation: 3067

postgres query with high cpu taking ages to run

I have a web application that runs off PHP and a postgres backend (9.1).

Most of the heavy DB lifting work is done via postgres stored procedures.

One of the processes in the app is an import data routine. The stored procedure is quite intensive on import but whilst developing it can import my test sheet (about 20 lines of data) in about 15 seconds.

This is done on a 4 core ubuntu VM on my local desktop with the default postgres config (with 1GB ram assigned to VM). My CPU is an Intel i7.

I've used pg_top on the local machine and the SELECT process spikes at about 60% CPU usage and then finishes within under 15 seconds.

So, I've now deployed the app to a live environment, which is a 1and1 professional server. 32 Cores, 64GB ram, 2TB hard drive. Very expensive and very big numbers!

Now, running the same import routine on the live server takes over 6 minutes, with the postgres SELECT statement taking about 6 minutes running at 100% CPU.

I've been through many of the postgres conf settings and upped the memory numbers to match the higher powered box, but no matter what I change it doesn't seem to have any affect on the extremely poor performance.

Does anyone have any idea of why the query would perform so much worse?

It's only 15 seconds on a 4 core VM with 1GB ram

but 6 minutes on a 32 core dedicated server with 64GB ram

Something is clearly screwed up along the lines but I cant figure out what it is :(

Edit:

Ok this is the query that I think i've pinpointed the issue to (takes 50/60ms on the large dataset rather than 10ms on the small dataset)

EXPLAIN UPDATE artwork_entity SET "updated_on"=NOW(), "category"='blah', "category:oid"=47425 
WHERE artwork_entity."id" IN (
SELECT n."id" FROM (
SELECT e."id" FROM artwork_entity e
WHERE e."id"=47425 OR e."id" IN
(SELECT l."descendant_id" FROM artwork_relation l
LEFT JOIN artwork_entity e1 ON l."descendant_id"=e1."id"
WHERE l."depth">0 AND l."ancestor_id"=47425
AND (e1."category:oid"=(SELECT e2."category:oid" FROM artwork_entity e2 WHERE e2."id"=l."ancestor_id") OR e1."category:oid" IS NULL))
) AS n);




Update on artwork_entity  (cost=3864.35..7743.46 rows=21118 width=451)"
  ->  Hash Semi Join  (cost=3864.35..7743.46 rows=21118 width=451)"
        Hash Cond: (artwork_entity.id = e.id)"
        ->  Seq Scan on artwork_entity  (cost=0.00..3364.36 rows=42236 width=445)"
        ->  Hash  (cost=3600.38..3600.38 rows=21118 width=10)"
              ->  Seq Scan on artwork_entity e  (cost=24.84..3600.38 rows=21118 width=10)"
                    Filter: ((id = 47425) OR (hashed SubPlan 2))"
                    SubPlan 2"
                      ->  Nested Loop Left Join  (cost=0.00..24.83 rows=1 width=4)"
                            Filter: ((e1."category:oid" = (SubPlan 1)) OR (e1."category:oid" IS NULL))"
                            ->  Index Scan using artwork_relation_ancestor_id_descendant_id_key on artwork_relation l  (cost=0.00..8.28 rows=1 width=8)"
                                  Index Cond: (ancestor_id = 47425)"
                                  Filter: (depth > 0)"
                            ->  Index Scan using artwork_entity_pkey on artwork_entity e1  (cost=0.00..8.27 rows=1 width=8)"
                                  Index Cond: (l.descendant_id = id)"
                            SubPlan 1"
                              ->  Index Scan using artwork_entity_pkey on artwork_entity e2  (cost=0.00..8.27 rows=1 width=4)"
                                    Index Cond: (id = l.ancestor_id)"

Also, this query was performed with no indexes added to any of the columns.

Also, just to note the inner select statement only takes about 10/20ms to run on the large dataset (so it must be the update?) It's only updating 2 rows out of the massive amount of rows available.

Edit 2:

EXPLAIN SELECT e."id" FROM artwork_entity e
WHERE e."id"=47425 OR e."id" IN
(
SELECT l."descendant_id" FROM artwork_relation l
LEFT JOIN artwork_entity e1 ON l."descendant_id"=e1."id"
WHERE l."depth">0 AND l."ancestor_id"=47425
AND (e1."category:oid"=(SELECT e2."category:oid" FROM artwork_entity e2 WHERE e2."id"=l."ancestor_id") OR e1."category:oid" IS NULL)
)

Then it tries to get 21k rows for the sequence scan

But if I break that up into two separate queries like so:

EXPLAIN SELECT e."id" FROM artwork_entity e
WHERE e."id"=47425

This only gets 1 row then the other part of the query

EXPLAIN SELECT l."descendant_id" FROM artwork_relation l
LEFT JOIN artwork_entity e1 ON l."descendant_id"=e1."id"
WHERE l."depth">0 AND l."ancestor_id"=47425
AND (e1."category:oid"=(SELECT e2."category:oid" FROM artwork_entity e2 WHERE e2."id"=l."ancestor_id") OR e1."category:oid" IS NULL)

Also only gets 1 row, but if that second query is part of the in then it tries to get all 21k rows.

How come?

Edit 3:

Simplified the statement that returns 21k rows in initial scan down to:

EXPLAIN SELECT e."id" FROM artwork_entity e
WHERE e."id"=47425 OR e."id" IN
(
SELECT l."descendant_id" FROM artwork_relation l
WHERE l."depth">0 AND l."ancestor_id"=47425
)

Run separately they both return a single row, but added together it queries the entire data set.

Upvotes: 0

Views: 4625

Answers (1)

TheStoneFox
TheStoneFox

Reputation: 3067

Ok I've figured out a quicker way of doing it:

Rather than doing this

WHERE e."id"=47425 OR e."id" IN (...)

I can drop the OR statement and just do the IN statement of:

SELECT e."id" FROM artwork_entity e
WHERE e."id" IN
(
SELECT l."descendant_id" FROM artwork_relation l
LEFT JOIN artwork_entity e1 ON l."descendant_id"=e1."id"
WHERE l."depth">=0 AND l."ancestor_id"=47425
AND (e1."category:oid"=(SELECT e2."category:oid" FROM artwork_entity e2 WHERE e2."id"=l."ancestor_id") OR e1."category:oid" IS NULL)
)

The difference is now the IN statement is depth>=0 and not depth>0. The reason is I actually store a self referencing relation to the entity at a depth of 0. I think I added this way after I wrote this stored procedure so at the time it wasn't available.

Anyway, in doing that it only searches for the correct rows and the result is a much faster query (12ms not 60ms)

This answer is probably not useful to anyone else though!

Edit:

I did say this was my solution, and on the whole it is much better.

However, now on the live server it takes 50 seconds to do the import (rather than 6 minutes) but it's still much faster on my local VM (12 seconds with the same dataset)

postgres doesn't hit above 30% CPU (on live or local server) but for some reason it's way quicker on my low powered local VM.

Is there anything I'm missing or should be aware of config wise?

Upvotes: 1

Related Questions