Reputation: 2474
When I check SHOW PROCESSLIST;
in database I got below query. It heavily uses CPU (more than 100%), it took 80 seconds to complete the query. We have a separate server for database(64GB RAM).
INSERT INTO `search_tmp_598075de5c7e67_73335919`
SELECT `main_select`.`entity_id`, MAX(score) AS `relevance`
FROM (SELECT `search_index`.`entity_id`, (((0)) * 1) AS score
FROM `catalogsearch_fulltext_scope1` AS `search_index`
LEFT JOIN `catalog_eav_attribute` AS `cea`
ON search_index.attribute_id = cea.attribute_id
LEFT JOIN `catalog_category_product_index` AS `category_ids_index`
ON search_index.entity_id = category_ids_index.product_id
LEFT JOIN `review_entity_summary` AS `rating`
ON `rating`.`entity_pk_value`=`search_index`.entity_id
AND `rating`.entity_type = 1
AND `rating`.store_id = 1
WHERE (category_ids_index.category_id = 2299)
) AS `main_select`
GROUP BY `entity_id`
ORDER BY `relevance` DESC
LIMIT 10000
why does this query use my full CPU resources?
Upvotes: 1
Views: 455
Reputation: 350335
Some inefficiencies:
There is a non-null condition on the records of the outer joined catalog_category_product_index
. This turns the outer join into an inner join. It will be more efficient to use an inner join
clause.
There is no need to have a nested query: the grouping, ordering and limiting can be done directly on the inner query.
(((0)) * 1)
is just a complex way of saying 0
, and taking the MAX
of that will obviously still return a relevance of 0
for all records. Not only is this an inefficient way to output 0
, it also makes no sense. I assume your real query has some less evident calculation there, which might need optimisation.
If catalog_eav_attribute.attribute_id
is a unique field, then there is no sense in outer joining that table, because that data is not used anywhere
If review_entity_summary.entity_pk_value
is unique (at least when entity_type = 1
and store_id = 1
), then again there is no use in outer joining that table, because that data is not used anywhere
If the fields in the above 2 bullet points are non-unique, but the number of records returned per search_index.entity_id
value is not influencing the result (as it currently stands with the obscure (((0)) * 1)
value, it does not), then neither outer join is needed either.
With these assumptions, the select
part can be reduced to:
SELECT search_index.entity_id,
MAX(((0)) * 1) AS relevance
FROM catalogsearch_fulltext_scope1 AS search_index
INNER JOIN catalog_category_product_index AS category_ids_index
ON search_index.entity_id = category_ids_index.product_id
WHERE category_ids_index.category_id = 2299
GROUP BY search_index.entity_id
ORDER BY relevance DESC
LIMIT 10000
I still left the (((0)) * 1)
in there, but it really makes no sense.
Upvotes: 3