user3048109
user3048109

Reputation: 217

Any possible way to rewrite this query for best performance?

Below query is taking long time frequently logging to slow log.

Is there any possible way to rewrite the below query, i mean better than current query.

select 
    p . *,
    pt.pretty_name,
    pt.seo_name,
    pt.description,
    pt.short_description,
    pt.short_description_2
from
    cat_products p,
    cat_product_catalog_map pcm,
    cat_current_product_prices cpp,
    cat_product_text pt
where
    pcm.catalog_id = 2
        and pcm.product_id = p.product_id
        and p.owner_catalog_id = 2
        and cpp.product_id = p.product_id
        and cpp.currency = 'GBP'
        and cpp.catalog_id = 2
        and cpp.state <> 'unavail'
        and pt.product_id = p.product_id
        and pt.language_id = 'EN'
        and p.product_id not in (select distinct
            product_id
        from
            cat_product_detail_map
        where
            short_value in ('ft_section' , 'ft_product'))
order by pt.pretty_name
limit 200 , 200;

Upvotes: 0

Views: 91

Answers (3)

Thorsten Kettner
Thorsten Kettner

Reputation: 94939

To start with: Your statement is not as readable as it should be. As you don't use the ANSI join syntax it is some work to see how the tables are related. My first guess was that you link table to table to get from cat_products to cat_product_text. That turned out wrong.

What you do is select fields from cat_products and cat_product_text only. So why join with the other two tables at all? Are you trying to see if records in these tables exist? Then use the EXISTS clause for that.

When joining tables on the catalogue id, then show the dbms that this links your tables (pcm.catalog_id = p.owner_catalog_id) rhather than making it seem they are not related (pcm.catalog_id = 2 , p.owner_catalog_id = 2).

You should not have to use DISTINCT for an IN query. The dbms should decide by itself if it is of advantage to remove duplicates or not in that set.

The following query doesn't necessarily produce the same reasult as your original query. It depends on the purpose of joining the tables cat_product_catalog_map and cat_current_product_prices. I assume, it is just a check for existence. So the query can be re-written as follows. It should be faster, because you don't have to join tables that don't add fields to the result. But even this depends on table sizes etc.

SELECT 
  p.*,
  pt.pretty_name,
    pt.seo_name,
    pt.description,
    pt.short_description,
    pt.short_description_2
FROM cat_products p
JOIN cat_product_text pt ON pt.product_id = p.product_id and pt.language_id = 'EN'
WHERE p.owner_catalog_id = 2 
AND p.product_id NOT IN 
(
  SELECT product_id
  FROM cat_product_detail_map
  WHERE short_value in ('ft_section','ft_product')
)
AND EXISTS
(
  SELECT *
  FROM cat_product_catalog_map pcm 
  WHERE pcm.product_id = p.product_id
  AND pcm.catalog_id = p.owner_catalog_id
)
AND EXISTS
(
  SELECT *
  FROM cat_current_product_prices cpp 
  WHERE cpp.product_id = p.product_id 
  AND cpp.catalog_id = p.owner_catalog_id
  AND cpp.currency = 'GBP'
  AND cpp.state <> 'unavail'
)
ORDER BY pt.pretty_name LIMIT 200,200;

Upvotes: 0

GarethD
GarethD

Reputation: 69769

Firstly I would switch to ANSI 92 explicit join syntax rather than the ANSI 89 implicit join syntax you are using, as the name suggests this is over 20 years out of date:

select  ...
from    cat_products p
        INNER JOIN cat_product_catalog_map pcm
            ON pcm.product_id=p.product_id  
        INNER JOIN cat_current_product_prices cpp
            ON cpp.product_id = p.product_id 
        INNER JOIN cat_product_text pt 
            ON pt.product_id=p.product_id 
WHERE ....

This won't affect performance but will make your query more legible, and less prone to accidental cross joins. Aaron Bertrand has written a good article on the reasons to switch that is worth a read (it is aimed at SQL Server but many of the principles are universal). Then I would remove the NOT IN (Subquery) MySQL does not optimise subqueries like this well. It will rewrite it to:

AND NOT EXISTS (SELECT 1
                FROM    cat_product_detail_map 
                WHERE   short_value in ('ft_section','ft_product')
                AND     cat_product_detail_map.product_id = p.product_id
                )

It will then execute this subquery once for every row. The inverse of this scenario (WHERE <expression> IN (Subquery) is described in the article Optimizing Subqueries with EXISTS Strategy)

You can exclude these product_ids using the LEFT JOIN/IS NULL method which performs better in MySQL as it avoids a subquery completely:

SELECT  ...
FROM    cat_products p
        LEFT JOIN cat_product_detail_map exc
            ON exc.product_id = p.product_id
            AND exc.short_value in ('ft_section','ft_product')
WHERE   exc.product_id IS NULL

This allows for better use of indexes and means that you don't have to execute a subquery for every row in the outer query.

So your full query would then be:

SELECT  p.*,
        pt.pretty_name, 
        pt.seo_name, 
        pt.description, 
        pt.short_description, 
        pt.short_description_2 
FROM    cat_products p
        INNER JOIN cat_product_catalog_map pcm
            ON pcm.product_id = p.product_id  
        INNER JOIN cat_current_product_prices cpp
            ON cpp.product_id = p.product_id 
        INNER JOIN cat_product_text pt 
            ON pt.product_id = p.product_id 
        LEFT JOIN cat_product_detail_map exc
            ON exc.product_id = p.product_id
            AND exc.short_value in ('ft_section','ft_product')
WHERE   exc.product_id IS NULL
AND     pcm.catalog_id = 2 
AND     p.owner_catalog_id = 2  
AND     cpp.currency = 'GBP' 
AND     cpp.catalog_id = 2  
AND     cpp.state <> 'unavail'  
AND     pt.language_id = 'EN' 
ORDER BY pt.pretty_name limit 200,200;

The final thing to look at would be the indexes on your tables, I don't know what you already have but I'd suggest an index on product_id on each of your tables as a bare minimum, and perhaps on the columns you are filtering on.

Upvotes: 2

Alberto Solano
Alberto Solano

Reputation: 8227

The performance of a query depends on these factors:

  • Indexes.
  • The structure of the query itself.

For the first case, you said that there are indexes in the proper columns, but make sure the product_id column is one of them.

About the query, you can use the JOIN operator to simplify it:

select 
    p.*,
    pt.pretty_name,
    pt.seo_name,
    pt.description,
    pt.short_description,
    pt.short_description_2
from
    cat_products p
    join cat_product_catalog_map pcm on p.product_id = pcm.product_id
    join cat_current_product_prices cpp on p.product_id = cpp.product_id
    join cat_product_text pt on p.product_id = pt.product_id
    join (select distinct product_id from cat_product_detail_map
    where short_value NOT in ('ft_section' , 'ft_product') otherProdId on otherProdId.product_id = p.product_id
where
    pcm.catalog_id = 2
        and p.owner_catalog_id = 2
        and cpp.currency = 'GBP'
        and cpp.catalog_id = 2
        and cpp.state <> 'unavail'
        and pt.language_id = 'EN'
order by pt.pretty_name
limit 200,200;

Upvotes: 0

Related Questions