Reputation: 41040

SQL Optimization

How to optimize this?

SELECT e.attr_id, e.sku, a.value

FROM product_attr AS e, product_attr_text AS a

WHERE e.attr_id = a.attr_id
AND value
IN (
    SELECT value
    FROM product_attr_text 
    WHERE attribute_id = (
        SELECT attribute_id
        FROM eav_attr 
        WHERE attribute_code = 'similar_prod_id' 
    ) 
    AND value != ''

    GROUP BY value
    HAVING (COUNT( value ) > 1 )
)

Upvotes: 2

Answers (7)

Jonathan Leffler

Reputation: 753595

Caution: Be wary of designs that use EAV (entity, attribute, value) table designs. They tend to become very difficult to handle, for reasons of referential integrity and query complexity.

Optimize by using:

Joins instead of sub-selects.
Join notation.
Explicit table aliases on all column names.

Phase 1:

SELECT e.attr_id, e.sku, a.value
  FROM product_attr AS e JOIN product_attr_text AS a
       ON e.attr_id = a.attr_id
 WHERE a.value IN (
          SELECT p.value
            FROM product_attr_text AS p
            JOIN eav_attr AS v ON p.attribute_id = v.attribute_id
           WHERE v.attribute_code = 'similar_prod_id'
             AND p.value != ''
           GROUP BY value
          HAVING (COUNT( value ) > 1)
          )

Phase 2:

SELECT e.attr_id, e.sku, a.value
  FROM product_attr AS e
  JOIN product_attr_text AS a ON e.attr_id = a.attr_id
  JOIN (SELECT p.value
          FROM product_attr_text AS p
          JOIN eav_attr AS v ON p.attribute_id = v.attribute_id
         WHERE v.attribute_code = 'similar_prod_id'
           AND p.value != ''
         GROUP BY value
        HAVING (COUNT( value ) > 1)
       ) AS x ON x.value = a.value

Upvotes: 1

Quassnoi

Reputation: 425321

SELECT  e.attr_id, e.sku, a.value
FROM    (
        SELECT  pat.value
        FROM    eav_attr ea
        JOIN    product_attr_text pat
        ON      pat.attribute_id = ea.attribute_id
        WHERE   ea.attribute_code = 'similar_prod_id'
                AND value <> ''
        GROUP BY
                value
        HAVING  COUNT(*) > 1
        ) q
JOIN    product_attr_text AS a
ON      a.value = q.value
JOIN    product_attr AS e
ON      e.attr_id  = a.attr_id

Create indexes:

 eav_attr (attribute_code)
 product_attr_text (attribute_id, value)
 product_attr_text (value)
 product_attr (attr_id)

Upvotes: 2

remi bourgarel

Reputation: 9389

frederik answer is the right one, but i'd like to suggest a few point :

avoid IN, use EXISTS instead
avoid = (SELECT something from sometable), use exists instead
useless group by
use inner join instead of multiple table in from clause

but again, it depends on your machine/dbms/version etc... so to get the best performance you'll have to compare the different execution plan

Upvotes: 1

Jon Black

Reputation: 16559

use a derived table for the group by having... and join that back on whatever table(s) you need to pad out the result

Upvotes: 0

Eran Galperin

Reputation: 86805

Change it into a JOIN. MySQL doesn't optimize well subqueries in an IN() clause - it is recalculated per row (very inefficient for many rows)

SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e
INNER JOIN product_attr_text AS a ON e.attr_id = a.attr_id
INNER JOIN (SELECT value
    FROM product_attr_text 
    INNER JOIN eav_attr ON eav_attr.attribute_id=product_attr_text.attribute_id
    WHERE attribute_code = 'similar_prod_id'  
    AND value != ''
    GROUP BY value
    HAVING COUNT( value ) > 1 
) AS filter ON filter.value=a.value

After you've converted the query (you may need to make some corrections according to your schema), run EXPLAIN on the query and index accordingly.

Upvotes: 1

Cătălin Pitiș

Reputation: 14341

Change the second inner select into a join:

SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e, product_attr_text AS a
WHERE e.attr_id = a.attr_id
AND value
IN (
    SELECT at.value
    FROM product_attr_text at, eav_attr eat 
    WHERE at.attribute_id = eat.attribute_id 
          AND eat.attribute_code = 'similar_prod_id' 
          AND value != ''
    GROUP BY value
    HAVING (COUNT( value ) > 1 )
)

Then, take a look to the execution plan to see how the query is solved. You might need additional optimizations based on that execution plan (e.g. make sure the attribute_id field is indexed).

Upvotes: 1

Frederik Gheysels

Reputation: 56934

Difficult to answer ...

The only thing I can say is:

look at the execution plan
use IO statistics-

and take a look what part of the statement is problematic, and why. Find out if you can improve performance by adding an index, etc...

Next to that, I don't understand why you're using a group by clause ...

And, why do you use subqueries, instead of joining tables ? (Although I don't think that it will make much difference, since the SQL Server optimizer should be smart enough to generate the same execution plan).

Upvotes: 1

SQL Optimization

Answers (7)

Phase 1:

Phase 2:

Related Questions