Reputation:
I would like to optimize a SQL statement, below is the original one.
SELECT DISTINCT
p.productID,
p.filename,
p.model,
p.code,
p.manufacturerID,
f2.manufacturerName,
m.manufacturerName,
CONCAT('INSERT INTO distribItems(productID, manufacturer, code, distributorText) VALUES (',
CAST(p.productID AS CHAR),
', \'',
f2.manufacturerName,
'\', \'',
f2.code,
'\', \'',
f2.denumire,
'\') ;') INS
FROM
(SELECT
f.manufacturerName, f.categoryName, f.code, f.denumire, f.code_2
FROM
furnizorlist f
LEFT JOIN distribitems d ON
(d.manufacturer = f.manufacturerName
AND (d.code = f.code OR d.manufacturer LIKE 'DELL')
AND d.distributorText = LEFT(f.denumire, 450))
WHERE
productID IS NULL) f2,
products p,
manufacturers m
WHERE
f2.code_2 <> ''
AND (f2.code_2 = p.code_2 OR f2.code_2 = p.model_2)
AND p.manufacturerID = m.manufacturerID
AND m.manufacturerName = f2.manufacturerName
AND m.manufacturerName != 'Compatibil'
AND p.code != '1'
ORDER by p.filename ASC;
On my PC it takes around 34 seconds.
My idea was to write the subquery as a Join and set the conditions in the Where clause.
Here is my incredible fast SQL:
SELECT DISTINCT
p.productID,
p.filename,
p.model,
p.code,
p.manufacturerID,
f.manufacturerName,
m.manufacturerName,
CONCAT('INSERT INTO distribItems(productID, manufacturer, code, distributorText) VALUES (',
CAST(p.productID AS CHAR),
', \'',
f.manufacturerName,
'\', \'',
f.code,
'\', \'',
f.denumire,
'\') ;') INS
FROM
furnizorlist f,
distribitems d,
#subquery end
products p,
manufacturers m
WHERE
d.manufacturer = f.manufacturerName
AND (d.code = f.code OR d.manufacturer LIKE 'DELL')
AND d.distributorText = LEFT(f.denumire, 450)
AND d.productID IS NULL
#subquery condions end (f and d tables)
# the next is a subquery result:
AND f.code_2 <> ''
AND (f.code_2 = p.code_2 OR f.code_2 = p.model_2)
AND p.manufacturerID = m.manufacturerID
AND m.manufacturerName = f.manufacturerName
AND m.manufacturerName != 'Compatibil'
AND p.code != '1'
ORDER by p.filename ASC;
If I write to explain improved_sql
I will see an Impossible WHERE column.
I tried but couldn't find why it is impossible. I checked the fields' compatibility: to not have a case where it require to compare an int with a varchar and like that. I couldn't find any major error, that's why I am here.
Is there a logical error in the WHERE clause? like asking a field to be 1
and later to be 2
?
Fields which ends with ID are INT Fields which ends with NAME, code are varchar (255) Fields with Text are 450 and 8192 varchar ( in only one place is used)
traslation: "denumire" means "description" - or something like that :)
Not sure which version of mysql is running server side, probably 5.
I can suggest table structure changes if I have a good reason, probably the code will be adjusted.
EDIT:
The impossible where it comes from here:
EXPLAIN
SELECT
f.manufacturerName,
f.categoryName,
f.code,
f.denumire,
f.code_2
FROM
furnizorlist f
INNER JOIN
distribitems d ON (d.manufacturer = f.manufacturerName
AND (d.code = f.code
OR d.manufacturer LIKE 'DELL')
AND d.distributorText = LEFT(f.denumire, 450))
WHERE
productID IS NULL
Remark: INNER JOIN , not LEFT JOIN.
Edit2: Table: furnizorlist 42,751 records Table: distribitems 72,290 records
Upvotes: 4
Views: 218
Reputation: 5449
Not entirely sure but here's a few points that many help, if you're going to use an inner table (you're first example) - you really have to try and narrow down what it's pulling back. This may mean writing some redundant where clauses. For example, "where order_dt is between 2001 and 2002" might end up in the smaller inner table, AND the main query. Why? Because the inner table can't be optimized in some situations - resulting in million of rows the server retrieves temporarily, for just the few you need.
Also, I noticed a string function in one of your joins "LEFT(f.denumire, 450))"
Any kind of function should be avoided during a join, this forces the server to (one by one) evaluate every single record.. it cannot optimize. This is somewhat along the same lines of why you should always use primary keys for a join, but more time consuming.. best stick with "like" = AND, NOT, OR, IN.. etc.
Upvotes: 1