Reputation: 11
I am trying to write some sql query for university. It works, but it takes a lot of time to be executed ( around 6 minutes for 30k rows). Here is the query:
Select
SalesNumber,
CustomerID,
od.ProductID,
od.PricePerUnit,
temp.Discount,
OrderAmount,
(OrderAmount * PricePerUnit ) - Discount as Total,
((OrderAmount * PricePerUnit ) - Discount) - (OrderAmount * StandardCost) as Profit,
case
when CountryRegion = "United States" then ((OrderAmount * PricePerUnit ) - Discount) * 4
when CountryRegion = "Canada" then ((OrderAmount * PricePerUnit ) - Discount) * 11
when CountryRegion = "United Kingdom" then ((OrderAmount * PricePerUnit ) - Discount) * 5
when CountryRegion = "Australia" then ((OrderAmount * PricePerUnit ) - Discount) * 7
end as TaxAmount,
CASE WHEN temp.HasAdditionalCosts <> 0
THEN case
when sm.`name` = "Standard Ground" then (OrderAmount * 3) + 6
when sm.`name` = "Cargo International" then (OrderAmount * 5) + 9
when sm.`name` = "Oversea Deluxe" then (OrderAmount * 2) + 11
end
ELSE case
when sm.`name` = "Standard Ground" then (OrderAmount * 3)
when sm.`name` = "Cargo International" then (OrderAmount * 5)
when sm.`name` = "Oversea Deluxe" then (OrderAmount * 2)
end
END as OrderLineFreightCost,
oh.`Status`,
oh.OrderDate,
oh.DueDate,
oh.ShipDate,
CASE WHEN oh.DueDate < oh.ShipDate
THEN 1
ELSE 0
End as IsLateShipment
FROM db_relational.Sales as oh
INNER JOIN db_relational.SalesDetail as od ON oh.SalesID = od.SalesID
INNER JOIN db_relational.Shipping as sm ON oh.ShippingID = sm.ShippingID
INNER JOIN db_relational.Location as ad ON ad.LocationID = oh.ShipToLocationID
INNER JOIN (SELECT tempOD.SalesDetailID,
case
when p.ProductCategory = "Clothing" and OrderAmount >= 5 then (OrderAmount * PricePerUnit * 0.08)
when p.ProductCategory = "Clothing" and OrderAmount >= 10 then (OrderAmount * PricePerUnit * 0.2)
when p.ProductCategory = "Accessories" and OrderAmount >= 5 then (OrderAmount * PricePerUnit * 0.06)
when p.ProductCategory = "Accessories" and OrderAmount >= 10 then (OrderAmount * PricePerUnit * 0.15)
else 0
end as Discount,
p.HasAdditionalCosts,
p.StandardCost
From db_relational.SalesDetail as tempOD
INNER JOIN db_transformed.dm_product as p ON tempOD.ProductID = p.ProductID
) AS temp ON temp.SalesDetailID = od.SalesDetailID
Could someone tell me is there any way to improve performance. Am I using too many subselects and switch cases ?
Thanks!
Execution Plan: Image
Upvotes: 1
Views: 45
Reputation: 484
Normally, Switch cases wouldn't affect the performance of the query too much. But, please becareful when using Sub Select & Inner Join, unnecessary usage will increase the data traversal to make the query slower and heavier.
I found that following line is redundant:-
INNER JOIN db_relational.SalesDetail as od
You should remove the line and utilize the sub query as below:-
Select
SalesNumber,
CustomerID,
temp.ProductID,
temp.PricePerUnit,
temp.Discount,
OrderAmount,
(OrderAmount * PricePerUnit ) - Discount as Total,
((OrderAmount * PricePerUnit ) - Discount) - (OrderAmount * StandardCost) as Profit,
case
when CountryRegion = "United States" then ((OrderAmount * PricePerUnit ) - Discount) * 4
when CountryRegion = "Canada" then ((OrderAmount * PricePerUnit ) - Discount) * 11
when CountryRegion = "United Kingdom" then ((OrderAmount * PricePerUnit ) - Discount) * 5
when CountryRegion = "Australia" then ((OrderAmount * PricePerUnit ) - Discount) * 7
end as TaxAmount,
CASE WHEN temp.HasAdditionalCosts <> 0
THEN case
when sm.`name` = "Standard Ground" then (OrderAmount * 3) + 6
when sm.`name` = "Cargo International" then (OrderAmount * 5) + 9
when sm.`name` = "Oversea Deluxe" then (OrderAmount * 2) + 11
end
ELSE case
when sm.`name` = "Standard Ground" then (OrderAmount * 3)
when sm.`name` = "Cargo International" then (OrderAmount * 5)
when sm.`name` = "Oversea Deluxe" then (OrderAmount * 2)
end
END as OrderLineFreightCost,
oh.`Status`,
oh.OrderDate,
oh.DueDate,
oh.ShipDate,
CASE WHEN oh.DueDate < oh.ShipDate
THEN 1
ELSE 0
End as IsLateShipment
FROM db_relational.Sales as oh
INNER JOIN db_relational.Shipping as sm ON oh.ShippingID = sm.ShippingID
INNER JOIN db_relational.Location as ad ON ad.LocationID = oh.ShipToLocationID
INNER JOIN (SELECT tempOD.SalesDetailID,
case
when p.ProductCategory = "Clothing" and OrderAmount >= 5 then (OrderAmount * PricePerUnit * 0.08)
when p.ProductCategory = "Clothing" and OrderAmount >= 10 then (OrderAmount * PricePerUnit * 0.2)
when p.ProductCategory = "Accessories" and OrderAmount >= 5 then (OrderAmount * PricePerUnit * 0.06)
when p.ProductCategory = "Accessories" and OrderAmount >= 10 then (OrderAmount * PricePerUnit * 0.15)
else 0
end as Discount,
p.HasAdditionalCosts,
p.StandardCost,
tempOD.SalesID,
tempOD.SalesDetailID,
tempOD.ProductID,
tempOD.PricePerUnit,
From db_relational.SalesDetail as tempOD
INNER JOIN db_transformed.dm_product as p ON tempOD.ProductID = p.ProductID
) AS temp ON temp.SalesID = oh.SalesID
Upvotes: 1
Reputation: 331
The main problem seems to be your temp
table.
You are essentially creating a new table by joining tempOD
and p
, and then asking MySQL to join it with everything else. The caveat is that you're most likely losing any indexes in the process.
A better way to approach the problem would be to have all the joins in a subquery, and then process the data (with CASE
statements or any other transformations you need) in one or more outer queries. It would result in something like this:
SELECT PROCESS_DATA(raw_data_1), raw_data_2, ...
FROM (
SELECT raw_data_1, raw_data_2, ...
FROM table_1
JOIN table_2 ON ...
JOIN table_3 ON ...
...
JOIN table_n ON ...
) tables
Upvotes: 0