steve.g
steve.g

Reputation: 115

Maximizing query efficiency

I have a query that takes about 3 minutes to run. The Query is being used for reporting and i would like it to be more efficient. I think the bottle neck is the inner joins, I could be wrong just want to know if you guys had any solutions.

    SELECT DISTINCT 
    SUBSTRING(T1.DateTime, 1, 4) AS Year, 
    SUBSTRING(T1.DateTime, 5, 2) AS Month, 
    SUBSTRING(T1.DateTime, 7, 2) AS Day, 
    T1.PipeNr, T1.SalesOrder, T1.JobNr, SIST.DefectCode AS DefectRef, 
    DEFCODES.DefectCode, DEFCODES.DefectDesc, SIST.ODYes, SIST.LocationWrap, 
    T1.OWS0601, T1.OWS0602, T1.OWS0603, T1.CrossWeld

    FROM PIPEDB.dbo.SIST INNER JOIN PIPEDB.dbo.MPIPEID T1 ON PIPEDB.dbo.SIST.PipeNr = T1.PipeNr INNER JOIN 
    PIPEDB.dbo.DEFCODES ON PIPEDB.dbo.SIST.DefectCode = PIPEDB.dbo.DEFCODES.DefectRef

    WHERE PIPEDB.dbo.SIST.DefectCode
    IN (
    SELECT Top (10) PIPEDB.dbo.SIST.DefectCode

    FROM PIPEDB.dbo.SIST INNER JOIN PIPEDB.dbo.MPIPEID T2 ON PIPEDB.dbo.SIST.PipeNr = T2.PipeNr INNER JOIN 
    PIPEDB.dbo.DEFCODES ON PIPEDB.dbo.SIST.DefectCode = PIPEDB.dbo.DEFCODES.DefectRef

    WHERE SUBSTRING(T2.DateTime, 1, 4) = SUBSTRING(T1.DateTime, 1, 4) AND SUBSTRING(T2.DateTime, 5, 2) = SUBSTRING(T1.DateTime, 5, 2) AND 
    SUBSTRING(T2.DateTime, 7, 2) = SUBSTRING(T1.DateTime, 7, 2)

    GROUP BY PIPEDB.dbo.SIST.DefectCode

    ORDER BY COUNT(PIPEDB.dbo.SIST.PipeNr) DESC) 

    AND (PIPEDB.dbo.DEFCODES.DefectDesc IN ("Cut To Remove Defect")) AND 
    ((CASE WHEN T1.CrossWeld = 1 THEN 1 WHEN T1.CrossWeld = 0 THEN 2 END) = @Crossweld OR @Crossweld = 0)

Upvotes: 0

Views: 129

Answers (3)

whytheq
whytheq

Reputation: 35557

Do you have to have DISTINCT ? I'd rather use GROUP BY to return a unique set of records e.g

SELECT DISTINCT X
FROM Y

change to

SELECT X
FROM Y
GROUP BY X

Also I prefer using EXISTS to using IN in my WHERE clauses as they are generally marginally faster.

I'd be tempted to put the following section in a CTE and use a partition function RANK() OVER instead of the TOP(10)/ORDER BY set up

IN (
SELECT Top (10) PIPEDB.dbo.SIST.DefectCode

FROM PIPEDB.dbo.SIST INNER JOIN PIPEDB.dbo.MPIPEID T2 ON PIPEDB.dbo.SIST.PipeNr = T2.PipeNr INNER JOIN 
PIPEDB.dbo.DEFCODES ON PIPEDB.dbo.SIST.DefectCode = PIPEDB.dbo.DEFCODES.DefectRef

WHERE SUBSTRING(T2.DateTime, 1, 4) = SUBSTRING(T1.DateTime, 1, 4) AND SUBSTRING(T2.DateTime, 5, 2) = SUBSTRING(T1.DateTime, 5, 2) AND 
SUBSTRING(T2.DateTime, 7, 2) = SUBSTRING(T1.DateTime, 7, 2)

GROUP BY PIPEDB.dbo.SIST.DefectCode

ORDER BY COUNT(PIPEDB.dbo.SIST.PipeNr) DESC)

Upvotes: 0

Gulli Meel
Gulli Meel

Reputation: 891

The issue is that you are making IN subquery as a correlated subquery which in turn is causing the performance issue.For each rows from the main clause (or outer from) the IN subquery is being executed and thus it is taking time.Each execution of Inner subquery could be expensive based on size of tables and most probably it will use a nested loop join for in subquery. Also if number of rows from outer query are quite high then the expensive IN subquery is being executed lots of time and thus degrading the performance of query.I would suggest that have a look at the logic once again and make sure that it is giving correct data.

See below.I have tried to reproduce your issue.The cost for this query is 2045 which is very huge and off course it is going to take time. It clocks around 10 million io's 176 sec of CPU and 48 seconds elapsed time.

That's not it.I expected to have just 10 productid(I am sure you alos expect to have just 10 defectcodes) but i could see lots of other productids just because of the correlated subquery.

set statistics io,time on

go

select * from Sales.SalesOrderHeader soh inner join sales.SalesOrderDetail sod
on sod.SalesOrderID=soh.SalesOrderID
where sod.ProductID in (
select top 10 ProductID
from Sales.SalesOrderHeader soh1 inner join sales.SalesOrderDetail sod
on sod.SalesOrderID=soh1.SalesOrderID
where datepart(yy,soh.OrderDate) = DATEPART(yy,soh1.OrderDate)
and datepart(mm,soh.OrderDate) = DATEPART(mm,soh1.OrderDate)

)

   order by sod.ProductID

Upvotes: 1

James
James

Reputation: 2841

Follow these instructions to see it SQL Server recommends any indexes:

Place the query in a query window in Microsoft SQL Server Management Studio.

Instead of running it, press CTRL+L (or select Query > Display Estimated Execution Plan from the menus).

Then in the Execution Plan tab, you may see some text that is green and starts with "Missing Index (Impact xx.xxxx%)..." if you do, right click that text and select Missing Index Details. This will open a new window with the suggested index definition (you have to supply the index name). Example:

/*
Missing Index Details from SQLQuery3.sql - (local).master (sa (55))
The Query Processor estimates that implementing the following index could improve the query cost by 99.3783%.
*/

/*
USE [DatabaseName]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Table] ([Column1])
INCLUDE ([Column2],[Column2])
GO
*/

Replace '<Name of Missing Index, sysname,>' with a valid index name and create the index.

These steps can be repeated until no indexes are suggested.

Beyond that, using SubString in your where clause is probably a problem.

This:

WHERE SUBSTRING(T2.DateTime, 1, 4) = SUBSTRING(T1.DateTime, 1, 4) AND SUBSTRING(T2.DateTime, 5, 2) = SUBSTRING(T1.DateTime, 5, 2) AND SUBSTRING(T2.DateTime, 7, 2) = SUBSTRING(T1.DateTime, 7, 2)

Could definately be this:

WHERE SUBSTRING(T2.DateTime, 1, 8) = SUBSTRING(T1.DateTime, 1, 8)

Or even better:

WHERE LEFT(T2.DateTime, 8) = Left(T1.DateTime, 1, 8)

Or even better (can allow more index usage):

WHERE T2 LIKE Left(T1.DateTime, 1, 8) + '%'

And if these date strings are only 8 characters long, it could be this:

WHERE T2.DateTime = T1.DateTime

If the query still does not run in the desired time, it would be a good move to restructure the data such that SUBSTRING or LEFT are not required in the where clause... i.e. break out the date in it's own column.

Upvotes: 0

Related Questions