gregs
gregs

Reputation: 635

Optimize SQL Query to avoid Hash Match (Aggregate)

I have a SQL query that takes 7 minutes+ to return results. I'm trying to optimize as much as possible and the Execution plan loses 82% of the time on a Hash Match (Aggregate). I've done some searching and looks like using an "EXISTS" would help to resolve, but I haven't figured out the syntax of the query to make it work. Here's the query:

select dbo.Server.Name,
                dbo.DiskSpace.Drive,
                AVG(dbo.DiskSpace.FreeSpace) as 'Free Disk Space',
                AVG(dbo.Processor.PercentUsed) as 'CPU % Used',
                AVG(dbo.Memory.PercentUtilized) as '% Mem Used'

                from Server
                join dbo.DiskSpace on dbo.Server.ID=DiskSpace.ServerID
                join dbo.Processor on dbo.Server.ID=Processor.ServerID
                join dbo.Memory on dbo.Server.ID=dbo.Memory.ServerID

                where
                dbo.Processor.ProcessorNum='_Total' 
                    and dbo.Processor.Datetm>DATEADD(DAY,-(1),(CONVERT (date, GETDATE()))) 
                    and (      dbo.Server.Name='qp-ratking' 
                            or dbo.Server.Name='qp-hyper2012' 
                            or dbo.Server.Name='qp-hyped' 
                            or dbo.Server.Name='qp-lichking')
                Group By dbo.server.name, Dbo.DiskSpace.Drive
                Order By Dbo.Server.Name, dbo.DiskSpace.Drive;

How do I reduce/eliminate the joins using EXISTS? Or if there is a better way to optimize, I'm up for that too. Thanks

Upvotes: 10

Views: 37442

Answers (4)

Nick
Nick

Reputation: 29

At the very least I'd start with getting rid of all those OR clauses.

AND (dbo.Server.Name='qp-ratking'
      OR dbo.Server.Name='qp-hyper2012'
      OR dbo.Server.Name='qp-hyped'
      OR dbo.Server.Name='qp-lichking')

and replace with

AND dbo.Server.Name in ('qp-ratking','qp-hyper2012','qp-hyped','qp-lichking')  

I'm not sure about converting everything to CTEs though. You can't index CTEs and I'm yet to come across an occasion where CTEs outperform a regular query. Your initial query seemed well formed apart from the over use of OR as mentioned above, so I'd be looking at indexes next.

Upvotes: 2

gregs
gregs

Reputation: 635

A co-worker broke down the query and pulled out data in smaller chunks so there wasn't as much processing of the data returned by the joins. It cut it down to less than 1 second return. New Query:

WITH tempDiskSpace AS
(
SELECT dbo.Server.Name
      ,dbo.DiskSpace.Drive
      ,AVG(dbo.DiskSpace.FreeSpace) AS 'Free Disk Space'

FROM dbo.DiskSpace
      LEFT JOIN dbo.Server ON dbo.DiskSpace.ServerID=Server.ID

WHERE dbo.DiskSpace.Datetm>DATEADD(DAY,-(1),(CONVERT (date, GETDATE())))
AND (dbo.Server.Name='qp-ratking'
      OR dbo.Server.Name='qp-hyper2012'
      OR dbo.Server.Name='qp-hyped'
      OR dbo.Server.Name='qp-lichking')

GROUP BY Name, Drive
)
,tempProcessor
AS
(
SELECT dbo.Server.Name
      ,AVG(dbo.Processor.PercentUsed) AS 'CPU % Used'

FROM dbo.Processor
      LEFT JOIN dbo.Server ON dbo.Processor.ServerID=Server.ID

WHERE dbo.Processor.Datetm>DATEADD(DAY,-(1),(CONVERT (date, GETDATE())))
AND dbo.Processor.ProcessorNum='_Total'
AND (dbo.Server.Name='qp-ratking'
      OR dbo.Server.Name='qp-hyper2012'
      OR dbo.Server.Name='qp-hyped'
      OR dbo.Server.Name='qp-lichking')

GROUP BY Name
)
,tempMemory
AS
(
SELECT dbo.Server.Name
      ,AVG(dbo.Memory.PercentUtilized) as '% Mem Used'

FROM dbo.Memory
      LEFT JOIN dbo.Server ON dbo.Memory.ServerID=Server.ID

WHERE dbo.Memory.Datetm>DATEADD(DAY,-(1),(CONVERT (date, GETDATE())))
AND (dbo.Server.Name='qp-ratking'
      OR dbo.Server.Name='qp-hyper2012'
      OR dbo.Server.Name='qp-hyped'
      OR dbo.Server.Name='qp-lichking')

GROUP BY Name
)

SELECT tempDiskSpace.Name, tempDiskSpace.Drive, tempDiskSpace.[Free Disk Space], tempProcessor.[CPU % Used], tempMemory.[% Mem Used]
FROM tempDiskSpace
LEFT JOIN tempProcessor ON tempDiskSpace.Name=tempProcessor.Name
LEFT JOIN tempMemory ON tempDiskSpace.Name=tempMemory.Name
ORDER BY Name, Drive;

Thanks for all the suggestions.

Upvotes: 8

Gouri Shankar Aechoor
Gouri Shankar Aechoor

Reputation: 1581

The statement looks reasonably structured and do not see a huge scope for optimization provided the per-requisits are addressed such as

  1. Check Index Fragmentation and ensure all Indexes are maintained
  2. Check if Statistics are up to date
  3. If dirty ready are acceptable then worth consider applying WITH (NOLOCK) on the tables.
  4. If the query allows declaring variables then moving the DATEADD out of the Filter statement as below can be beneficial.

Hope this helps.

-- Assuming Variables can be declared see the script below. 
-- I made a few changes per my coding standard only to help me read better.

DECLARE @dt_Yesterdate DATE

SET @dt_Yesterdate = DATEADD(DAY, -(1), CONVERT (DATE, GETDATE()))

SELECT s.Name,
       ds.Drive,
       AVG(ds.FreeSpace) AS 'Free Disk Space',
       AVG(P.PercentUsed) AS 'CPU % Used',
       AVG(m.PercentUtilized) AS '% Mem Used'
FROM Server s
     JOIN dbo.DiskSpace AS ds
         ON s.ID = ds.ServerID
     JOIN dbo.Processor AS p
         ON s.ID = p.ServerID
     JOIN dbo.Memory AS m
         ON s.ID = m.ServerID
WHERE P.ProcessorNum = '_Total'
  AND P.Datetm > @dt_Yesterdate
  AND s.Name IN ('qp-ratking', 'qp-hyper2012', 'qp-hyped','qp-lichking')
GROUP BY s.name, ds.Drive
ORDER BY s.Name, ds.Drive;

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1270061

I would start by checking the indexes. Are all the keys used in the join defined as primary keys? Or do they at least have indexes?

Then, additional indexes on Processor and Server might help:

create index idx_Processor_ProcessorNum_Datetm_ServerId on ProcessorNum(ProcessorNum, Datetm, ServerId);
create index idx_Server_Name_ServerId on Server(Name, ServerId)

Upvotes: 2

Related Questions