Rahul Patel
Rahul Patel

Reputation: 317

SQL query optimization in SQL Server

In SQL Server, query execution plan, two operations (Parallelism and Hash Match) are getting 30 % and 45 % of the total cost.

What does it mean of Parallelism and Hash Match?

For Parallelism, I have checked on this link that Number of Parallelism can effect performance of query, how do I check what is number of degree of parallelism of server?

How to reduce this cost? I have no idea how can I reduce this cost.

enter image description here

enter image description here

My query is returning 42 million rows and joining 5 tables; no where conditions, no group by, order by clauses.

I have non-clustered indexes on join columns.

My query is:

SELECT
    [INV].SKU AS [INV_SKU],
    [INV].LOCATION_ID AS [INV_LOCATION_ID],
    [INV].DATE AS [INV_BALANCE_DATE],
    [INV].COST AS INV_COST,
    [ITEM].ITEM_ID,
    [ITEM].ITEM_NAME,
    [SPITEM].ITEMNumber As SP_ITEMID,
    [SPITEM].NAME,
    [SPITEM_DEPT].[SKEY],
    [SPITEM_DEPT].[DEPT_NAME],
    [Time].[DATE] AS [CAL_DATE],
    [Time].[CAL_NAME] AS [CAL_NAME],
    [Time].[YEAR_NAME] AS [YEAR_NAME],
    [Time].[YEAR_NUM] AS [YEAR_NUM],
    [Time].[YEAR_START_DT] AS [YEAR_START_DT],
    ISNUL(convert(INT, convert(varchar, [Time].[WEEK_END_DT], 112)), 0) AS [WEEK_END_DT_SKEY],
    CASE
       WHEN [ITEM].DEPARTMENT IS NULL
          THEN (CASE 
                   WHEN [SPITEM_DEPT].SPITEM_DEPT_NAME = 'UNSPECIFIED' 
                      THEN 0 
                   ELSE [SPITEM_DEPT].SPITEM_DEPT_NAME 
                END)
       ELSE [ITEM].DEPARTMENT 
    END AS [DEPARTMENTNUM],
    CASE 
       WHEN [ITEM].[DEPARTMENT_DESCRIPTION] IS NULL
          THEN [SPITEM_DEPT].[DESCRIPTION]   
          ELSE [ITEM].[DEPARTMENT_DESCRIPTION] 
    END AS [ITEM_DEPARTMENT_DESC],
    [LOCATION].LOCATION_NAME,
    [LOCATION].COUNTRY,
    [LOCATION].CURRENCY,
    [CURRENCY].BASE_CURRENCY
FROM 
    [dbo].[Table1] [INV]
LEFT JOIN 
    dbo.Table2 AS [ITEM] ON ([INV].SKU = [ITEM].SKU )
LEFT JOIN 
    dbo.Table3 AS [LOCATION] ON ([INV].LOCATION_ID = [LOCATION].LOCATION_ID)
INNER JOIN 
    dbo.Table4 [Time] ON ([INV].DATE = [Time].DATE)
LEFT JOIN 
    dbo.Table5 [SPITEM] ON ([INV].SKU = SPITEM.NAME)
LEFT JOIN 
    [dbo].[Table6] [SPITEM_DEPT] ON ([SPITEM].[WS_KPI_ITEM_MERCHANDISE_DEPARTMENT_SKEY] = [SPITEM_DEPT].[SKEY])
LEFT JOIN 
    [dbo].[Table7] [CURRENCY] ON ([INV].DATE BETWEEN [CURRENCY].BEGIN_DT AND [CURRENCY].END_DT AND [LOCATION].CURRENCY= [CURRENCY].LOCAL_CURRENCY)

and we have non-clustered index on joining columns too.

Please suggest possible solutions, I can try.

I am newbie to SQL Server query optimization.

Upvotes: 0

Views: 362

Answers (2)

pacreely
pacreely

Reputation: 1931

Do some tests with these indexes (not all at once).

Focus on your larger tables first.

Also in your Join to table7 try and change your BETWEEN statement to sample below

[INV].DATE >= [CURRENCY].BEGIN_DT 
AND 
[INV].DATE <= [CURRENCY].END_DT



CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table1](SKU)
CREATE NONCLUSTERED INDEX idx2 ON [dbo].[Table1](LOCATION_ID)
CREATE NONCLUSTERED INDEX idx3 ON [dbo].[Table1](DATE) INCLUDE(COST)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table2](SKU) INCLUDE(ITEM_ID,ITEM_NAME,DEPARTMENT,DEPARTMENT_DESCRIPTION)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table3](LOCATION_ID,CURRENCY) INCLUDE(LOCATION_NAME,COUNTRY)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table4](DATE) INCLUDE(CAL_NAME,YEAR_NAME,YEAR_NUM,YEAR_START_DT,WEEK_END_DT)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table5](NAME,WS_KPI_ITEM_MERCHANDISE_DEPARTMENT_SKEY) INCLUDE(ITEMNumber)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table6](SKEY) INCLUDE(DEPT_NAME,SPITEM_DEPT_NAME,DESCRIPTION)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table7](LOCAL_CURRENCY,BEGIN_DT,END_DT) INCLUDE(BASE_CURRENCY)

If you've exhausted all of your possible indexing options then you can always limit the number of processors your query uses by using the MAXDOP option at the base of your query. I'm not a fan of this because your systems can change over time and you'll have to keep an eye on it accordingly.

What would be interesting to see is if the HASH MATCH Actual Figures are massively different from the Estimated Figures, this might mean your statistics are out of date and your existing indexes need Rebuilding.

Upvotes: 0

Geovanny Hernandez
Geovanny Hernandez

Reputation: 237

The number of rows is huge and for me the main problem is that you don't have any where conditions, even if you have indexes but not predicate is a bad scenario. The number of degree parallelism is set up by your DBA, it indicates the threshold for the operator will work in a parallel or serial way.

https://technet.microsoft.com/es-es/library/ms181007%28v=sql.105%29.aspx?f=255&MSPPError=-2147217396

I recommend you review this free book of Grant Fritchey about Execution Plan: https://www.red-gate.com/library/sql-server-execution-plans-2nd-edition

However, my pieces of advises are:
1) Adding filter in the query (Where)
2) Review your indexes strategy

Upvotes: 1

Related Questions