Reputation: 11
I am trying to break the below query from past 3 days but not able to modify it at all as group by clause is there at the end. the query is getting executed in 1 hour returning 30K records, Table a contains tens of millions of records. Indexes are all in place for all tables. Below is the masked query:
DECLARE @Variable1 VARCHAR(6) = '199101'
SELECT
A.Col1,
'A' as Col2_Name,
A.Col3,
B.Col_05,
A.Col4,
C.Col_01 AS Col_01_Name,
SUBSTRING(A.Col5,1,4) AS Col5_Name,
D.Col_12,
A.Col6 AS Col6,
A.Col7,
SUBSTRING(A.Col8,1,4) AS Col8_Name,
A.Col9,
E.Col_01,
@Variable1 as 'ACC_FROM'
,@Variable1 AS 'ACC_TO'
,'' AS [Start_DATE],
'' AS [End_DATE],
@Variable1 AS 'PERIOD',
@Variable1 as Var_Name,
A.Col10,
A.Col11,
SUM(A.Col12) AS 'A1',
0 AS 'B1',
0 AS 'C1',
0 AS 'D1',
0 AS 'E1',
0 AS 'F1',
SUM(A.Col12) AS 'TOTAL'
FROM TableA A
LEFT JOIN TableB B ON A.Col3 = B.Col_3 and B.Col_11=A.Col11 AND B.Col_13=A.Col13 AND A.Col1 = B.Col1 and A.Col7 = B.Col_7
LEFT JOIN TableC C ON A.Col4 = LTRIM(RTRIM(C.Col_04))
LEFT JOIN TableD D ON SUBSTRING(A.Col5,1,4) = SUBSTRING(D.Col_05,1,4) and A.Col1 = D.Col_01 and A.Col7 = D.Col_07
LEFT JOIN TableE E ON A.Col9 = E.Col_09
WHERE A.Col14 = 2
AND A.Col15 = 'C'
AND A.Col16 <= @Variable1
GROUP BY
A.Col1,
A.Col3,
B.Col_05,
A.Col4,
C.Col_01,
SUBSTRING(A.Col5,1,4),
D.Col_12,
A.Col6,
A.Col7,
SUBSTRING(A.Col8,1,4),
A.Col9,
E.Col_01,
A.Col10,
A.Col11
HAVING SUM(A.Col12)<>0
Can anyone please help me in breaking the query which can reduce the execution time?
Upvotes: 1
Views: 1453
Reputation: 2220
You've received quite a few comments with good suggestions. I'll address a few of them with more detail.
Execution Plans If you're generally stuck as a read-only user (like me in my organization), then obtaining query execution plans (and understanding them!) can be difficult. To work around your permissions level, the easiest thing to do is obtain a copy of your database and install it on your local machine where you have full permissions. There you can tweak your query and try different things. Keeping a copy of your database(s) on your machine (and keeping them secure, I might add) will allow you to remove the effect of any network issues that may be affecting your query performance in inconsistent ways.
Search ARGuments One of the comments mentioned non-SARGable predicates. You can find more information on these here. You would not believe the effect that following this article's guidelines can have on the performance of your queries!
Indexes In terms of indexing, you said all the appropriate indexes are in place. If that's true and you know your indexes can't get any better than they are, ignore the rest of this paragraph. If you're not 100% sure everything you need is there, take a look at this as it has some very dry but useful information on indexes. Here's a bit of an easier read (but less comprehensive) on the same topic.
Specific Suggestions Looking at your query, you're grouping by and joining to the results of expressions. Doing that can be very taxing, especially when there are millions of records involved. In these kinds of scenarios, it's good to start by filtering out as many of your undesirable records as possible, before you start applying your more specific filters that require large amounts of processing power. The fix? Take some of your simpler joins and use those in a temporary table to start off with a smaller set of records to filter. In your temporary table(s), you can also create fields that represent the results of these expressions you're using in your grouping and join statements. For instance, you could have a trimmed copy of col_04 (i.e., col_04_trimmed) in a temporary table as well as the first four characters of Col5 (Col5_first_4). Temporary tables take a little more upfront processing to build, but once their built you can often query and work with their contents much faster than you can a ton of data being manipulated on the fly. When you're done with the query, drop your temporary tables and go about your business.
Also, when you have a variable that will always represent a number, store it as a numeric (integer, float, whatever) variable. It will save a little time on the implicit conversions, especially across millions of rows.
Finally, you say
Having SUM(A.Col12)<>0
Instead try
Having SUM(A.Col12) > 0
I hope this helps!
Upvotes: 2