Reputation: 875
I wrote a fairly simple stored procedure that accepts some data, selects a value, inserts a couple records and then returns that value. But execution is too long in our production environment where I might eventually want it to run a few hundred thousand times in a day and it's adversely affecting other processes even when we're only running it say 30000 times.
I started by looking at the queries and adding an index on the date field that's used in where clauses. Then I ran SQL Server Profiler -- feeding the results into Tuning Advisor and implementing the indexing suggestions that it came up with. In the past, I've seen that tool call for really ugly indexes but this time it just wanted a single addition which made sense, so I added it. Each of those steps helped. But still too slow.
It was easy to figure out that the first query was the holdup, not the two inserts where are nearly instantaneous. So here's what I had at that time, including subquery run-times:
--all combined, this typically takes in the range of 1200-1500 ms but occasionally spikes up to ~2200 ms
select coalesce(
(
--when the following is run, this takes ~690 ms
select MIN(maxes.imbsn)
from (
--when the following is run without the higher limiting scopes, this takes ~3600 ms
select imbsn, MAX(assignmentDate) maxAD
from imbsnAssignments
group by imbsn
) maxes
where datediff(d, maxes.maxAD, GETDATE()) > 90
)
,
(
--this is rarely executed but takes ~0 ms when it is
select max(imbsn)+1 from imbsnAssignments
)
)
Based on those times, it seemed like the coalesce was mucking things up (this is a thing that I imagine I could verify with the execution plan if I had ever figured out how to read it, but I haven't -- plans remain largely opaque to my poor brain). To get rid of the coalesce, I changed up the query to:
--this runs ~480-700 ms
select MIN(maxes.imbsn)
from (
select imbsn, MAX(assignmentDate) maxAD
from imbsnAssignments
group by imbsn
union
select max(imbsn)+1, getDate()
from imbsnAssignments
) maxes
where datediff(d, maxes.maxAD, GETDATE()) > 90
which is a big improvement. But it's still pretty slow.
I verified that Profiler-Tuning Advisor still didn't want me to make any changes before coming here to ask you all about it. Which I think leaves me with two approaches: 1) maintain the basic algorithm but squeeze greater efficiency out of it or 2) switch to some smarter way of obtaining the same basic effect through methods that I don't know of but one will be obvious to one of you big-brains who sniff out that I'm engaged in some kind of anti-pattern here.
Thanks in advance for your time and attention!
I'm not precisely sure what the expected format for this additional information is, but I'll give it a try. The table is:
CREATE TABLE [dbo].[imbsnAssignments](
[id] [int] IDENTITY(1,1) NOT NULL,
[imbsn] [int] NOT NULL,
[assignmentDate] [date] NOT NULL,
[jobCode] [varchar](10) NOT NULL,
[name] [varchar](45) NOT NULL,
[a1] [varchar](45) NOT NULL,
[a2] [varchar](45) NOT NULL,
[a3] [varchar](45) NOT NULL,
[a4] [varchar](45) NOT NULL,
[city] [varchar](40) NOT NULL,
[state] [char](10) NOT NULL,
[zip] [varchar](10) NOT NULL,
[batchIdent] [varchar](256) NOT NULL,
CONSTRAINT [PK_imbsnAssignments] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
and I suspect this isn't the 'right' way to show the indexes but:
TableName IndexName IndexType
imbsnAssignments PK_imbsnAssignments CLUSTERED
imbsnAssignments IX_imbsnAssignments_assignmentDate NONCLUSTERED
imbsnAssignments _dta_index_imbsnAssignments_36_149575571__K2_3 NONCLUSTERED
Execution plan:
|--Stream Aggregate(DEFINE:([Expr1013]=MIN([partialagg1018])))
|--Concatenation
|--Stream Aggregate(DEFINE:([partialagg1018]=MIN([IMB].[dbo].[imbsnAssignments].[imbsn])))
| |--Filter(WHERE:([Expr1003]<dateadd(day,(-90),getdate())))
| |--Stream Aggregate(GROUP BY:([IMB].[dbo].[imbsnAssignments].[imbsn]) DEFINE:([Expr1003]=MAX([IMB].[dbo].[imbsnAssignments].[assignmentDate])))
| |--Index Scan(OBJECT:([IMB].[dbo].[imbsnAssignments].[_dta_index_imbsnAssignments_36_149575571__K2_3]), ORDERED FORWARD)
|--Stream Aggregate(DEFINE:([partialagg1018]=MIN([Expr1009])))
|--Compute Scalar(DEFINE:([Expr1009]=[Expr1008]+(1)))
|--Stream Aggregate(DEFINE:([Expr1008]=MAX([IMB].[dbo].[imbsnAssignments].[imbsn])))
|--Top(TOP EXPRESSION:((1)))
|--Index Scan(OBJECT:([IMB].[dbo].[imbsnAssignments].[_dta_index_imbsnAssignments_36_149575571__K2_3]), ORDERED BACKWARD)
Upvotes: 0
Views: 75
Reputation: 925
First I would suggest taking your where clause closer to the actual table in your second example, and I would check the execution plan to see what its doing, maybe calculate what 90 days back is check it, rather than DATEDIFF every single maxAD value. And just lets change the union to a union all so it doesn't have to check for uniqueness between the datasets.
select MIN(maxes.imbsn)
from (
select imbsn, MAX(assignmentDate) maxAD
from imbsnAssignments
group by imbsn
having max(assignmentdate) < dateadd(-90, d, GETDATE())
union all
select max(imbsn)+1, getDate()
from imbsnAssignments
) maxes
Upvotes: 1