Reputation: 5
So, the original task was to create a report that identified the number of concurrent transactions for a given 5 min interval and provide a high-water mark week by week of the most concurrent transactoins. I've solved the problem but the query tends to run around 3-3.5hrs for 3 months of data. This is not really what I was hoping for.
For the purposes of this exercise, there are only two Columns I care about:
Transaction_Data.DateTime (Start of Transaction: e.g. 2018-01-01 23:59:59.999)
Transaction_Data.Duration (Integer seconds: e.g. 272)
The tricky part of this query is that I need to take DateTime + Duration and break it up into 5 min intervals. So, if the timestamp is 10:02:00 and the Duration is 715 seconds (ending at 10:13:55), then I need to count the transaction for the 10:00, 10:05, and 10:10 intervals.
To accomplish this, I use a CTE to populate a temporary table that has all the intervals for the range of time in question and then use a Cross Join query to place the transactions into their various intervals.
Here is the query:
DECLARE @Times Table (DateTime DateTime)
DECLARE @StartDate AS DATETIME
, @EndDate AS DATETIME
SET @StartDate = '2018-06-01'
SET @EndDate = '2018-08-31 23:59:59.999';
WITH DateIntervalsCTE AS
(
SELECT 0 i, @StartDate AS Date
UNION ALL
SELECT i + 5, DATEADD(minute, i, @StartDate )
FROM DateIntervalsCTE
WHERE DATEADD(minute, i, @StartDate ) < @EndDate
)
INSERT INTO @Times (DateTime)
SELECT DISTINCT Date
FROM DateIntervalsCTE
OPTION(MAXRECURSION 32767);
select Convert(varchar(10),DateAdd(day,-DatePart(weekday,IntData.DateTime)+1,Convert(varchar(10),IntData.DateTime,101)),101) as 'WeekOf'
, Max(Count) as 'MaxConcur'
from
(select t.DateTime
, count(t.DateTime) as 'Count'
from Transaction_Detail TD
cross join @Times t
where t.DateTime
between
DateAdd(ss,-(((DatePart(mi,TD.DateTime)%5)*60)+DatePart(ss,TD.DateTime)),DateAdd(ms,-DatePart(ms,TD.DateTime),TD.DateTime))
and DateAdd(ss,TD.Duration,TD.DateTime)
group by
t.DateTime) as IntData
group by
Convert(varchar(10),DateAdd(day,-DatePart(weekday,IntData.DateTime)+1,Convert(varchar(10),IntData.DateTime,101)),101)
With respect to my DateAdd stuff in the Where clause, I'm trying to round the start time down to the nearest 5 minute interval so that the CrossJoin will match for the starting interval.
With respect to the Data, I can't change anything about the structure as the Data is generated by Third Party application. In general, I consider the whole SQL Server to be Read-Only so I generally avoid creating any kind of static tables on it. Since I support a lot of these DBs for various clients, the ideal here is that the code can simply be pasted into a SSMS window and executed.
With respect to the performance issues, the CTE part itself runs fast enough, but the way the query runs I expect that comparing 26k (3-months) of 5-min intervals to 325k transaction records is the true root of my problem. What's 8.5 billion operations between friends, right?
In the interest of full disclosure, while I write a lot of t-sql and have for a lot of years, this is the first time I've made use of CTE and CrossJoins. It is entirely possible I screwed something up and haven't been able to detect it, but from what diagnostic stuff I've done, it appears to be reporting accurately, albeit slooowly.
What I'm hoping for with this request is for someone with more T-SQL knowledge then me to point out a better way to accomplish what I'm trying to accomplish that runs in terms of minutes rather than hours. While I wouldn't turn my nose up at a rewritten solution, I would be quite happy to simply be pointed in the direction of a better technique.
If you've read this far, thank you for your time.
-J.V.
Sample Input
DateTime Duration
2018-06-01 00:04:55.223 57
2018-06-01 00:04:56.223 58
2018-06-01 00:08:37.180 62
2018-06-01 00:08:37.180 62
2018-06-01 00:20:29.183 10
2018-06-01 00:28:38.423 0
2018-06-01 00:28:53.190 15
2018-06-01 00:31:52.690 195
2018-06-01 00:32:20.917 209
2018-06-01 00:32:54.690 756
Note: This is a very tiny sample of what the input looks like.
Sample Output
WeekOf MaxConcur
05/27/2018 101
06/03/2018 169
06/10/2018 189
06/17/2018 148
06/24/2018 186
07/01/2018 218
07/08/2018 222
07/15/2018 210
07/22/2018 219
07/29/2018 225
08/05/2018 243
08/12/2018 231
08/19/2018 253
08/26/2018 220
Final Solution
Firstly, thanks to everyone who replied. This was really awesome for me and I learned some interesting ideas on how to solve SQL problems. In particular thanks to KumarHarsh for bringing me close enough to get to a final resolution that brings up the data in 15s, which is far faster than I hoped for. Here is the final query (apologies if I'm not doing this right but I felt the final answer needed to be shared):
-- Set Up Time Range
declare @minDate DateTime='2018-06-01 00:00:00.000'
declare @maxDate DateTime='2018-08-31 23:59:59.999'
-- Build Temporary Interval Table
create table #TimesTable (
[DateTime] DateTime not null
, [DateCol] Date not null
)
-- Populate Interval Table (5min Intervals)
insert into #TimesTable
select dateadd(minute,(RowNum-1)*5,@minDate) as 'DateTime'
, cast(dateadd(minute,(RowNum-1)*5,@minDate) as Date) as 'Date'
from (
select ROW_NUMBER()over(order by (select null)) as 'RowNum'
from master..spt_values a
CROSS JOIN master..spt_values b
) as TT
where cast(dateadd(minute,RowNum*5,@minDate) as DateTime) < @maxDate
-- Build Table Indexes
create clustered index ix_datecol on #TimesTable ([DateTime],[DateCol])
-- Query the Data
select Convert(varchar(10),DateAdd(day,-DatePart(weekday,IntData.DateTime)+1,Convert(varchar(10),IntData.DateTime,101)),101) as 'WeekOf'
, Max(Count) as 'MaxConcur'
from (
select TData.DateTime
, TData.[DateCol]
, Count(TData.DateTime) as 'Count'
from dbo.Transaction_Detail TD
outer apply (
select TT.DateTime
, TT.[DateCol]
from #TimesTable as TT
where TT.DateTime
between
TD.DateTime and DateAdd(ss,TD.Duration,TD.DateTime)
) as TData
group by TData.DateTime,TData.[DateCol]
) as IntData
where Convert(varchar(10),DateAdd(day,-DatePart(weekday,IntData.DateTime)+1,Convert(varchar(10),IntData.DateTime,101)),101) is not null
group by Convert(varchar(10),DateAdd(day,-DatePart(weekday,IntData.DateTime)+1,Convert(varchar(10),IntData.DateTime,101)),101)
order by 'WeekOf'
drop table #TimesTable
From Kumar's example I had to make a few changes:
Upvotes: 0
Views: 632
Reputation: 5094
Regarding your script,
Why use distinct in CTE,remove it or your CTE is bad
Do not use table variable,use temp table
Group By "Convert(varchar(10),DateAdd(day,-DatePart(weekday,IntData.DateTime)+1,Convert(varchar(10),IntData.DateTime,101)),101)
"
Group by can be replace with ,notice extra () around , (DatePart(weekday,getdate())+1)
select DateAdd(day,-(DatePart(weekday,getdate())+1),Convert(varchar(10),getdate(),101))
select DateAdd(day,-(DatePart(weekday,getdate())+1),cast(getdate() as date))
I am not sure but it semm that this part is long and wrong.
Do only this change and check.
My way,
First create and time table table. This is One time time table creation.
You can create in whichever way you want
declare @minDate Datetime='2005-01-01'
create table TimesTable ([DateTime] DateTime not null ,[DateCol] Date not null)
insert into TimesTable
select dateadd(minute,rn*5,@minDate),cast(dateadd(minute,rn*5,@minDate) as date)
from
(
select a.number, ROW_NUMBER()over(order by (select null))rn
from master..spt_values a
CROSS JOIN master..spt_values b
)t4
Create clustered index ix_datecol on TimesTable ([DateTime],[DateCol])
-- where '2005-01-01' is any min value choose min value according to your requirement
Index can be change if it is not working and script showing some improvement.
DECLARE @StartDate AS DATETIME
, @EndDate AS DATETIME
SET @StartDate = '2018-06-01'
SET @EndDate = '2018-08-31 23:59:59.999';
select Convert(varchar(10),DateAdd(day,-DatePart(weekday,IntData.DateTime)+1,Convert(varchar(10),IntData.DateTime,101)),101) as 'WeekOf'
, Max(Count) as 'MaxConcur'
from
(select t.DateTime,t.[DateCol]
, count(t.DateTime) as 'Count'
from dbo.Transaction_Detail TD
outer apply(select t.DateTime,t.[DateCol] from TimesTable t
where t.DateTime
between
DateAdd(ss,-(((DatePart(mi,TD.DateTime)%5)*60)+DatePart(ss,TD.DateTime)),DateAdd(ms,-DatePart(ms,TD.DateTime),TD.DateTime))
and DateAdd(ss,TD.Duration,TD.DateTime)
)t
group by
t.DateTime,t.[DateCol] ) as IntData
group by
t.[DateCol]
-- In place of cross join ,try OUTER APPLY once
My script will throw error or give incorrect output in the beginning.But I am sure it can be corrected.
Understand the idea behind t.[DateCol] and tune the query accordingly.
let me know the performance if it is working.
Upvotes: 1
Reputation: 5157
Try to pre-compute everything in your CTE. e.g. [WeekOf]
value select Convert(varchar(10),DateAdd...
should come from CTE/table variable
Avoid/minimise computations in GROUP BY
and WHERE
e.g where t.DateTime between DateAdd ....
should be just a simple range condition.
You don't need a CROSS JOIN
DECLARE @Times Table (IntervalStart DateTime, IntervalEnd DateTime, [WeekOf] DATETIME)
DECLARE @StartDate AS DATETIME
, @EndDate AS DATETIME
SET @StartDate = '2018-06-01'
SET @EndDate = '2018-08-31 23:59:59.999';
WITH DateIntervalsCTE AS
(
SELECT 0 i, @StartDate AS Date
UNION ALL
SELECT i + 5, DATEADD(minute, i, @StartDate )
FROM DateIntervalsCTE
WHERE DATEADD(minute, i, @StartDate ) < @EndDate
)
INSERT INTO @Times (IntervalStart, IntervalEnd, [WeekOf])
SELECT Date, DATEADD(minute, 5, Date ), Convert(varchar(10),DateAdd(day,-DatePart(weekday,Date)+1,Convert(varchar(10),Date,101)),101)
FROM DateIntervalsCTE
OPTION(MAXRECURSION 32767);
SELECT [WeekOf], MAX( [Count] ) AS 'MaxConcur'
FROM(
SELECT t.IntervalStart, COUNT(t.IntervalStart) AS [Count], [WeekOf]
FROM Transaction_Detail AS TD
INNER join @Times AS t ON t.IntervalStart <= TD.DateTime AND DATEADD( ss, TD.Duration, TD.DateTime ) < t.IntervalEnd
GROUP BY [WeekOf], t.IntervalStart ) AS IntData
GROUP BY [WeekOf]
I have added PeriodEnd
column to simplify join condition (see point 2.)
I have added WeekOf
column computation to CTE (see point 1.)
If there is an index on Transaction_Data.DateTime
you can try adding a WHERE
clause in the nested SELECT
e.g. WHERE @StartDate <= TD.DateTime AND TD.DateTime <= @EndDate
to reduce the number of transaction records searched.
In my experience this query should not take more than 20 min with reasonable hardware.
Try smaller range of @StartDate
and @EndDate
to test
Upvotes: 0