ValYouW
ValYouW

Reputation: 749

TSQL Performance issues using DATEADD in where clause

I have a query using the DATEADD method which takes a lot of time. I'll try to simplify what we do. We are monitoring tempretures and every 5 minutes we store the highest temp and lowest temp in table A

Date          |  Time      | MaxTemp | MinTemp
2011-09-18 | 12:05:00 | 38.15        | 38.099
2011-09-18 | 12:10:00 | 38.20        | 38.10
2011-09-18 | 12:15:00 | 38.22        | 38.17
2011-09-18 | 12:20:00 | 38.21        | 38.20
...
2011-09-19 | 11:50:00 | 38.17        | 38.10
2011-09-19 | 12:55:00 | 38.32        | 38.27
2011-09-19 | 12:00:00 | 38.30        | 38.20

Date/Time columns are of type date/time (and not datetime)

In another table (Table B) we store some data for the entire day, where a day is from NOON (12PM) to noon (not midnight to midnight).

So table B columns include: Date (date only no time)
ShiftManager
MaxTemp (this is the max temp for the entire 24 hours starting at that date noon till next day noon)
MinTemp

I get table B with all the data and just need to update the MaxTemp and MinTemp using table A
For example:For 09/18/2011 I need the maximum temp reading that was between 09/18/2011 12PM and 09/19/2011 12PM.
In the TableA sample we have above, the returend result would be 38.32 as it is the MAX(MaxTemp) for the desired period.

The SQL I'm using:

update TableB
set MaxTemp = (
select MAX(HighTemp) from TableA
where
(Date=TableB.Date and Time > '12:00:00') 
or 
(Date=DATEADD(dd,1,TableB.Date) and Time <= '12:00:00')
)

And it takes a lot of time (if I remove the DATEADD method it is quick).

Here is a simplified sample that shows the data I have and the expected result:

DECLARE @TableA TABLE ([Date] DATE, [Time] TIME(0), HighTemp DECIMAL(6,2));
DECLARE @TableB TABLE ([Date] DATE, MaxTemp DECIMAL(6,2));

INSERT @TableA VALUES
('2011-09-18','12:05:00',38.15),
('2011-09-18','12:10:00',38.20),
('2011-09-18','12:15:00',38.22),
('2011-09-19','11:50:00',38.17),
('2011-09-19','11:55:00',38.32),
('2011-09-19','12:00:00',38.31),
('2011-09-19','12:05:00',38.33),
('2011-09-19','12:10:00',38.40),
('2011-09-19','12:15:00',38.12),
('2011-09-20','11:50:00',38.27),
('2011-09-20','11:55:00',38.42),
('2011-09-20','12:00:00',38.16);

INSERT @TableB VALUES
('2011-09-18', 0),
('2011-09-19', 0);

-- This is how I get the data, now I just need to update the max temp for each day

with TableB(d, maxt) as
(
select * from @TableB
)
update TableB
set maxt = ( 
select MAX(HighTemp) from @TableA 
where
(Date=TableB.d and Time > '12:00:00')
or
(Date=DATEADD(dd,1,TableB.d) and Time <= '12:00:00')
)

select * from @TableB

Hope I was able to explian myself, any ideas how can I do it differently? Thx!

Upvotes: 0

Views: 4091

Answers (3)

gbn
gbn

Reputation: 432271

Functions on column usually kill performance. So can OR.

However, I assume you want AND not OR because it is a range.

So, applying some logic and having just one calculation

update TableB
 set MaxTemp =
 (
    select MAX(HighTemp) from TableA
    where
    (Date + Time - 0.5 = TableB.Date)
 )

(Date + Time - 0.5) will change noon to noon to be midnight to midnight (0.5 = 12 hours). More importantly, you can make this a computed column and index it

More correctly, Date + Time - 0.5 is DATEADD(hour, -12, Date+Time) assuming Date and Time are real dates/times and not varchar...

Edit: this answer is wrong but I'll leave it up as "what not to do"

See this for more:

Upvotes: 3

paparazzo
paparazzo

Reputation: 45096

You may need to use -12 depending on date as start date or end date for the noon to noon internal.

    update tableA 
    set tableAx.MaxTemp = MAX(TableB.HighTemp)
    from tableA as tableAx
    join TableB 
    on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date) 
    group by tableAx.Date 

Because of the 12 hour offset not sure how much would would gain by putting TableB Date plus Time in a DateTime field directly. Cannot get away from the DATEADD and the output from a functions is not indexed even if the parameters going into the function are indexed. What you might be able to to is create a computed column that = date + time +/- 12h and index that column.

Like the recommendation from Arron to only update those without values.

    update tableA 
    set tableAx.MaxTemp = MAX(TableB.HighTemp)
    from tableA as tableAx
    join TableB 
    on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date) 
    where tableAx.MaxTemp is null 
    group by tableAx.Date

or an insert of new dates

    insert into tableA (date, MaxTemp) 
    select CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]), as Date) as [date] , MAX(TableB.HighTemp) as [MaxTemp]
    from tableA as tableAx
    right outer join TableB 
    on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date) 
    where TableB.Date is null 
    group by CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)

Upvotes: 0

anon
anon

Reputation:

This would probably be a lot easier if you used a single SMALLDATETIME column instead of separating this data into DATE/TIME columns. Also I'm assuming you are using SQL Server 2008 and not a previous version where you're storing DATE/TIME data as strings. Please specify the version of SQL Server and the actual data types being used.

DECLARE @d TABLE ([Date] DATE, [Time] TIME(0), MaxTemp DECIMAL(6,3), MinTemp DECIMAL(6,3));

INSERT @d VALUES
('2011-09-18','12:05:00',38.15,38.099),
('2011-09-18','12:10:00',38.20,38.10),
('2011-09-18','12:15:00',38.22,38.17),
('2011-09-18','12:20:00',38.21,38.20),
('2011-09-19','11:50:00',38.17,38.10),
('2011-09-19','12:55:00',38.32,38.27),
('2011-09-19','12:00:00',38.30,38.20);

SELECT '-- before update';
SELECT * FROM @d;

;WITH d(d,t,dtr,maxt) AS
(
    SELECT [Date], [Time], DATEADD(HOUR, -12, CONVERT(SMALLDATETIME, CONVERT(CHAR(8), 
        [Date], 112) + ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp FROM @d 
),
d2(dtr, maxt) AS 
(
    SELECT CONVERT([Date], dtr), MAX(maxt) FROM d
    GROUP BY CONVERT([Date], dtr)
)
UPDATE d SET maxt = d2.maxt FROM d
    INNER JOIN d2 ON d.dtr >= d2.dtr AND d.dtr < DATEADD(DAY, 1, d2.dtr);

SELECT '-- after update';
SELECT * FROM @d;

Results:

-- before update

2011-09-18  12:05:00    38.150  38.099
2011-09-18  12:10:00    38.200  38.100
2011-09-18  12:15:00    38.220  38.170
2011-09-18  12:20:00    38.210  38.200
2011-09-19  11:50:00    38.170  38.100
2011-09-19  12:55:00    38.320  38.270
2011-09-19  12:00:00    38.300  38.200

-- after update

2011-09-18  12:05:00    38.220  38.099
2011-09-18  12:10:00    38.220  38.100
2011-09-18  12:15:00    38.220  38.170
2011-09-18  12:20:00    38.220  38.200
2011-09-19  11:50:00    38.220  38.100
2011-09-19  12:55:00    38.320  38.270
2011-09-19  12:00:00    38.320  38.200

Presumably you want to update the MinTemp as well, and that would just be:

;WITH d(d,t,dtr,maxt,mint) AS
(
    SELECT [Date], [Time], DATEADD(HOUR, -12,
         CONVERT(SMALLDATETIME, CONVERT(CHAR(8), [Date], 112) 
         + ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp, MaxTemp
    FROM @d 
),
d2(dtr, maxt, mint) AS 
(
    SELECT CONVERT([Date], dtr), MAX(maxt), MIN(mint) FROM d
    GROUP BY CONVERT([Date], dtr)
)
UPDATE d
    SET maxt = d2.maxt, mint = d2.maxt
    FROM d
    INNER JOIN d2
        ON d.dtr >= d2.dtr
        AND d.dtr < DATEADD(DAY, 1, d2.dtr);

Now, this is not really better than your existing query, because it's still going to be using scans to figure out aggregates and all the rows that need to be updating. I'm not saying you should be updating the table at all, because this information can always be derived at query time, but if it is something you really want to do, I would combine the advice in these answers and consider revising the schema. For example, if the schema were:

USE [tempdb];
GO

CREATE TABLE dbo.d
(
    [Date] SMALLDATETIME, 
    MaxTemp DECIMAL(6,3), 
    MinTemp DECIMAL(6,3),
    RoundedDate AS (CONVERT(DATE, DATEADD(HOUR, -12, [Date]))) PERSISTED
);

CREATE INDEX rd ON dbo.d(RoundedDate);

INSERT dbo.d([Date],MaxTemp,MinTemp) VALUES
('2011-09-18 12:05:00',38.15,38.099),
('2011-09-18 12:10:00',38.20,38.10),
('2011-09-18 12:15:00',38.22,38.17),
('2011-09-18 12:20:00',38.21,38.20),
('2011-09-19 11:50:00',38.17,38.10),
('2011-09-19 12:55:00',38.32,38.27),
('2011-09-19 12:00:00',38.30,38.20);

Then your update is this simple, and the plan is much nicer:

;WITH g(RoundedDate,MaxTemp)
AS
(
    SELECT RoundedDate, MAX(MaxTemp)
        FROM dbo.d
        GROUP BY RoundedDate
)
UPDATE d
    SET MaxTemp = g.MaxTemp
    FROM dbo.d AS d
    INNER JOIN g
    ON d.RoundedDate = g.RoundedDate;

Finally, one of the reasons your existing query is probably taking so long is that you are updating all of time, every time. Is data from last week changing? Probably not. So why not limit the WHERE clause to recent data only? I see no need to go recalculate anything earlier than yesterday unless you are constantly receiving revised estimates of how warm it was last Tuesday at noon. So why are there no WHERE clauses on your current query, to limit the date range where it is attempting to do this work? Do you really want to update the WHOLE able, EVERY time? This is probably something you should only be doing once a day, sometime in the afternoon, to update yesterday. So whether it takes 2 seconds or 2.5 seconds shouldn't really matter.

Upvotes: 1

Related Questions