Reputation: 749
I have a query using the DATEADD method which takes a lot of time. I'll try to simplify what we do. We are monitoring tempretures and every 5 minutes we store the highest temp and lowest temp in table A
Date | Time | MaxTemp | MinTemp
2011-09-18 | 12:05:00 | 38.15 | 38.099
2011-09-18 | 12:10:00 | 38.20 | 38.10
2011-09-18 | 12:15:00 | 38.22 | 38.17
2011-09-18 | 12:20:00 | 38.21 | 38.20
...
2011-09-19 | 11:50:00 | 38.17 | 38.10
2011-09-19 | 12:55:00 | 38.32 | 38.27
2011-09-19 | 12:00:00 | 38.30 | 38.20
Date/Time columns are of type date/time (and not datetime)
In another table (Table B) we store some data for the entire day, where a day is from NOON (12PM) to noon (not midnight to midnight).
So table B columns include:
Date (date only no time)
ShiftManager
MaxTemp (this is the max temp for the entire 24 hours starting at that date noon till next day noon)
MinTemp
I get table B with all the data and just need to update the MaxTemp and MinTemp using table A
For example:For 09/18/2011 I need the maximum temp reading that was between 09/18/2011 12PM and 09/19/2011 12PM.
In the TableA sample we have above, the returend result would be 38.32 as it is the MAX(MaxTemp) for the desired period.
The SQL I'm using:
update TableB
set MaxTemp = (
select MAX(HighTemp) from TableA
where
(Date=TableB.Date and Time > '12:00:00')
or
(Date=DATEADD(dd,1,TableB.Date) and Time <= '12:00:00')
)
And it takes a lot of time (if I remove the DATEADD method it is quick).
Here is a simplified sample that shows the data I have and the expected result:
DECLARE @TableA TABLE ([Date] DATE, [Time] TIME(0), HighTemp DECIMAL(6,2));
DECLARE @TableB TABLE ([Date] DATE, MaxTemp DECIMAL(6,2));
INSERT @TableA VALUES
('2011-09-18','12:05:00',38.15),
('2011-09-18','12:10:00',38.20),
('2011-09-18','12:15:00',38.22),
('2011-09-19','11:50:00',38.17),
('2011-09-19','11:55:00',38.32),
('2011-09-19','12:00:00',38.31),
('2011-09-19','12:05:00',38.33),
('2011-09-19','12:10:00',38.40),
('2011-09-19','12:15:00',38.12),
('2011-09-20','11:50:00',38.27),
('2011-09-20','11:55:00',38.42),
('2011-09-20','12:00:00',38.16);
INSERT @TableB VALUES
('2011-09-18', 0),
('2011-09-19', 0);
-- This is how I get the data, now I just need to update the max temp for each day
with TableB(d, maxt) as
(
select * from @TableB
)
update TableB
set maxt = (
select MAX(HighTemp) from @TableA
where
(Date=TableB.d and Time > '12:00:00')
or
(Date=DATEADD(dd,1,TableB.d) and Time <= '12:00:00')
)
select * from @TableB
Hope I was able to explian myself, any ideas how can I do it differently? Thx!
Upvotes: 0
Views: 4091
Reputation: 432271
Functions on column usually kill performance. So can OR.
However, I assume you want AND not OR because it is a range.
So, applying some logic and having just one calculation
update TableB
set MaxTemp =
(
select MAX(HighTemp) from TableA
where
(Date + Time - 0.5 = TableB.Date)
)
(Date + Time - 0.5)
will change noon to noon to be midnight to midnight (0.5 = 12 hours). More importantly, you can make this a computed column and index it
More correctly, Date + Time - 0.5
is DATEADD(hour, -12, Date+Time)
assuming Date
and Time
are real dates/times and not varchar...
Edit: this answer is wrong but I'll leave it up as "what not to do"
See this for more:
Upvotes: 3
Reputation: 45096
You may need to use -12 depending on date as start date or end date for the noon to noon internal.
update tableA
set tableAx.MaxTemp = MAX(TableB.HighTemp)
from tableA as tableAx
join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
group by tableAx.Date
Because of the 12 hour offset not sure how much would would gain by putting TableB Date plus Time in a DateTime field directly. Cannot get away from the DATEADD and the output from a functions is not indexed even if the parameters going into the function are indexed. What you might be able to to is create a computed column that = date + time +/- 12h and index that column.
Like the recommendation from Arron to only update those without values.
update tableA
set tableAx.MaxTemp = MAX(TableB.HighTemp)
from tableA as tableAx
join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
where tableAx.MaxTemp is null
group by tableAx.Date
or an insert of new dates
insert into tableA (date, MaxTemp)
select CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]), as Date) as [date] , MAX(TableB.HighTemp) as [MaxTemp]
from tableA as tableAx
right outer join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
where TableB.Date is null
group by CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
Upvotes: 0
Reputation:
This would probably be a lot easier if you used a single SMALLDATETIME
column instead of separating this data into DATE
/TIME
columns. Also I'm assuming you are using SQL Server 2008 and not a previous version where you're storing DATE
/TIME
data as strings. Please specify the version of SQL Server and the actual data types being used.
DECLARE @d TABLE ([Date] DATE, [Time] TIME(0), MaxTemp DECIMAL(6,3), MinTemp DECIMAL(6,3));
INSERT @d VALUES
('2011-09-18','12:05:00',38.15,38.099),
('2011-09-18','12:10:00',38.20,38.10),
('2011-09-18','12:15:00',38.22,38.17),
('2011-09-18','12:20:00',38.21,38.20),
('2011-09-19','11:50:00',38.17,38.10),
('2011-09-19','12:55:00',38.32,38.27),
('2011-09-19','12:00:00',38.30,38.20);
SELECT '-- before update';
SELECT * FROM @d;
;WITH d(d,t,dtr,maxt) AS
(
SELECT [Date], [Time], DATEADD(HOUR, -12, CONVERT(SMALLDATETIME, CONVERT(CHAR(8),
[Date], 112) + ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp FROM @d
),
d2(dtr, maxt) AS
(
SELECT CONVERT([Date], dtr), MAX(maxt) FROM d
GROUP BY CONVERT([Date], dtr)
)
UPDATE d SET maxt = d2.maxt FROM d
INNER JOIN d2 ON d.dtr >= d2.dtr AND d.dtr < DATEADD(DAY, 1, d2.dtr);
SELECT '-- after update';
SELECT * FROM @d;
Results:
-- before update
2011-09-18 12:05:00 38.150 38.099
2011-09-18 12:10:00 38.200 38.100
2011-09-18 12:15:00 38.220 38.170
2011-09-18 12:20:00 38.210 38.200
2011-09-19 11:50:00 38.170 38.100
2011-09-19 12:55:00 38.320 38.270
2011-09-19 12:00:00 38.300 38.200
-- after update
2011-09-18 12:05:00 38.220 38.099
2011-09-18 12:10:00 38.220 38.100
2011-09-18 12:15:00 38.220 38.170
2011-09-18 12:20:00 38.220 38.200
2011-09-19 11:50:00 38.220 38.100
2011-09-19 12:55:00 38.320 38.270
2011-09-19 12:00:00 38.320 38.200
Presumably you want to update the MinTemp as well, and that would just be:
;WITH d(d,t,dtr,maxt,mint) AS
(
SELECT [Date], [Time], DATEADD(HOUR, -12,
CONVERT(SMALLDATETIME, CONVERT(CHAR(8), [Date], 112)
+ ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp, MaxTemp
FROM @d
),
d2(dtr, maxt, mint) AS
(
SELECT CONVERT([Date], dtr), MAX(maxt), MIN(mint) FROM d
GROUP BY CONVERT([Date], dtr)
)
UPDATE d
SET maxt = d2.maxt, mint = d2.maxt
FROM d
INNER JOIN d2
ON d.dtr >= d2.dtr
AND d.dtr < DATEADD(DAY, 1, d2.dtr);
Now, this is not really better than your existing query, because it's still going to be using scans to figure out aggregates and all the rows that need to be updating. I'm not saying you should be updating the table at all, because this information can always be derived at query time, but if it is something you really want to do, I would combine the advice in these answers and consider revising the schema. For example, if the schema were:
USE [tempdb];
GO
CREATE TABLE dbo.d
(
[Date] SMALLDATETIME,
MaxTemp DECIMAL(6,3),
MinTemp DECIMAL(6,3),
RoundedDate AS (CONVERT(DATE, DATEADD(HOUR, -12, [Date]))) PERSISTED
);
CREATE INDEX rd ON dbo.d(RoundedDate);
INSERT dbo.d([Date],MaxTemp,MinTemp) VALUES
('2011-09-18 12:05:00',38.15,38.099),
('2011-09-18 12:10:00',38.20,38.10),
('2011-09-18 12:15:00',38.22,38.17),
('2011-09-18 12:20:00',38.21,38.20),
('2011-09-19 11:50:00',38.17,38.10),
('2011-09-19 12:55:00',38.32,38.27),
('2011-09-19 12:00:00',38.30,38.20);
Then your update is this simple, and the plan is much nicer:
;WITH g(RoundedDate,MaxTemp)
AS
(
SELECT RoundedDate, MAX(MaxTemp)
FROM dbo.d
GROUP BY RoundedDate
)
UPDATE d
SET MaxTemp = g.MaxTemp
FROM dbo.d AS d
INNER JOIN g
ON d.RoundedDate = g.RoundedDate;
Finally, one of the reasons your existing query is probably taking so long is that you are updating all of time, every time. Is data from last week changing? Probably not. So why not limit the WHERE
clause to recent data only? I see no need to go recalculate anything earlier than yesterday unless you are constantly receiving revised estimates of how warm it was last Tuesday at noon. So why are there no WHERE clauses on your current query, to limit the date range where it is attempting to do this work? Do you really want to update the WHOLE able, EVERY time? This is probably something you should only be doing once a day, sometime in the afternoon, to update yesterday. So whether it takes 2 seconds or 2.5 seconds shouldn't really matter.
Upvotes: 1