Reputation: 211
I have a table that contains the status history (including current status) of a number of resources. It looks something like this:
CREATE TABLE [dbo].[RESOURCE_STATUS](
[id] [int] IDENTITY(1,1) NOT NULL,
[resource_id] [int] NOT NULL,
[date_timestamp] [datetime] NOT NULL,
[end_timestamp] [datetime] NULL,
...
)
And an example of the data ( obviously represents a valid timestamp):
+------+-------------+-----------------+---------------+
| id | resource_id | start_timestamp | end_timestamp |
+------+-------------+-----------------+---------------+
| 1 | 1 | <valid_ts> | <valid_ts> |
| 2 | 2 | <valid_ts> | <valid_ts> |
| 3 | 3 | <valid_ts> | NULL |
| 4 | 1 | <valid_ts> | NULL |
| 5 | 2 | <valid_ts> | NULL |
| 6 | 1 | <valid_ts> | <valid_ts> |
| 7 | 2 | <valid_ts> | NULL |
| 8 | 1 | <valid_ts> | NULL |
+------+-------------+-----------------+---------------+
There are, of course, additional columns representing the status etc but I don't think they are relevant at this point.
In theory, the start_timestamp and end_timestamp in each record are supposed to indicate the date and time of the start and end of each status with a NULL end_timestamp indicating that the status is ongoing (in this case rows 3, 7 and 8 indicate ongoing statuses).
The problem I have is that in some cases (rows 4 and 5 in the example) the end_timestamp hasn't been set and to get our reporting system working properly I need to go back and set that timestamp from the start_timestamp in the next record for that resource in the set if it exists. I.E. update row 4 from row 6 and row 5 from row 7. Rows 3, 7 and 8 shouldn't be modified since they represent the current state of the resource.
Note that the row missing the end_timestamp may not be the first row for that resource_id and there may be multiple rows for one or more resources that have incorrectly NULL end_timestamps.
I need to do this both for the existing data and on an ongoing basis when data is added to the table (I know the creator of the data should be fixed, but for various reasons that isn't on the table at this point).
In case it's relevant, we're using MS SQL Server 2008 and the table currently contains just over two million rows and obviously is growing on a daily basis.
Can anyone help me out with this please?
Upvotes: 0
Views: 39
Reputation: 896
Try this...
WITH AddRowNumber AS
(
SELECT
id
, resource_id
, start_timestamp
, end_timestamp
, ROW_NUMBER() OVER(PARTITION BY resource_id ORDER BY start_timestamp) AS RowNumber
FROM
@Resource_Status
)
, NewTimestamp AS
(
SELECT
A1.id
, A1.resource_id
, A1.start_timestamp
, A1.end_timestamp
, A2.start_timestamp AS NewEndTimeStamp
FROM
AddRowNumber AS A1
INNER JOIN
AddRowNumber AS A2
ON
A1.resource_id = A2.resource_id
AND A1.RowNumber = A2.RowNumber - 1
WHERE
A1.end_timestamp IS NULL
)
UPDATE
@Resource_Status
SET
end_timestamp = Nt.NewEndTimeStamp
FROM
@Resource_Status AS R
INNER JOIN
NewTimestamp AS NT
ON
R.id = NT.id
Let me know if that works.
Ash
Upvotes: 1