Reputation: 7259
I have a table of play data that I'm using for a prototype. I'm generating the data while I'm at work, but when I leave and my machine goes to sleep, the data generation stops. This has cause large gaps in my collection of items.
I would like to be able to shift the values of each item in the DateTimeCreated
collumn of my table so that there isn't a gap of more than 10 minutes between any item and the next generated item.
The structure of the table is like this:
CREATE TABLE [dbo].[Items](
[Id] [uniqueidentifier] NOT NULL,
[DateTimeCreated] [datetimeoffset](7) NOT NULL,
[AuthorId] [uniqueidentifier] NOT NULL,
[Source] [varchar](max) NOT NULL,
[FullText] [varchar](max) NOT NULL,
CONSTRAINT [PK_Items] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
I was thinking about doing this in L2S, but I have over 1 million records, so IDK if that is the best solution (iterating over each item). I know there has to be some way to do this in SQL that will be much faster.
Upvotes: 3
Views: 293
Reputation: 4998
An alternative Ranking-Functions Approach (not 100% tested):
DECLARE @tenMinutes AS INT = 600;
WITH StartingPoints AS
(
SELECT DateTimeCreated, ROW_NUMBER() OVER(ORDER BY DateTimeCreated) AS rownum
FROM dbo.Items AS A
WHERE NOT EXISTS(
SELECT * FROM dbo.Items AS B
WHERE B.DateTimeCreated < A.DateTimeCreated
AND DATEDIFF(SECOND,B.DateTimeCreated, A.DateTimeCreated) BETWEEN 0 AND @tenMinutes
)
),
EndingPoints AS
(
SELECT DateTimeCreated, ROW_NUMBER() OVER(ORDER BY DateTimeCreated) AS rownum
FROM dbo.Items AS A
WHERE NOT EXISTS(
SELECT * FROM dbo.Items AS B
WHERE A.DateTimeCreated < B.DateTimeCreated
AND DATEDIFF(SECOND,A.DateTimeCreated, B.DateTimeCreated) BETWEEN 0 AND @tenMinutes
)
),
Islands AS
(
SELECT S.DateTimeCreated AS start_range,
E.DateTimeCreated AS end_range,
ROW_NUMBER() OVER(ORDER BY S.DateTimeCreated) AS row_num
FROM StartingPoints AS S
JOIN EndingPoints AS E on E.rownum = S.rownum
),
Ofs AS
(
SELECT I2.start_range,
I2.end_range,
I1.end_range AS prev,
DATEDIFF(SECOND, I1.end_range, I2.start_range) AS offset
FROM Islands AS I1
JOIN Islands AS I2 ON I2.row_num = I1.row_num + 1 OR I2.row_num IS NULL
),
CmlOfs AS
(
SELECT O1.start_range,
O1.end_range,
O1.prev,
O1.offset,
(SELECT SUM(O2.offset) FROM Ofs AS O2
WHERE O2.start_range <= O1.start_range) AS cum_offset
FROM Ofs AS O1
),
UpdateQ AS
(
SELECT Items.*, DATEADD(SECOND, -1 * CmlOfs.cum_offset, Items.DateTimeCreated) AS new_value
FROM Items
JOIN CmlOfs ON Items.DateTimeCreated BETWEEN CmlOfs.start_range AND CmlOfs.end_range
)
UPDATE UpdateQ
SET DateTimeCreated = new_value;
Upvotes: 1
Reputation: 86715
Make sure to have an index on DateTimeCreated
if you want this to be anything other than a pig.
It also assumes (as you said in your comment) there are few gaps compared to total number of records.
WITH
gap (Start,Finish)
AS
(
SELECT
DateTimeCreated,
(SELECT MIN(DateTimeCreated) FROM items AS lookup WHERE DateTimeCreated > DateTimeCreated)
FROM
items
WHERE
DATEADD(second, 600, DateTimeCreated) < (SELECT MIN(DateTimeCreated) FROM items AS lookup WHERE DateTimeCreated > DateTimeCreated)
UNION ALL
SELECT
MAX(DateTimeCreated),
MAX(DateTimeCreated)
FROM
items
)
,
offset (Start,Finish,Offset)
AS
(
SELECT
[current].Start,
(SELECT MIN(Start) FROM gap WHERE Start > [current].Start),
DATEDIFF(second, Start, Finish) - 600
FROM
gap AS [current]
)
,
cumulative_offset (Start,Finish,Offset)
AS
(
SELECT
[current].Start,
[current].Finish,
SUM([cumulative].Offset)
FROM
offset AS [current]
INNER JOIN
offset AS [cumulative]
ON [cumulative].Start <= [current].Start
)
UPDATE
items
FROM
cumulative_offset
SET
DateTimeCreated = DATEADD(second, -Offset, DateTimeCreated)
INNER JOIN
items
ON items.DateTimeCreated > cumulative.Start
AND items.DateTimeCreated <= cumulative.Finish
Upvotes: 1