Hooligancat
Hooligancat

Reputation: 3798

SELECT multiple rows where date Is greater than X minutes of previous row

I have a need to SELECT all the rows from a table where the selected rows are greater than the datetime of the previously selected row by a given constant number of minutes. An example probably speaks best.

The following represents the table of data - we will call it myTable.

guid     fkGuid   myDate
-------  -------  ---------------------
1        100      2013-01-10 11:00:00.0
2        100      2013-01-10 11:05:00.0
3        100      2013-01-10 11:10:00.0
4        100      2013-01-10 11:15:00.0
5        100      2013-01-10 11:20:00.0
6        100      2013-01-10 11:25:00.0
7        100      2013-01-10 11:30:00.0
8        100      2013-01-10 11:35:00.0
9        100      2013-01-10 11:40:00.0
10       100      2013-01-10 11:50:00.0
11       100      2013-01-10 11:55:00.0

What I want to do is provide a constant increment (say 10 minutes) and get back all the rows from the first that are 10 minutes or more from the previous row. So, with 10 minutes the result set should look like this:

guid     myDate
-------  ---------------------
1        2013-01-10 11:00:00.0
3        2013-01-10 11:10:00.0
5        2013-01-10 11:20:00.0
7        2013-01-10 11:30:00.0
9        2013-01-10 11:40:00.0
11       2013-01-10 11:55:00.0

The constant is passed in as a variable so it could be anything. Let's say it was 23 minutes, then the result set should look like this:

guid     myDate
-------  ---------------------
1        2013-01-10 11:00:00.0
6        2013-01-10 11:25:00.0
10       2013-01-10 11:50:00.0

The last example shows that I start at row 0's time (11:00:00) add 23 minutes and get the next >= row which is 11:25:00, add 23 minutes to the new row's time and then get the next (11:50:00) and so on.

I have tried doing this with a CTE but although I can quite easily get back all my times or none of them, I can't seem to figure how to get the rows I need. My current test code using 23 minutes hard coded into the WHERE clause:

WITH myCTE AS
(
    SELECT guid,
           myDate,
           ROW_NUMBER() OVER (PARTITION BY guid ORDER BY myDate ASC) AS rowNum
    FROM myTable
    WHERE fkGuid = 100
)

SELECT currentRow.guid, currentRow.myDate
FROM myCTE AS currentRow
LEFT OUTER JOIN
    myCTE AS previousRow
    ON currentRow.guid = previousRow.guid
    AND currentRow.rowNum = previousRow.rowNum + 1
WHERE
    currentRow.myDate > DATEADD(minute, 23, previousRow.myDate)
ORDER BY
    currentRow.myDate ASC

This returns nothing. If I omit the WHERE clause I get all rows back (obviously because I'm not filtering).

What am I missing?

Any and all help would be very much appreciated as it always is!

Upvotes: 2

Views: 2658

Answers (2)

gilly3
gilly3

Reputation: 91707

First, your join will never return any rows, regardless of the where clause. Guid and rowNum are both unique keys per row, so if the guid is the same, so will be the rowNum. You can see that the join always fails by adding a field from previousRow to your select list and running your query without the where clause.

Next, joining on rowNum + 1 prevents skipping rows. You will only select adjacent rows that satisfy the date filter.

There may be some SQL voodoo with recursive queries that will make this work, but there will be a huge performance hit. Filter the data in your application code. Eg, in C#:

List<DataRow> FilterByInterval(IEnumerable<DataRow> rows, string dateColumn, int minutes)
{
    List<DataRow> filteredRows = new List<DataRow>();
    DateTime lastDate = DateTime.MinValue;
    foreach (DataRow row in rows)
    {
        DateTime dt = row.Field<DateTime>(dateColumn);
        TimeSpan diff = dt - lastDate;
        if (diff.TotalMinutes >= minutes)
        {
            filteredRows.Add(row);
            lastDate = dt;
        }
    }
    return rows;
}

Upvotes: 0

Dale M
Dale M

Reputation: 2473

@gilly3, hardly SQL voodoo

WITH CTE
AS
(
  SELECT TOP 1
         guid 
        ,fkGuid
        ,myDate
        ,ROW_NUMBER() OVER (ORDER BY myDate) RowNum
  FROM MyTable
  UNION ALL
  SELECT mt.guid
        ,mt.fkGuid
        ,mt.myDate
        ,ROW_NUMBER() OVER (ORDER BY mt.myDate)
  FROM  MyTable mt
        INNER JOIN
        CTE ON mt.myDate>=DATEADD(minute,23,CTE.myDate)
  WHERE RowNum=1
)
SELECT guid
       ,fkGuid
       ,myDate
FROM   CTE
WHERE  RowNum=1

The SQL Fiddle is here

Upvotes: 2

Related Questions