Russell Steen
Russell Steen

Reputation: 6612

SQL Server, find an arbitrary sequence of values

Let's assume we have a table Maintenance

Customer LastLogin ActionType
1        12/1/2007 2
1        12/2/2007 2
etc.

We want a list of all customers who at any point during a given year had one or more uninterrupted sequences, 14 days long, of login with action type 2.

I can of course easily do this with code, and even have it be fairly quick over small sets. Is there a non-cursor way to do it in SQL?

Upvotes: 2

Views: 739

Answers (5)

Quassnoi
Quassnoi

Reputation: 425391

This will select all customers with at least two consecutive actions of the same type.

WITH    rows AS 
        (
        SELECT  customer, action,
                ROW_NUMBER() OVER (PARTITION BY customer ORDER BY lastlogin) AS rn
        FROM    mytable
        )
SELECT  DISTINCT customer
FROM    rows rp
WHERE   EXISTS
        (
        SELECT  NULL
        FROM    rows rl
        WHERE   rl.customer = rp.customer
                AND rl.rn = rp.rn + 1
                AND rl.action = rp.action
        )

Here's the more efficient query for just action 2:

WITH    rows AS 
        (
        SELECT  customer, ROW_NUMBER() OVER (PARTITION BY customer ORDER BY lastlogin) AS rn
        FROM    mytable
        WHERE   action = 2
        )
SELECT  DISTINCT customer
FROM    rows rp
WHERE   EXISTS
        (
        SELECT  NULL
        FROM    rows rl
        WHERE   rl.customer = rp.customer
                AND rl.rn = rp.rn + 1
        )

Update 2:

To select uninterrupted ranges:

WITH    rows AS 
        (
        SELECT  customer, action, lastlogin
                ROW_NUMBER() OVER (PARTITION BY customer ORDER BY lastlogin) AS rn
                ROW_NUMBER() OVER (PARTITION BY customer, action ORDER BY lastlogin) AS series
        FROM    mytable
        )
SELECT  DISTINCT customer
FROM    (
        SELECT  customer
        FROM    rows rp
        WHERE   action
        GROUP BY
                customer, actioncode, series - rn
        HAVING
                DETEDIFF(day, MIN(lastlogin), MAX(lastlogin)) >= 14
        ) q

This query calculates two series: one returns contiguous ORDER BY lastlogin, the second one partitions by action additionally:

action  logindate rn  series diff = rn - series
1       Jan 01    1   1      0
1       Jan 02    2   2      0
2       Jan 03    3   1      2
2       Jan 04    4   2      2
1       Jan 05    5   3      2
1       Jan 06    6   4      2

As long as the difference between the two schemes is the same, the series are uninterrupted. Each interruption breaks the series.

This means that the combination of (action, diff) defines the uninterrupted groups.

We can group by action, diff, find MAX and MIN within the groups and filter on them.

If you need to select 14 rows rather than 14 consecutive days, just filter on COUNT(*) instead of the DATEDIFF.

Upvotes: 5

OMG Ponies
OMG Ponies

Reputation: 332581

Use:

WITH dates AS (
  SELECT CAST('2007-01-01' AS DATETIME) 'date'
  UNION ALL
   SELECT DATEADD(dd, 1, t.date) 
     FROM dates t
    WHERE DATEADD(dd, 1, t.date) <= GETDATE())
   SELECT m.customer, 
          m.actiontype
     FROM dates d
LEFT JOIN MAINTENANCE m ON m.last_login = d.date
    WHERE m.last_login IS NULL

Upvotes: 1

Tom H
Tom H

Reputation: 47464

I'm going to assume that by a sequence you mean two or more rows with sequential datetime values with no other rows in between for the same user with a different action type. That being the case, this should give you what you're looking to get:

SELECT DISTINCT
     T1.customer
FROM
     Maintenance T1
INNER JOIN Maintenance T2 ON
     T2.customer = T1.customer AND
     T2.action_type = 2 AND
     T2.last_login > T1.last_login
LEFT OUTER JOIN Maintenance T3 ON
     T3.customer = T1.customer AND
     T3.last_login > T1.last_login AND
     T3.last_login < T2.last_login AND
     T3.action_type <> 2
WHERE
     T1.actiontype = 2 AND
     T3.customer IS NULL

The SQL does just what I said above - finds a row (T1) with another row after it (T2) both with action_type = 2 where there is no row in between (T3) with a different action type. The T3.customer IS NULL checks for NULL because if the column is NULL (I'm assuming that it's a NOT NULL column) then it means that the LEFT OUTER JOIN must not have found a row that matches the criteria.

Upvotes: 0

Raj More
Raj More

Reputation: 48016

EDIT: This would have worked for the original question about two in a row. 14 in a row is a different answer

First you need a sequence so you are going to use ROWNUMBER

You can do a SELF-JOIN Maintenace to itself using ROWNUMBER = ROWNUMBER + 1

Any two consequitive rows with the same customer id, and both rows with "2" ActionType are will give you the list CUSTOMER as your answer.

Try this

WITH Maintenance AS
(
SELECT 1 as Customer, CONVERT (DateTime, '1/1/2008') DateTimeStamp, 1 ActionType
UNION
SELECT 1, '3/1/2009', 1
UNION
SELECT 1, '3/1/2006', 2
UNION
SELECT 2, '3/1/2009', 1
UNION
SELECT 2, '3/1/2006', 2
)
,RowNumberMaintenance AS
(SELECT  ROW_NUMBER () OVER (ORDER BY Customer, DateTimeStamp)  AS RowNumber, *
FROM Maintenance)
SELECT m1.Customer
From RowNumberMaintenance M1
    INNER JOIN RowNumberMaintenance M2
        ON M1.Customer = M2.Customer
        AND M1.RowNumber = M2.RowNumber + 1
WHERE 1=1
        AND M1.ActionType <> 2
        AND M2.ActionType <> 2

Upvotes: 1

Jason
Jason

Reputation: 17956

select customerID, count(customerID)
from maintenance
where actiontype = 2
group by customerID
having count(customerID) >= 1

Upvotes: 0

Related Questions