1.618
1.618

Reputation: 227

DATEDIFF() Between dates with different ROW_NUMBER

I have the following query:

SELECT 
   START_DATE
  ,ID
  ,USER
  ,ROW_NUMBER(PARTITION BY USER ORDER BY START_DATE) AS RN

FROM TABLE

Which brings the following results:

START_DATE   ID   USER   RN
2019-01-01   200  01     1
2019-01-10   450  01     2
2019-01-02   500  02     1

I'd like to just show users who have more than one Start Date (row number 1, 2, 3) and exclude users who just have a row_number of 1. Second, I need to show the DATEDIFF() between each Row_Number.

I thought to include ROW_NUMBER as a way to do this, but didn't know where to go past that. It might need a new solution. A sample of what that would look like is:

START_DATE  USER   datediff
2019-01-01  01     10

Upvotes: 0

Views: 2308

Answers (5)

Error_2646
Error_2646

Reputation: 3791

WITH TMP AS (
    SELECT 
       START_DATE
      ,ID
      ,USER
      ,LAG(START_DATE) OVER ( PARTITION BY USER ORDER BY START_DATE) AS LAST_START_DATE
      ,CASE WHEN MIN(START_DATE ) OVER ( PARTITION BY USER ) = 
                 MAX(START_DATE) OVER ( PARTITION BY USER) THEN 1
            ELSE 0
        END ExcludeIfOnlyNonUnique
      FROM TEST_DATA)
SELECT TMP.START_DATE,
       TMP.ID,
       TMP.USER,
       DATEDIFF(TMP.START_DATE, TMP.LAST_START_DATE) START_DATE_DIFF
  FROM TMP
 WHERE ExcludeIfOnlyNonUnique = 0;

Upvotes: 3

Gordon Linoff
Gordon Linoff

Reputation: 1269693

You don't want row_number(). You want count():

select t.*
from (select . . . ,
             count(*) over (partition by user) as cnt
      from t
     ) t
where cnt > 1;

You can include row_number() in the result set, of course, if that is still needed.

Upvotes: 0

RThomas
RThomas

Reputation: 10882

There are a number of ways to do this but one of the more elegant would be using the Microsoft SQL Server functions lead and lag. These functions allow you to access values from preceding and following rows in partition over approaches. Read up on them here:

Lead: https://learn.microsoft.com/en-us/sql/t-sql/functions/lead-transact-sql?view=sql-server-2017

Lag: https://learn.microsoft.com/en-us/sql/t-sql/functions/lag-transact-sql?view=sql-server-2017

Upvotes: 1

mkRabbani
mkRabbani

Reputation: 16908

try this below script-

WITH CTE(START_DATE, ID, [USER], RN)
AS
(
    SELECT 
    START_DATE
    ,ID
    ,USER
    ,ROW_NUMBER(PARTITION BY USER ORDER BY START_DATE) AS RN
)

SELECT MIN(START_DATE) [START_DATE], 
[USER],
DATEDIFF(DD,MIN(START_DATE),MAX(START_DATE))+1 [datediff]
FROM CTE
GROUP BY [USER]
HAVING MAX(RN) > 1

Upvotes: 0

Hogan
Hogan

Reputation: 70523

I would use a CTE and join to the query twice.

WITH MY_QUERY AS (
  SELECT 
    START_DATE, 
    ID, 
    USER,
    ROW_NUMBER(PARTITION BY USER ORDER BY START_DATE) AS RN
  FROM SOMETABLE
), MY_ID AS (
  SELECT DISTINCT USER FROM MY_QUERY
)
SELECT Q1.START_DATE, Q1.USER, DATEDIFF(Q1.START_DATE, Q2.START_DATE) AS DIFF
FROM MY_ID ID
JOIN MY_QUERY Q1 ON ID.USER = Q1.USER AND Q1.RN = 1
JOIN MY_QUERY Q2 ON ID.USER = Q2.USER AND Q2.RN = 2

Upvotes: 0

Related Questions