Mike Y
Mike Y

Reputation: 11

Using LAG function to obtain value that is not necessarily in the previous row

I have a question about the extent of the functionality of LAG. In the specific example I am dealing with I have a table similar to the following, although this is very simplified:

ID          Name             ItemID      VisitDate
----------- ---------------- ----------- ----------
316         Name,Test        9           2016-11-01
316         Name,Test        12          2016-11-01
316         Name,Test        89          2018-03-09
316         Name,Test        10          2018-03-09
316         Name,Test        1           2018-03-09

My goal was to use LAG to have an additional column that showed the Most Recent VisitDate before the current one. So what I would like to see is

ID          Name             ItemID      VisitDate  LagDate
----------- ---------------- ----------- ---------- ----------
316         Name,Test        9           2016-11-01 NULL
316         Name,Test        12          2016-11-01 NULL
316         Name,Test        89          2018-03-09 2016-11-01
316         Name,Test        10          2018-03-09 2016-11-01
316         Name,Test        1           2018-03-09 2016-11-01

However, I have not been able to find a way to get the table to show up this way. Initially I wrote the following code

SELECT
    TT.ID
   ,TT.Name
   ,TT.ItemID
   ,TT.VisitDate
   ,LAG(TT.VisitDate) OVER ( PARTITION BY TT.ID ORDER BY TT.VisitDate ) AS LagDate
FROM
    @TestTable AS TT;

However, that returned the following table:

ID          Name             ItemID      VisitDate  LagDate
----------- ---------------- ----------- ---------- ----------
316         Name,Test        9           2016-11-01 NULL
316         Name,Test        12          2016-11-01 2016-11-01
316         Name,Test        89          2018-03-09 2016-11-01
316         Name,Test        10          2018-03-09 2018-03-09
316         Name,Test        1           2018-03-09 2018-03-09

That's when I realized that the multiple ItemID values on a single date were causing an issue for me. So does anyone know of a way I can still use LAG on this table, but get the proper LagDate I am looking for on each row?

Upvotes: 1

Views: 3962

Answers (4)

hkravitz
hkravitz

Reputation: 1385

There is another parameter in LEAD/LAG functions and it is the offset. if not used then the default is 1.

LEAD ( scalar_expression [ ,offset ] , [ default ] )   
OVER ( [ partition_by_clause ] order_by_clause )

In your case you need to find the offset of how many rows to "go back" and you can do that by using an intermediate window calculation.

        ;WITH T (ID, Name, ItemID,VisitDate) AS
        (
        SELECT 316, 'Name,Test', 9, CAST('2016-11-01' AS DATE)
        UNION ALL
        SELECT 316, 'Name,Test', 12, CAST('2016-11-01' AS DATE)
        UNION ALL
        SELECT 316, 'Name,Test', 89, CAST('2018-03-09' AS DATE)
        UNION ALL
        SELECT 316, 'Name,Test', 10, CAST('2018-03-09' AS DATE)
        UNION ALL
        SELECT 316, 'Name,Test', 1, CAST('2018-03-09' AS DATE)
        )
    , [Intermediate] as 
       (
        SELECT * , COUNT(Id) OVER (PARTITION BY Id , VisitDate ORDER BY Id) LagVal
        FROM T
       ) 
    ,[Intermediate1] as
       (
        SELECT * , ISNULL(LAG(LagVal) OVER (PARTITION BY Id order by Id), LagVal) LagValFixed
        FROM [Intermediate]
       ) 
        SELECT * , LAG(VisitDate, LagValFixed) OVER (PARTITION BY Id ORDER BY VisitDate) PriorVisit
        FROM [Intermediate1]

As you can see the LagValFixed is being used in the last statement as the parameter to the offset value that the LAG function accepts.

Upvotes: 0

paparazzo
paparazzo

Reputation: 45106

I think this will do it

declare @t table(ID int, Name varchar(10), ItemID int,  VisitDate date);
insert into @t values 
       (316,         'Name,Test',        9,          '2016-11-01')
     , (316,         'Name,Test',       12,          '2016-11-01')
     , (316,         'Name,Test',       89,          '2018-03-09')
     , (316,         'Name,Test',       10,          '2018-03-09')
     , (316,         'Name,Test',        1,          '2018-03-09');
select *  
     , (select max(VisitDate) from @t td where td.VisitDate < t.VisitDate) as dd
from @t t 
order by t.VisitDate;

Upvotes: 0

KeithL
KeithL

Reputation: 5594

I got it using dense rank:

declare @t table (id int, _name varchar(1000),ItemID int, visitDte date)

insert into @t
values
(316,         'Name,Test',        9           ,'2016-11-01')
,(316,         'Name,Test',        12          ,'2016-11-01')
,(316,         'Name,Test' ,      89          ,'2018-03-09')
,(316,         'Name,Test',        10          ,'2018-03-09')
,(316,         'Name,Test',        1           ,'2018-03-09')

;with cte as
(
select *, dr= dense_rank() over (partition by ID order by visitDte) from @t
)
, CleanUp as (
select distinct ID, visitDte, dr
from cte)

select t.*
    ,Cleanup.visitDte PriorVisit
from cte t
left join CleanUp on t.id=CleanUp.id and t.dr-1=Cleanup.dr

Results:

id  _name   ItemID  visitDte    dr  PriorVisit
316 Name,Test   9   2016-11-01  1   NULL
316 Name,Test   12  2016-11-01  1   NULL
316 Name,Test   89  2018-03-09  2   2016-11-01
316 Name,Test   10  2018-03-09  2   2016-11-01
316 Name,Test   1   2018-03-09  2   2016-11-01

Upvotes: 0

Giorgos Betsos
Giorgos Betsos

Reputation: 72205

I can see no trivial way of solving this using window functions. The simplest thing I can think of is to use OUTER APPLY:

SELECT  TT.ID, TT.Name, TT.ItemID, TT.VisitDate, X.VisitDate
FROM mytable AS TT  
OUTER APPLY (
  SELECT TOP 1 VisitDate
  FROM mytable AS T
  WHERE T.ID = TT.ID AND T.VisitDate < TT.VisitDate
  ORDER BY T.VisitDate DESC) AS X

Demo here

Upvotes: 1

Related Questions