PowderSnorkel
PowderSnorkel

Reputation: 87

TSQL OVER (PARTITION BY ... )

For my next trick, I would like to select only the most recent event for each client. Instead of four events for 000017 I want one.

OK  c_id    e_date  e_ser   e_att   e_recip Age c_cm    e_staff rn
--> 000017  2013-04-02 00:00:00.000 122 1   1   36  90510   90510   15
--> 000017  2013-02-26 00:00:00.000 122 1   1   36  90510   90510   20
--> 000017  2013-02-12 00:00:00.000 122 1   1   36  90510   90510   24
--> 000017  2013-01-29 00:00:00.000 122 1   1   36  90510   90510   27
--> 000188  2012-11-02 00:00:00.000 160 1   1   31  1289    1289    44
--> 001713  2013-10-01 00:00:00.000 142 1   1   26  2539    2539    1
--> 002531  2013-07-12 00:00:00.000 190 1   1   61  1689    1689    21
--> 002531  2013-06-14 00:00:00.000 190 1   1   61  1689    1689    30
--> 002531  2013-06-07 00:00:00.000 190 1   1   61  1689    1689    31
--> 002531  2013-05-28 00:00:00.000 122 1   1   61  1689    1689    33

Here is the query that got me to this stage (perhaps you have some suggestions to improve this as well, the extra nested query creating t2 table is probably excessive.) Thank you all!!!

SELECT TOP(10)*
FROM (

  SELECT *
  FROM (

    SELECT (SELECT CASE WHEN 
     (e_att IN (1,2)
     AND e_date > DATEADD(month, -12, getdate())
     AND e_ser NOT IN (100,115)
     AND e_recip NOT IN ('2','7')
     AND (( (e_recip = '3') AND (DATEDIFF(Year, c_bd, GetDate())>10) ) OR (e_recip <> '3') )
     AND c_cm = e_staff)
     THEN '-->'
     WHEN 1=1 THEN ''
     END
     ) AS 'OK'
     ,c_id, e_date, e_ser, e_att, e_recip, DATEDIFF(Year, c_bd, GetDate()) AS 'Age', c_cm, e_staff
     ,row_number() OVER (PARTITION BY c_id ORDER BY e_date DESC) rn             
    FROM events INNER JOIN client ON e_case_no = c_id
    LEFT OUTER JOIN doc ON doc.doc_dbid = client.c_id
    WHERE client.c_id IN ( /* confidential query */ )
    AND e_date > DATEADD(month, -12, getdate())
    AND e_ser BETWEEN 11 AND 1000
    GROUP BY        c_id, e_date, e_ser, e_att, e_recip, c_bd, c_cm, e_staff
    ) t1
  ) t2
WHERE           OK = '-->'
ORDER BY        c_id, e_date DESC

Upvotes: 1

Views: 15416

Answers (2)

Hogan
Hogan

Reputation: 70523

Here is some improvements to your original query:

SELECT TOP(10) *
FROM (

    SELECT '-->' AS 'OK' -- always this see where.
     ,c_id, e_date, e_ser, e_att, e_recip, DATEDIFF(Year, c_bd, GetDate()) AS 'Age', c_cm, e_staff
     ,row_number() OVER (PARTITION BY c_id ORDER BY e_date DESC) rn             
    FROM events INNER JOIN client ON e_case_no = c_id
    LEFT OUTER JOIN doc ON doc.doc_dbid = client.c_id
    WHERE client.c_id IN ( /* confidential query */ )
          -- this part was in case and then filtered for later, if we put it in where now more efficient 
          (e_att IN (1,2)  AND e_date > DATEADD(month, -12, getdate())
            AND e_ser NOT IN (100,115)
            AND (( (e_recip = '3') AND DATEDIFF(Year, c_bd, GetDate()>10) ) OR e_recip NOT IN ('2', '3', '7') )
            AND c_cm = e_staff)


    AND e_date > DATEADD(month, -12, getdate())
    AND e_ser BETWEEN 11 AND 1000
    GROUP BY        c_id, e_date, e_ser, e_att, e_recip, c_bd, c_cm, e_staff
  ) t2
ORDER BY        c_id, e_date DESC

Besides removing some un-needed parens, if you move the stuff from the CASE statement to a where you don't need to filter on it in the outer query and this makes it simpler.

Add in the row_number statement from McGarnagle's answer and you should get the results you want.

Upvotes: 0

McGarnagle
McGarnagle

Reputation: 102743

It looks like the following produces the row number, sorted by date, per client:

,row_number() OVER (PARTITION BY c_id ORDER BY e_date DESC) rn             

So adding where rn=1 should yield the most recent event per client:

  ) t1
  WHERE rn = 1
) t2

Upvotes: 3

Related Questions