RIanGillis
RIanGillis

Reputation: 619

In T-SQL How Can I Select Up To The 5 Most Recent Rows, Grouped By An Identifier, If They Contain A Specific Value?

Long title.
I am using T-SQL and attempting to find all accounts who's most recent transactions are ACHFAIL, and determine how many in a row they have had, up to 5.
I already wrote a huge, insanely convoluted query to group and count all ACHFAIL transactions that have had x ACHFAILs in a row. Now the requirements are the simpler "only count the most recent transactions"

Below is what I have so far, but I cannot wrap my head around the next step to take. I was trying to simplify my task by only counting up the 5, but if I could provide an accurate count of all the ACHFAIL attempts in a row, that would more ideal.

WITH grouped 
AS (
SELECT 
ROW_NUMBER() OVER (PARTITION BY TRANSACTIONS.deal_id 
                         ORDER BY TRANSACTIONS.deal_id, tran_date DESC) AS row_num
,TRANSACTIONS.tran_code
,TRANSACTIONS.tran_date
,TRANSACTIONS.deal_id
FROM TRANSACTIONS
)
SELECT TOP 1000 * FROM grouped

which returns rows such as:

row_num tran_code   tran_date   deal_id
1   ACHFAIL 2014-08-05 09:20:38.000 {01xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
2   ACHCLEAR    2014-08-04 16:27:17.473 {01xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx} 
1   ACHCLEAR    2014-09-09 15:14:48.337 {02xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
2   ACHCLEAR    2014-09-08 14:23:00.737 {02xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
1   ACHFAIL 2014-07-18 14:35:38.037 {03xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
2   ACHFAIL 2014-07-18 13:58:52.000 {03xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
3   ACHCLEAR    2014-07-17 14:48:58.617 {03xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
4   ACHFAIL 2014-07-16 15:04:28.023 {03xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}

01xxxxxx has 1 ACHFAIL
02xxxxxx has 0 ACHFAIL
03xxxxxx has 2 ACHFAIL

Upvotes: 0

Views: 78

Answers (2)

Code Different
Code Different

Reputation: 93161

You are half way there. With any sort of problem with "consecutive rows", you will need a recursive CTE (that's TEMP2 below):

;WITH
    TEMP1 AS
    (
        SELECT  tran_code,
                deal_id,
                ROW_NUMBER() OVER (PARTITION BY deal_id ORDER BY tran_date DESC) AS tran_rank
        FROM    TRANSACTIONS
    ),
    TEMP2 AS
    (
        SELECT      tran_code,
                    deal_id,
                    tran_rank
        FROM        TEMP1
        WHERE       tran_rank = 1         -- last transaction for a deal
                AND tran_code = 'ACHFAIL' -- failed transactions only
        UNION ALL
        SELECT      curr.tran_code,
                    curr.deal_id,
                    curr.tran_rank
        FROM        TEMP1   curr
        INNER JOIN  TEMP2   prev ON curr.deal_id   = prev.deal_id           -- transaction must be for the same deal
                                AND curr.tran_rank = prev.tran_rank + 1     -- must be consecutive
        WHERE       curr.tran_code = 'ACHFAIL'                              -- must have failed
                AND curr.tran_rank <= 5                                     -- up to 5 only
    )

SELECT      t.deal_id,
            ISNULL(MAX(tran_rank),0)   AS FailCount
FROM        TRANSACTIONS  t
LEFT  JOIN  TEMP2         t2 ON t.deal_id = t2.deal_id
GROUP BY    t.deal_id

SQL Fiddle

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269933

If I understand correctly, you want the number of fails in the five most recent transactions for each deal. That would be something like:

WITH grouped AS (
      SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY t.deal_id ORDER BY tran_date DESC
                               ) AS seqnum
      FROM TRANSACTIONS t
     )
SELECT deal_id, sum(case when tran_code = 'ACHFAIL' then 1 else 0 end) as NuMFails
FROM grouped
WHERE seqnum <= 5
GROUP BY deal_id;

The CTE enumerates the rows. The where clause takes the 5 most recent rows for each deal. The group by then aggregates by deal_id.

Note that you do not need to include the partition by column(s) in the order by when you use over.

Upvotes: 0

Related Questions