Alex Ramirez
Alex Ramirez

Reputation: 127

Conditionally use today's date or report date for the last row of each group in SQL

I have a table with of reported transactions in Redshift/PostgreSQL where from time to time I upload a report with the invoices and their due and paid dates like this:

invoice_Id report_date due_date paid_date
A1 08-26-2021 09-02-2021 08-25-2021
A2 08-26-2021 08-29-2021
A3 08-26-2021 09-05-2021
A4 08-26-2021 08-30-2021
A1 08-28-2021 09-02-2021 08-25-2021
A2 08-28-2021 08-29-2021
A3 08-28-2021 09-05-2021 08-27-2021
A4 08-28-2021 08-30-2021
A1 08-30-2021 09-02-2021 08-25-2021
A2 08-30-2021 08-29-2021
A3 08-30-2021 09-05-2021 08-27-2021
A4 08-30-2021 08-30-2021

I need to calculate the status of the invoices based on paid_date, due_date and (here is the tricky part) report_date or TODAY when the report is the last one of each invoice.

invoice_Id report_date due_date paid_date status
A1 08-26-2021 09-02-2021 08-25-2021 PAID
A2 08-26-2021 08-29-2021 PENDING
A3 08-26-2021 09-05-2021 PENDING
A4 08-26-2021 08-30-2021 PENDING
A1 08-28-2021 09-02-2021 08-25-2021 PAID
A2 08-28-2021 08-29-2021 PENDING
A3 08-28-2021 09-05-2021 08-27-2021 PAID
A4 08-28-2021 08-30-2021 PENDING
A1 08-30-2021 09-02-2021 08-25-2021 PAID
A2 08-30-2021 08-29-2021 PAST_DUE
A3 08-30-2021 09-05-2021 08-27-2021 PAID
A4 08-30-2021 08-30-2021 PENDING

Assuming TODAY is 08-31-2021:

invoice_Id report_date due_date paid_date status
A1 08-30-2021 09-02-2021 08-25-2021 PAID
A2 08-30-2021 08-29-2021 PAST_DUE
A3 08-30-2021 09-05-2021 08-27-2021 PAID
A4 08-30-2021 08-30-2021 PAST_DUE

As you can see:

I am trying something like:

SELECT report_date, due_date, paid_date,
  CASE
   WHEN paid_date IS NOT NULL THEN 'PAID'
   WHEN report_date > due_date AND paid_date is null THEN 'PAST_DUE'
   WHEN report_date <= due_date THEN 'PENDING'
  END as status
FROM reported_transaction

But I need to make sure that for the last row of each invoice reported transaction it uses today's date in the transaction. So instead of:

   ...
   WHEN report_date > due_date AND paid_date is null THEN 'PAST_DUE'
   WHEN report_date <= due_date THEN 'PENDING'
   ...

it uses:

   ...
   WHEN get_date() > due_date THEN 'PAST_DUE'
   WHEN get_date() <= due_date THEN 'PENDING'
   ...

Upvotes: 0

Views: 48

Answers (1)

trillion
trillion

Reputation: 1401

Can you try this ?

  • First step is to create rank based on date.

  • Then get the maximum rank and then compare the maximum rank per id with each rank column that we created in first CTE

  • If the rank is same as of max rank then we do your current data condition else the normal conditons


    WITH ranked_date as (
    
    select 
    *,
    row_number() over(partition by invoice_id order by report_date) as ranked_date
    
    from [table name]
    ),
    
    max_rank as (
    select
    invoice_id,
    max(ranked_date) as max_ranked_date
    from ranked_date
    group by 1
    )
    
    select 
    ranked_date.invoice_id,
    ranked_date.ranked_date,
    CASE 
         WHEN ranked_date.ranked_date = max_rank.max_ranked_date THEN
             CASE
                 WHEN get_date() > due_date THEN 'PAST_DUE'
                 WHEN get_date() <= due_date THEN 'PENDING'
                 END
         WHEN ranked_date.ranked_date != max_rank.max_ranked_date THEN
             CASE 
                 WHEN report_date > due_date AND paid_date is null THEN 'PAST_DUE'
                 WHEN report_date <= due_date THEN 'PENDING'
             END
    ELSE 'NO CONDITION SPECIFIED' 
    END AS new_status
    
    
    FROM ranked_date
    INNER JOIN max_rank 
    ON ranked_date.invoice_id = max_rank.invoice_id

Upvotes: 1

Related Questions