Reputation: 127
I have a table with of reported transactions in Redshift/PostgreSQL where from time to time I upload a report with the invoices and their due and paid dates like this:
invoice_Id | report_date | due_date | paid_date |
---|---|---|---|
A1 | 08-26-2021 | 09-02-2021 | 08-25-2021 |
A2 | 08-26-2021 | 08-29-2021 | |
A3 | 08-26-2021 | 09-05-2021 | |
A4 | 08-26-2021 | 08-30-2021 | |
A1 | 08-28-2021 | 09-02-2021 | 08-25-2021 |
A2 | 08-28-2021 | 08-29-2021 | |
A3 | 08-28-2021 | 09-05-2021 | 08-27-2021 |
A4 | 08-28-2021 | 08-30-2021 | |
A1 | 08-30-2021 | 09-02-2021 | 08-25-2021 |
A2 | 08-30-2021 | 08-29-2021 | |
A3 | 08-30-2021 | 09-05-2021 | 08-27-2021 |
A4 | 08-30-2021 | 08-30-2021 |
I need to calculate the status of the invoices based on paid_date, due_date and (here is the tricky part) report_date or TODAY when the report is the last one of each invoice.
invoice_Id | report_date | due_date | paid_date | status |
---|---|---|---|---|
A1 | 08-26-2021 | 09-02-2021 | 08-25-2021 | PAID |
A2 | 08-26-2021 | 08-29-2021 | PENDING | |
A3 | 08-26-2021 | 09-05-2021 | PENDING | |
A4 | 08-26-2021 | 08-30-2021 | PENDING | |
A1 | 08-28-2021 | 09-02-2021 | 08-25-2021 | PAID |
A2 | 08-28-2021 | 08-29-2021 | PENDING | |
A3 | 08-28-2021 | 09-05-2021 | 08-27-2021 | PAID |
A4 | 08-28-2021 | 08-30-2021 | PENDING | |
A1 | 08-30-2021 | 09-02-2021 | 08-25-2021 | PAID |
A2 | 08-30-2021 | 08-29-2021 | PAST_DUE | |
A3 | 08-30-2021 | 09-05-2021 | 08-27-2021 | PAID |
A4 | 08-30-2021 | 08-30-2021 | PENDING |
Assuming TODAY is 08-31-2021:
invoice_Id | report_date | due_date | paid_date | status |
---|---|---|---|---|
A1 | 08-30-2021 | 09-02-2021 | 08-25-2021 | PAID |
A2 | 08-30-2021 | 08-29-2021 | PAST_DUE | |
A3 | 08-30-2021 | 09-05-2021 | 08-27-2021 | PAID |
A4 | 08-30-2021 | 08-30-2021 | PAST_DUE |
As you can see:
I am trying something like:
SELECT report_date, due_date, paid_date,
CASE
WHEN paid_date IS NOT NULL THEN 'PAID'
WHEN report_date > due_date AND paid_date is null THEN 'PAST_DUE'
WHEN report_date <= due_date THEN 'PENDING'
END as status
FROM reported_transaction
But I need to make sure that for the last row of each invoice reported transaction it uses today's date in the transaction. So instead of:
...
WHEN report_date > due_date AND paid_date is null THEN 'PAST_DUE'
WHEN report_date <= due_date THEN 'PENDING'
...
it uses:
...
WHEN get_date() > due_date THEN 'PAST_DUE'
WHEN get_date() <= due_date THEN 'PENDING'
...
Upvotes: 0
Views: 48
Reputation: 1401
Can you try this ?
First step is to create rank based on date.
Then get the maximum rank and then compare the maximum rank per id with each rank column that we created in first CTE
If the rank is same as of max rank then we do your current data condition else the normal conditons
WITH ranked_date as (
select
*,
row_number() over(partition by invoice_id order by report_date) as ranked_date
from [table name]
),
max_rank as (
select
invoice_id,
max(ranked_date) as max_ranked_date
from ranked_date
group by 1
)
select
ranked_date.invoice_id,
ranked_date.ranked_date,
CASE
WHEN ranked_date.ranked_date = max_rank.max_ranked_date THEN
CASE
WHEN get_date() > due_date THEN 'PAST_DUE'
WHEN get_date() <= due_date THEN 'PENDING'
END
WHEN ranked_date.ranked_date != max_rank.max_ranked_date THEN
CASE
WHEN report_date > due_date AND paid_date is null THEN 'PAST_DUE'
WHEN report_date <= due_date THEN 'PENDING'
END
ELSE 'NO CONDITION SPECIFIED'
END AS new_status
FROM ranked_date
INNER JOIN max_rank
ON ranked_date.invoice_id = max_rank.invoice_id
Upvotes: 1