bbal20
bbal20

Reputation: 133

Finding Max(Date) BEFORE specified date in Redshift SQL

I have a table (Table A) in SQL (AWS Redshift) where I've isolated my beginning population that contains account id's and dates. I'd like to take the output from that table and LEFT join back to the "accounts" table to ONLY return the start date that precedes or comes directly before the date stored in the table from my output.

Table A (Beg Pop)
-------
select account_id, 
min(start_date),  
min(end_date)
from accounts
group by 1;

I want to return ONLY the date that precedes the date in my current table where account_id match. I'm looking for something like...

Table B
-------
select a.account_id,  
a.start_date,  
a.end_date,
b.start_date_prev,
b.end_date_prev
from accounts as a
left join accounts as b on a.account_id = b.account_id
where max(b.start_date) less than a.start_date;

Ultimately, I want to return everything from table a and only the dates where max(start_date) is less than the start_date from table A. I know aggregation is not allowed in the WHERE clause and I guess I can do a subquery but I only want the Max date BEFORE the dates in my output. Any suggestions are greatly appreciated.

Upvotes: 0

Views: 2048

Answers (2)

Eralper
Eralper

Reputation: 6612

As I understand from the requirement is to display all rows from a base table with the preceeding data sorted based on a column and with some conditions

Please check following example which I took from article Select Next and Previous Rows with Current Row using SQL CTE Expression

WITH CTE as (
  SELECT
    ROW_NUMBER() OVER (PARTITION BY account_id ORDER BY start_date) as RN,
    *
  FROM accounts
)
SELECT
  PreviousRow.*,
  CurrentRow.*,
  NextRow.*
FROM CTE as CurrentRow
LEFT JOIN CTE as PreviousRow ON
  PreviousRow.RN = CurrentRow.RN - 1 and PreviousRow.account_id = CurrentRow.account_id
LEFT JOIN CTE as NextRow ON
  NextRow.RN = CurrentRow.RN + 1 and NextRow.account_id = CurrentRow.account_id
ORDER BY CurrentRow.account_id, CurrentRow.start_date;

I tested with following sample data and it seems to be working

create table accounts(account_id int, start_date date, end_date date);

insert into accounts values (1,'20201001','20201003');
insert into accounts values (1,'20201002','20201005');
insert into accounts values (1,'20201007','20201008');
insert into accounts values (1,'20201011','20201013');
insert into accounts values (2,'20201001','20201002');
insert into accounts values (2,'20201015','20201016');

Output is as follows

enter image description here

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269633

I want to return ONLY the date that precedes the date in my current table where account_id match

If you want the previous date for a given row, use lag():

select a.*,
       lag(start_date) over (partition by account_id order by start_date) as prev_start_date
from accounts a;

Upvotes: 1

Related Questions