chill3chee
chill3chee

Reputation: 69

SQL - Selecting a column value based on max value in another column and combination of values in another column - Teradata

My input Teradata table accnt_pln_info sample data is as below.

Account_number   Plan_code   Plan_Date    Base_Amount     Biz_Date
ACCT1            R           2017-JAN-01         100      2017-MAY-31
ACCT1            R           2017-JAN-11          30      2017-MAY-31
ACCT1            K           2017-JAN-22          80      2017-MAY-31
ACCT1            B           2017-JAN-13          50      2017-MAY-31
ACCT1            C           2017-JAN-18         180      2017-MAY-31
ACCT2            R           2017-JAN-12          70      2017-MAY-31
ACCT2            C           2017-JAN-02          90      2017-MAY-31
ACCT2            R           2017-JAN-08          10      2017-MAY-31
ACCT2            D           2017-JAN-02          40      2017-MAY-31
ACCT2            B           2017-FEB-24          14      2017-MAY-31
ACCT2            K           2017-FEB-12          79      2017-MAY-31

Desired output: (For filter condition Biz_Date=2017-MAY-31

Account_number   RK_Plan_Date    RK_Base_Amount   RC_Plan_Date   RC_Base_Amount
ACCT1            2017-JAN-22          80          2017-JAN-18         180
ACCT2            2017-FEB-12          79          2017-JAN-12          70    

Logic:

Filter condition applied Biz_Date=2017-MAY-31 as table has multiple distinct biz_dates.
Group by Account_Number;  Plan_Date in (R,K), 
find the max Plan_Date and then get that rows Base_Amount; 
Plan_Date in (R,C), find the max Plan_Date and 
then get that rows Base_Amount.

eg: For ACCT1, and plan_code in ('R','K'), the max plan_date values is 2017-JAN-22; Hence need to get Base_amount of that row which is 80

Assumptions:

There can be duplicates on Account_number and Plan_Code.
There will not be duplicates on Account_number, Plan_Code in (R,K) and Plan_Date.
There will not be duplicates on Account_number, Plan_Code in (R,C) and Plan_Date.
The input order in table is not necessarily the same. 

What I have tried and failed:

SELECT ACCOUNT_NUMBER, 
MAX(CASE WHEN PLAN_DATE IN ('R','K') THEN PLAN_DATE END) MAX_RK_PLAN_DATE,
MAX(CASE WHEN PLAN_DATE IN ('R','K') AND MAX_PLAN_DATE=PLAN_DATE THEN BASE_AMOUNT END) REQUIRED_RK_AMOUNT,
MAX(CASE WHEN PLAN_DATE IN ('R','C') THEN PLAN_DATE END) MAX_RC_PLAN_DATE,
MAX(CASE WHEN PLAN_DATE IN ('R','C') AND MAX_PLAN_DATE=PLAN_DATE THEN BASE_AMOUNT END) REQUIRED_RC_AMOUNT 
FROM ACCNT_PLN_INFO;

As expected, it has failed as I am nesting aggregate function into a normal case statement. I thought of using the chunks of data by splitting it as

SELECT ....
(SELECT ACCOUNT_NUMBER, 'RK', 
MAX(PLAN_DATE) MAX_RK_PLAN_DATE FROM ACCNT_PLN_INFO WHERE 
PLAN_DATE IN ('R','K') 
UNION 
SELECT ACCOUNT_NUMBER, 'RC', 
MAX(PLAN_DATE) MAX_RC_PLAN_DATE FROM ACCNT_PLN_INFO WHERE 
PLAN_DATE IN ('R','C') )

and wanted to join to outer select from the same table again. But due to the different possible combinations of(R.K) and (R,C) I am not able do that. I know how to achieve it when there are no combinations involved.

For convenience, I had specified only 2 combinations with 2 values as PLAN_DATE IN ('R','K'); PLAN_DATE IN ('R','C'). But actually there are 6 combinations and each combination will have 4 values.

I had tried whatever I could do to achieve this. But unfortunately, couldn't. How to select a column value when we need a multiple combination of values and max of a column value. Thank you for your precious time.

Upvotes: 0

Views: 2196

Answers (2)

dnoeth
dnoeth

Reputation: 60462

You can use an approach which is similar to the aggregation you tried applying a dirty ol' trick, piggybacking.

You combine both columns into a single string, apply MAX and then strip off the date part again, e.g. for ACCT1 combining both PLAN_DATE & BASE_AMOUNT into a single string will result in this:

'20170101        100'
'20170111         30'
'20170113         50'
'20170118        180'
'20170122         80' -- this will be returned by MAX

After applying max you extract both columns again using SUBSTRING:

   CAST(SUBSTR('2017-01-22         80', 1, 10) AS DATE)
   CAST(SUBSTR('2017-01-22         80', 11) AS INT)

Of course you must create a string which is still sorting the correct way, e.g. yyyymmdd for a date and fixed width including leading spaces for numeric.

Now it's some Cut&Paste&Modify:

SELECT ACCOUNT_NUMBER,
   To_Date(Substr(RK, 1,8), 'yyyymmdd') AS MAX_RK_PLAN_DATE,
   Cast(Substring(RK From 9) AS INT) AS REQUIRED_RK_AMOUNT,
   To_Date(Substr(RC, 1,8), 'yyyymmdd') AS MAX_RC_PLAN_DATE,
   Cast(Substring(RC From 9) AS INT) AS REQUIRED_RC_AMOUNT
FROM 
 ( 
   SELECT ACCOUNT_NUMBER, 
      Max(CASE WHEN PLAN_code IN ('R','K') THEN To_Char(PLAN_DATE, 'yyyymmdd') || BASE_AMOUNT END) AS RK,
      Max(CASE WHEN PLAN_code IN ('R','C') THEN To_Char(PLAN_DATE, 'yyyymmdd') || BASE_AMOUNT END) AS RC
   FROM ACCNT_PLN_INFO
   WHERE  biz_date = DATE '2017-05-31'
   GROUP BY 1
 ) AS dt

Upvotes: 1

Andrew
Andrew

Reputation: 8703

EDIT: Rewrote using qualify.

You need to get the max plan date for each plan_code pairing. You can do that in two separate derived tables, using qualify to get the data for the max plan date. Then you can join those two results together using account_number.

select
rk.account_number,
rk_plan_date,
rk.base_amount as rk_base_amount,
rc.rc_plan_date,
rc.base_amount as rc_base_amount
from
(
select
    ACCNT_PLN_INFO.account_number,
    ACCNT_PLN_INFO.plan_date as rk_plan_date,
    base_amount
from 
    ACCNT_PLN_INFO
where
    plan_code in ('R','K')
qualify row_number() over (partition by ACCNT_PLN_INFO.account_number order by plan_date desc) = 1
) rk
inner join 
(select
    ACCNT_PLN_INFO.account_number,
    ACCNT_PLN_INFO.plan_date as rc_plan_date,
    base_amount
from 
    ACCNT_PLN_INFO
where
    plan_code in ('R','C')
qualify row_number() over (partition by ACCNT_PLN_INFO.account_number order by plan_date desc) = 1
)RC
on RK.account_number = rc.account_number

Original (non teradata specific syntax):

select
rk.account_number,
rk_plan_date,
rk.base_amount as rk_base_amount,
rc.rc_plan_date,
rc.base_amount as rc_base_amount
from (
    select
    ACCNT_PLN_INFO.account_number,
    ACCNT_PLN_INFO.plan_date as rk_plan_date,
    base_amount
    from 
    ACCNT_PLN_INFO
    inner join (
    select
    account_number,
    max(plan_date) as plan_date
    from
    ACCNT_PLN_INFO
    where
    plan_code in ('R','K')
    group by 1) rk
        on ACCNT_PLN_INFO.account_number = rk.account_number
        and ACCNT_PLN_INFO.plan_date = rk.plan_date
        and ACCNT_PLN_INFO.plan_code in ('R','K')
) RK
inner join (    
select
ACCNT_PLN_INFO.account_number,
ACCNT_PLN_INFO.plan_date as rc_plan_date,
base_amount
from 
ACCNT_PLN_INFO
inner join (
select
account_number,
max(plan_date) as plan_date
from
ACCNT_PLN_INFO
where
plan_code in ('R','C')
group by 1) rc
    on ACCNT_PLN_INFO.account_number = rc.account_number
    and ACCNT_PLN_INFO.plan_date = rc.plan_date
    and ACCNT_PLN_INFO.plan_code in ('C','R')
) RC
on RK.account_number = rc.account_number

Upvotes: 0

Related Questions