es_analytics
es_analytics

Reputation: 13

How do you segment varying 'policy periods' by group in SQL (to subsequently calculate running total per 'policy period')?

I understand how to calculate a running total in SQL (DB2) using an analytic function. However, I need to START the running total a) at different points (rows) per group & b) restart the running total every 'policy period'. For instance, if I want to calculate the running total cost per member per month per policy period, for the groups that renew on January 1st of each year, this is straight-forward, as I can partition by year & group_id (& order by month/year) in order to restart the running total per policy period (as the policy period is equivalent to a calendar year).

The challenge begins when I have groups whose policies renew in July of each year, for example, & therefore I need to start (& then re-start) the running total amount in July of each year for that subset of groups.

If I can create a column that segments the policy periods per group, then I can partition by group_id, policy period & order by month/year, I believe. For instance, for a certain group that renews in July of each year, the policy_period column indicates '1' from 201207-201306, '2' from 201307-201406, etc. (This essentially represents the number of policy periods I have as training data for my ML mode.)

Can anyone assist with creating the aforementioned column? I am trying to avoid hard-coding this with numerous case statements, but that may be the route I have to take. Any ideas?

As an aside, I was trying to think outside-the-box & thought that I could potentially transform July to be read as January, but this may also be too time-consuming, as groups can renew their policies in any of the 12-months of the year. I also have a binary column that indicates 1 if the row is a renewal month for a given group, else 0, but this does not assist with the partitioning/ordering in an analytic function to calculate the running total.

Example Table of Current Data:

GRP_ID  YR_MO    MONTHLY_COST  NEXT_RENEWAL_YR_MO
001     201201   10            202007       
001     201202   10            202007 
001     201203   10            202007
001     201204   10            202007
001     201205   10            202007       
001     201206   10            202007 
001     201207   10            202007    
001     201208   10            202007
001     201209   10            202007              
001     201210   10            202007             
001     201211   10            202007             
001     201212   10            202007           
001     201301   10            202007 
001     201302   10            202007             
001     201303   10            202007              
001     201304   10            202007        
001     201305   10            202007           
001     201306   10            202007          
001     201307   10            202007          
001     201308   10            202007           

Example Table of Desired Results:

GRP_ID  YR_MO    POLICY_PERIOD  MONTHLY_COST   RUN_TOT_AMT_PER_PLCY_PRD
001     201201   1              10             10
001     201202   1              10             20
001     201203   1              10             30
001     201204   1              10             40
001     201205   1              10             50
001     201206   1              10             60
001     201207   2              10             10
001     201208   2              10             20
001     201209   2              10             30
001     201210   2              10             40
001     201211   2              10             50
001     201212   2              10             60
001     201301   2              10             70
001     201302   2              10             80
001     201303   2              10             90
001     201304   2              10             100
001     201305   2              10             110
001     201306   2              10             120
001     201307   3              10             10
001     201308   3              10             20

The above is just a single example. Obviously, partitioning by group_id is simple, but how can I a) create the policy period column to then b) start the running total amount per policy period per group?

The policy period column would involve using denserank(), but starting & re-starting on the seventh month of each year for certain groups that renew in July of each year, vs. the third month of each year for other groups renewing in March each year, etc. is the challenge here.

I am thinking along the lines of:

CASE
    WHEN RIGHT(NEXT_RENEWAL_YR_MO,2) = RIGHT(YR_MO,2) THEN 1
    ELSE 0
 END AS RESTART_RUN_TOT_HERE

...but I continue to get stuck on how to get the running total to actually RESTART at each subsequent '1' per group when the rest of the rows indicate '0'.

Upvotes: 0

Views: 67

Answers (1)

Mark Barinstein
Mark Barinstein

Reputation: 12454

Try this. If you uncomment the commented out block, you may run the statement as is to get the result you provided.

/*
WITH TAB (GRP_ID, YR_MO, MONTHLY_COST, NEXT_RENEWAL_YR_MO) AS 
(
VALUES 
  ('001', '201202', 10, '202007')
, ('001', '201201', 10, '202007')
, ('001', '201203', 10, '202007')
, ('001', '201204', 10, '202007')
, ('001', '201205', 10, '202007')
, ('001', '201206', 10, '202007')
, ('001', '201207', 10, '202007')
, ('001', '201208', 10, '202007')
, ('001', '201209', 10, '202007')
, ('001', '201210', 10, '202007')
, ('001', '201211', 10, '202007')
, ('001', '201212', 10, '202007')
, ('001', '201301', 10, '202007')
, ('001', '201302', 10, '202007')
, ('001', '201303', 10, '202007')
, ('001', '201304', 10, '202007')
, ('001', '201305', 10, '202007')
, ('001', '201306', 10, '202007')
, ('001', '201307', 10, '202007')
, ('001', '201308', 10, '202007')
)
*/
SELECT 
  GRP_ID, YR_MO
, SUM(CASE WHEN RIGHT(YR_MO, 2) = RIGHT(NEXT_RENEWAL_YR_MO, 2) THEN 1 ELSE 0 END) OVER (PARTITION BY GRP_ID ORDER BY YR_MO) + 1 AS POLICY_PERIOD
, MONTHLY_COST
, SUM(MONTHLY_COST) OVER 
(
  PARTITION BY 
  GRP_ID
, SUM(CASE WHEN RIGHT(YR_MO, 2) = RIGHT(NEXT_RENEWAL_YR_MO, 2) THEN 1 ELSE 0 END) OVER (PARTITION BY GRP_ID ORDER BY YR_MO)
  ORDER BY YR_MO
) AS RUN_TOT_AMT_PER_PLCY_PRD
FROM TAB;

db<>fiddle example.

Upvotes: 1

Related Questions