GhostMan
GhostMan

Reputation: 63

How duplicate a rows in SQL base on difference between date columns and divided aggregated column per duplicate row?

I have a table with some records about fuel consumption. The important columns in the table are: CONSUME_DATE_FROM and CONSUM_DATE_TO.

I want to calculate average fuel consumption per cars on a monthly basis but some rows are not in the same month. For example some have a three month difference between them and the total of gas per litre is aggregated in a single row.

Now I should find records that have difference more than a month between CONSUME_DATE_FROM and CONSUM_DATE_TO, and duplicate them in current or second table per count of month and divide the total gas per litre between related rows.

I've this table with the following data:

ID    VehicleId CONSUME_DATE_FROM  CONSUM_DATE_TO   GAS_PER_LITER
1      100      2018-10-25         2018-12-01           600         
2      101      2018-07-19         2018-07-24           100 
3      102      2018-12-31         2019-01-01           400
4      103      2018-03-29         2018-05-29           200
5      104      2018-02-05         2018-02-09           50

The expected output table should be as below

ID    VehicleId CONSUME_DATE_FROM  CONSUM_DATE_TO   GAS_PER_LITER
1      100      2018-10-25          2018-12-01      200         
1      100      2018-10-25          2018-12-01      200     
1      100      2018-10-25          2018-12-01      200     
2      101      2018-07-19          2018-07-24      100 
3      102      2018-12-31          2019-01-01      200
3      102      2018-12-31          2019-01-01      200
4      103      2018-03-29          2018-05-29      66.66
4      103      2018-03-29          2018-05-29      66.66
4      103      2018-03-29          2018-05-29      66.66
5      104      2018-02-05          2018-02-09      50

Or as below

ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO  GAS_PER_LITER DATE_RELOAD_GAS        
    1      100      2018-10-25       2018-12-01     200        2018-10-01       
    1      100      2018-10-25       2018-12-01     200        2018-11-01           
    1      100      2018-10-25       2018-12-01     200        2018-12-01           
    2      101      2018-07-19       2018-07-24     100        2018-07-01
    3      102      2018-12-31       2019-01-01     200        2018-12-01
    3      102      2018-12-31       2019-01-01     200        2019-01-01
    4      103      2018-03-29       2018-05-29     66.66      2018-03-01
    4      103      2018-03-29       2018-05-29     66.66      2018-04-01
    4      103      2018-03-29       2018-05-29     66.66      2018-05-01
    5      104      2018-02-05       2018-02-09     50         2018-02-01

Can someone please help me out with this query?

I'm using oracle database

Upvotes: 2

Views: 105

Answers (2)

Barbaros Özhan
Barbaros Özhan

Reputation: 65363

By using connect by level structure with considering to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') as month I was able to resolve as below :

select ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO, 
           trunc(GAS_PER_LITER/max(rn) over (partition by ID order by ID),2) as GAS_PER_LITER, 
          '01.'||substr(myMonth,5,2)||'.'||substr(myMonth,1,4) as DATE_RELOAD_GAS      
      from
      (
      with consumption( ID, VehicleId, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER ) as
      (
       select 1,100,date'2018-10-25',date'2018-12-01',600 from dual union all         
       select 2,101,date'2018-07-19',date'2018-07-24',100 from dual union all          
       select 3,102,date'2018-12-31',date'2019-01-01',400 from dual union all         
       select 4,103,date'2018-03-29',date'2018-05-29',200 from dual union all         
       select 5,104,date'2018-02-05',date'2018-02-09', 50 from dual        
      )
       select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth, 
              VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
              row_number() over (partition by ID order by ID) as rn
         from dual join consumption c 
           on c.ID >= 2
      group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
               c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
      connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
      union all
       select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
              VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
              row_number() over (partition by ID order by ID) as rn
         from dual join consumption c 
           on c.ID  = 1
      group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
               c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
      connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
      ) q
    group by ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER, rn  
    order by ID, myMonth;

I met an interesting issue that if I consider the join condition in the subquery as c.ID >= 1 query hangs on for huge period of time, so splitted into two parts by union all as c.ID >= 2 and c.ID = 1

Rextester Demo

Upvotes: 1

APC
APC

Reputation: 146279

Your business rule treats the difference between CONSUME_DATE_FROM and CONSUM_DATE_TO as absolute months. So you expect the difference between 2018-10-25 and 2018-12-01 to be three months whereas the difference in days actually equates to about 1.1 months. So we can't use simple date arithmetic to get your desired output, we need to do some additional massaging of the dates.

The query below implements your desired logic by deriving the first day of the month for CONSUME_DATE_FROM and the last day of the month for CONSUME_DATE_TO, then using ceil() to round the difference up to the nearest whole number of months.

This is calculated in a subquery which is used in the main query with the old connect by level trick to multiply a record by level number of times:

with cte as (
    select f.*
          , ceil(months_between(last_day(CONSUM_DATE_TO)
                                , trunc(CONSUME_DATE_FROM,'mm'))) as diff 
    from fuel_consumption f
)
select cte.id
       , cte.VehicleId
       , cte.CONSUME_DATE_FROM
       , cte.CONSUM_DATE_TO 
       , cte.GAS_PER_LITER/cte.diff as GAS_PER_LITER
       , add_months(trunc(cte.CONSUME_DATE_FROM, 'mm'), level-1) as DATE_RELOAD_GAS
from cte
connect by level <= cte.diff
and prior cte.id = cte.id
and prior sys_guid() is not null
; 

"what about if add a additional column "DATE_RELOAD_GAS" that display difference date for similar rows"

From your posted sample it seems like DATE_RELOAD_GAS is the first day of the month for each month bounded by CONSUME_DATE_FROM and CONSUM_DATE_TO. I have amended my solution to implement this rule.

Upvotes: 1

Related Questions