Preben Huybrechts
Preben Huybrechts

Reputation: 6111

Fill gaps with set based on priority

Goal

For each Foo we should try the fill the time frames as much as possible with records from FooBar. It's ok to have empty time frames when no records exists in FooBar. The records in FooBar behave as a set. Meaning if FooId and the time frame are (exactly) the same the BarId's are all valid in this time frame. Gaps should be filled based on the sets priority.

Table structure

CREATE TABLE Foo(
    FooId INT NOT NULL,
    ValidFrom DATETIME NOT NULL,
    ValidUntil DATETIME NOT NULL,
)

CREATE TABLE FooBar (
    FooId INT NOT NULL,
    BarId INT NOT NULL,
    ValidFrom DATETIME NOT NULL,
    ValidUntil DATETIME NOT NULL,
    Priority TINYINT NOT NULL,
)

Sample data

INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES 
  (1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')

INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority)
VALUES 
-- First set for FooId = 1 with prio 1
  (1, 1, '2021-01-01', '2021-03-01', 1)
, (1, 2, '2021-01-01', '2021-03-01', 1)
, (1, 3, '2021-01-01', '2021-03-01', 1)
-- Second set for FooId = 1 with prio 2 
, (1, 1, '2021-02-01', '2021-06-01', 2)
, (1, 2, '2021-02-01', '2021-06-01', 2)
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3)
, (1, 2, '2021-01-01', '2021-12-31', 3)
, (1, 3, '2021-01-01', '2021-12-31', 3)
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1)
, (1, 5, '2021-04-01', '2021-04-02', 1)
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3)

Expected result Origin column to clarify, shouldn't be part of the generated result set

FooId BarId ValidFrom ValidUntil Origin
1 1 2021-01-01 2021-03-01 First set
1 2 2021-01-01 2021-03-01 First set
1 3 2021-01-01 2021-03-01 First set
1 1 2021-03-02 2021-03-31 Second set
1 2 2021-03-02 2021-03-31 Second set
1 4 2021-04-01 2021-04-02 Fourth set
1 5 2021-04-01 2021-04-02 Fourth set
1 1 2021-04-03 2021-06-01 Second set
1 2 2021-04-03 2021-06-01 Second set
1 1 2021-06-02 2021-12-31 Third set
1 2 2021-06-02 2021-12-31 Third set
1 3 2021-06-02 2021-12-31 Third set
2 6 2021-01-01 2021-12-31 First set (FooId = 2)

I know this is possible with a cursor or while loop, but I'm looking for a more performant/elegant solution.

Compatibility level is: 130

Upvotes: 2

Views: 82

Answers (1)

Sander
Sander

Reputation: 4042

It's ok to have empty time frames when no records exists in FooBar.

Does this mean that a solution without empty frames is also acceptable?
If so, then the third set for FooId = 1 also defines a BarId = 3 for period 2021-03-02 -> 2021-03-31 for example.

Sample data

Tweaked the data model a bit to no have those timestamps (00:00:00.000) in the result. Also added a set identifier (FooBar.SetId) for easy origin tracing.

CREATE TABLE Foo(
    FooId INT NOT NULL,
    ValidFrom DATE/*TIME*/ NOT NULL,
    ValidUntil DATE/*TIME*/ NOT NULL
)

CREATE TABLE FooBar (
    FooId INT NOT NULL,
    BarId INT NOT NULL,
    ValidFrom DATE/*TIME*/ NOT NULL,
    ValidUntil DATE/*TIME*/ NOT NULL,
    Priority TINYINT NOT NULL,
    SetId nvarchar(5)
)

INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES 
  (1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')

INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority, SetId)
VALUES 
-- First set for FooId = 1 with prio 1
  (1, 1, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 2, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 3, '2021-01-01', '2021-03-01', 1, 'Set 1')
-- Second set for FooId = 1 with prio 2 
, (1, 1, '2021-02-01', '2021-06-01', 2, 'Set 2')
, (1, 2, '2021-02-01', '2021-06-01', 2, 'Set 2')
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 2, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 3, '2021-01-01', '2021-12-31', 3, 'Set 3')
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1, 'Set 4')
, (1, 5, '2021-04-01', '2021-04-02', 1, 'Set 4')
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3, 'Set 1')

Solution

  1. The common table expressions (CTE) ValidFrom and ValidPeriod cut all period information from Foo and FooBar in the smallest individual periods.
  2. The previous step also produces an extra trailing, incomplete period for each FooId that is remove with the exists clause.
  3. Then for each individual period fetch the FooBar record with the first priority value (by saying that no similar record with an even smaller priority is allowed: not exists ... fb2.Priority < fb.Priority).

This gives:

with ValidFrom as
(
  select f.FooId,
         f.ValidFrom
  from Foo f
    union
  select f.FooId,
         dateadd(day, 1, f.ValidUntil)
  from Foo f
    union
  select fb.FooId,
         fb.ValidFrom
  from FooBar fb
    union
  select fb.FooId,
         dateadd(day, 1, fb.ValidUntil)
  from Foobar fb
),
ValidPeriod as
(
  select vf.FooId,
         vf.ValidFrom,
         dateadd(day, -1, lead(vf.ValidFrom) over(partition by vf.FooId order by vf.ValidFrom)) as ValidUntil
  from ValidFrom vf
)
select vp.FooId,
       fb.BarId,
       vp.ValidFrom,
       vp.ValidUntil,
     --fb.ValidFrom,
     --fb.ValidUntil,
     --fb.Priority,
       fb.SetId
from ValidPeriod vp
left join FooBar fb
  on  fb.FooId = vp.FooId
  and fb.ValidFrom <= vp.ValidUntil
  and fb.ValidUntil >= vp.ValidFrom
  and not exists ( select 'x'
                   from FooBar fb2
                   where fb2.FooId = fb.FooId
                     and fb2.BarId = fb.BarId
                     and fb2.ValidFrom <= vp.ValidUntil
                     and fb2.ValidUntil >= vp.ValidFrom
                     and fb2.Priority < fb.Priority )
where exists ( select 'x'
               from ValidPeriod vp2
               where vp2.FooId = vp.FooId
                 and vp2.ValidFrom > vp.ValidFrom )
order by vp.FooId,
         vp.ValidFrom,
         fb.BarId;

Result

This result contains more period information than those you requested in your expected result. Removing the union's with Foo from the first CTE will remove the null values and limit the period to the period information available in FooBar alone (in fact this would eliminate Foo from the solution entirely).

With vp.ValidFrom and vp.ValidUntil as result periods:

FooId  BarId  ValidFrom   ValidUntil  SetId
-----  -----  ----------  ----------  -----
1      null   2020-01-01  2020-12-31  null  -- extra row
1      1      2021-01-01  2021-01-31  Set 1
1      2      2021-01-01  2021-01-31  Set 1
1      3      2021-01-01  2021-01-31  Set 1
1      1      2021-02-01  2021-03-01  Set 1 -- extra row
1      2      2021-02-01  2021-03-01  Set 1 -- extra row
1      3      2021-02-01  2021-03-01  Set 1 -- extra row
1      1      2021-03-02  2021-03-31  Set 2
1      2      2021-03-02  2021-03-31  Set 2
1      3      2021-03-02  2021-03-31  Set 3 -- extra row
1      1      2021-04-01  2021-04-02  Set 2 -- extra row
1      2      2021-04-01  2021-04-02  Set 2 -- extra row
1      3      2021-04-01  2021-04-02  Set 3 -- extra row
1      4      2021-04-01  2021-04-02  Set 4
1      5      2021-04-01  2021-04-02  Set 4
1      1      2021-04-03  2021-06-01  Set 2
1      2      2021-04-03  2021-06-01  Set 2
1      3      2021-04-03  2021-06-01  Set 3 -- extra row
1      1      2021-06-02  2021-12-31  Set 3
1      2      2021-06-02  2021-12-31  Set 3
1      3      2021-06-02  2021-12-31  Set 3
2      null   2020-01-01  2020-12-31  null  -- extra row
2      6      2021-01-01  2021-04-02  Set 1
2      null   2021-04-03  2021-06-30  null  -- extra row

With fb.ValidFrom and fb.ValidUntil as result periods:

FooId  BarId  ValidFrom   ValidUntil  SetId
-----  -----  ----------  ----------  -----
1      null   null        null        null  -- extra row
1      1      2021-01-01  2021-03-01  Set 1
1      2      2021-01-01  2021-03-01  Set 1
1      3      2021-01-01  2021-03-01  Set 1
1      1      2021-01-01  2021-03-01  Set 1 -- extra row
1      2      2021-01-01  2021-03-01  Set 1 -- extra row
1      3      2021-01-01  2021-03-01  Set 1 -- extra row
1      1      2021-02-01  2021-06-01  Set 2
1      2      2021-02-01  2021-06-01  Set 2
1      3      2021-01-01  2021-12-31  Set 3 -- extra row
1      1      2021-02-01  2021-06-01  Set 2 -- extra row
1      2      2021-02-01  2021-06-01  Set 2 -- extra row
1      3      2021-01-01  2021-12-31  Set 3 -- extra row
1      4      2021-04-01  2021-04-02  Set 4
1      5      2021-04-01  2021-04-02  Set 4
1      1      2021-02-01  2021-06-01  Set 2
1      2      2021-02-01  2021-06-01  Set 2
1      3      2021-01-01  2021-12-31  Set 3 -- extra row
1      1      2021-01-01  2021-12-31  Set 3
1      2      2021-01-01  2021-12-31  Set 3
1      3      2021-01-01  2021-12-31  Set 3
2      null   null        null        null  -- extra row
2      6      2021-01-01  2021-04-02  Set 1
2      null   null        null        null  -- extra row

Fiddle to see things in action.

Upvotes: 1

Related Questions