Reputation: 85
I have a table run_schedule
that maintains the future dates associated with specific job runs like the one shown below
id job_type run_dt
1 A 2020-05-25
2 A 2020-05-25
3 A 2020-05-25
4 A 2020-05-25
5 A 2020-05-25
6 A 2020-05-24
7 A 2020-05-24
8 A 2020-05-23
9 B 2020-05-25
10 B 2020-05-25
11 B 2020-05-25
12 B 2020-05-25
13 B 2020-05-25
14 B 2020-05-25
15 B 2020-05-24
16 B 2020-05-24
In my case the number of jobs by type per day varies drastically and some days have a skew.
select job_type, run_dt, count(*)
from run_schedule
group by job_type, run_dt
order by job_type;
job_type run_dt count
A 2020-05-23 1
A 2020-05-25 5
A 2020-05-24 2
B 2020-05-25 6
B 2020-05-24 2
Is there any way in postgres that I can re-distribute the job run dates (run_dt
column) so that it's more uniformly distributed? Ideally I would like to see a distribution like the one below,
job_type run_dt count
A 2020-05-23 3
A 2020-05-25 3
A 2020-05-24 2
B 2020-05-25 3
B 2020-05-24 4
So effectively, I've re-scheduled 2 jobs of type A
which were originally scheduled to run on the 25th to the 23rd and 2 jobs of type B
to run on the 24th bringing them closer to the average of each job type across those days.
Upvotes: 1
Views: 40
Reputation: 222582
You could use window functions to get the minimum and maximum date of each job type, along with the total number of records and the rank of each record within the group. Using this information, you can equally spread jobs across dates:
select
id,
job_type,
date(min_run_dt + (max_run_dt - min_run_dt + 1) * (rn - 1) / cnt * interval '1 day') run_dt
from (
select
t.*,
min(run_dt) over(partition by job_type) min_run_dt ,
max(run_dt) over(partition by job_type) max_run_dt,
count(*) over(partition by job_type) cnt,
row_number() over(partition by job_type order by run_dt) rn
from mytable t
) t
order by job_type, run_dt
For your sample data, this produces:
id | job_type | run_dt -: | :------- | :--------- 8 | A | 2020-05-23 6 | A | 2020-05-23 7 | A | 2020-05-23 4 | A | 2020-05-24 1 | A | 2020-05-24 5 | A | 2020-05-24 2 | A | 2020-05-25 3 | A | 2020-05-25 16 | B | 2020-05-24 15 | B | 2020-05-24 11 | B | 2020-05-24 12 | B | 2020-05-24 13 | B | 2020-05-25 14 | B | 2020-05-25 9 | B | 2020-05-25 10 | B | 2020-05-25
Upvotes: 1