Reputation: 57
I am making a query for retrieving "future months" based on a project's forecast budget.
Basically I am taking the last forecasted date (START_DATE) and the future date which I wish to make the forecast to (END_DATE) so I need to fill everything in between with months.
Through some research I found "CONNECT BY" could help a lot.
Simplyfying it, the query looks like this:
SELECT TO_CHAR (ADD_MONTHS (START_DATE, LEVEL - 1), 'fmMonth')
FROM (SELECT PROJECT_ID, FORECAST_VALUE, START_DATE, END_DATE
FROM PROJECTS
WHERE PROJECT_ID = 001)
CONNECT BY LEVEL <=
MONTHS_BETWEEN (TRUNC (END_DATE, 'MM'),
TRUNC (START_DATE, 'MM')
)
* +1
The query works just fine when selecting one project, however when selecting several or all projects/rows, the query breaks down and returns many more rows than expected.
The source data look something like this:
PROJECT_ID | FORECAST_VALUE | START_DATE | END_DATE
-----------+----------------+------------+-----------
001 | 100 | 2017-01-01 | 2017-03-01
002 | 200 | 2017-01-01 | 2017-05-01
003 | 200 | 2017-10-01 | 2018-01-01
What I would expect to see would be something like this
PROJECT_ID | FORECAST_VALUE | FORECAST_YEAR | FORECAST_MONTH
-----------+----------------+---------------+-----------
001 | 100 | 2017 | JANUARY
001 | 100 | 2017 | FEBRUARY
001 | 100 | 2017 | MARCH
002 | 200 | 2017 | JANUARY
002 | 200 | 2017 | FEBRUARY
002 | 200 | 2017 | MARCH
002 | 200 | 2017 | APRIL
002 | 200 | 2017 | MAY
003 | 200 | 2017 | OCTOBER
003 | 200 | 2017 | NOVEMBER
003 | 200 | 2017 | DECEMBER
003 | 200 | 2018 | JANUARY
However I get a ton more months and years than expected.
How can I solve this?
Thanks!
Upvotes: 2
Views: 159
Reputation:
Since you have no condition other than the one you put in the CONNECT BY
, each row at each level generates many more rows at the next level (there is no tracking of each PROJECT_ID
at each level). You need to link the rows by PROJECT_ID = PRIOR PROJECT_ID
. But this will lead to "cycles"; CONNECT BY...
detects cycles by looking at the columns affected by the PRIOR
operator only, not at ALL the columns. You can break cycles by adding an irrelevant PRIOR
condition that will guarantee different values for different rows; traditionally SYS_GUID()
is used for that.
Modify your query as follows:
SELECT TO_CHAR (ADD_MONTHS (START_DATE, LEVEL - 1), 'fmMonth')
FROM (SELECT PROJECT_ID, FORECAST_VALUE, START_DATE, END_DATE
FROM PROJECTS
WHERE PROJECT_ID = 001)
CONNECT BY LEVEL <=
MONTHS_BETWEEN (TRUNC (END_DATE, 'MM'),
TRUNC (START_DATE, 'MM')
)
* +1 -- whatever that means (copied from original post)
AND PROJECT_ID = PRIOR PROJECT_ID
AND PRIOR SYS_GUID() IS NOT NULL
I am assuming, of course, that PROJECT_ID
is a unique key (perhaps Primary Key?) in the base table PROJECTS
.
Upvotes: 1
Reputation: 27261
Here is one way to do it. We simply take minimum start_date
and maximum end_date
and generate everything in between, and then join to our projects
table.
create table projects(project_id, forecast_value, start_date, end_date) as(
select 001, 100, date '2017-01-01', date '2017-03-01' from dual union all
select 002, 200, date '2017-01-01', date '2017-05-01' from dual union all
select 003, 200, date '2017-10-01', date '2018-01-01' from dual
);
with
dates(dt) as(
select add_months(s_date, level - 1) as dt
from (
select min(start_date) as s_date
, max(end_date) as e_date
from projects
)
connect by add_months(s_date , level - 1) <= e_date
)
select p.project_id
, p.forecast_value
, extract(year from d.dt) as forcast_year
, to_char(d.dt, 'MONTH') as forecast_month
from projects p
join dates d
on (trunc(d.dt, 'mm') between trunc(p.start_date, 'mm')
and trunc(p.end_date, 'mm'))
order by p.project_id, d.dt
Result:
PROJECT_ID FORECAST_VALUE FORCAST_YEAR FORECAST_MONTH
---------- -------------- ------------ --------------
1 100 2017 JANUARY
1 100 2017 FEBRUARY
1 100 2017 MARCH
2 200 2017 JANUARY
2 200 2017 FEBRUARY
2 200 2017 MARCH
2 200 2017 APRIL
2 200 2017 MAY
3 200 2017 OCTOBER
3 200 2017 NOVEMBER
3 200 2017 DECEMBER
3 200 2018 JANUARY
12 rows selected.
Upvotes: 0
Reputation: 22969
A simple way could be joining your table with a table of numbers, assuming that you will have no more than, say, 1000 months:
select PROJECT_ID, FORECAST_VALUE, START_DATE, END_DATE, TO_CHAR (ADD_MONTHS (START_DATE, num - 1), 'fmMonth')
from PROJECTS
inner join (
select level as num
from dual
connect by level <= 1000
) nums
on (num -1 <= months_between( TRUNC (END_DATE, 'MM'),
TRUNC (START_DATE, 'MM')) )
order by 1, num
Upvotes: 0