Ronald
Ronald

Reputation: 172

Oracle SQL: CONNECT BY LEVEL returning many rows

I am having a first go at the CONNECT BY command, and I understand its potential to create loops with it. I was trying to create a query that generates a row for each timestamp between a starttime and an endtime, with a variable interval. When I run this query for each 'route' individually, it works perfectly. But when I try to run it for both routes at the same time, the loop keeps going.

What am I doing wrong?

SELECT ROUTE_NAME, START_TIME + (LEVEL - 1) * TIME_PERIOD OUTPUT_MOMENT
  FROM (SELECT *
          FROM (SELECT 1 / 24 AS TIME_PERIOD,
                       SYSDATE - 8 / 24 AS START_TIME,
                       SYSDATE + 3 / 24 AS END_TIME,
                       'ROUTE A' ROUTE_NAME
                  FROM DUAL
                UNION ALL
                SELECT 1 / 48 AS TIME_PERIOD,
                       SYSDATE - 8 / 24 AS START_TIME,
                       SYSDATE + 3 / 24 AS END_TIME,
                       'ROUTE B' ROUTE_NAME
                  FROM DUAL)
         WHERE ROUTE_NAME IN ('ROUTE A')
         --WHERE ROUTE_NAME IN ('ROUTE B')
         --WHERE ROUTE_NAME IN ('ROUTE A', 'ROUTE B')
       )
CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / TIME_PERIOD

Result on Route A:

* ROUTE_NAME    OUTPUT_MOMENT
* ROUTE A   9/3/2018 5:41:01
* ROUTE A   9/3/2018 6:41:01
* ROUTE A   9/3/2018 7:41:01
* ROUTE A   9/3/2018 8:41:01
* ROUTE A   9/3/2018 9:41:01
* ROUTE A   9/3/2018 10:41:01
* ROUTE A   9/3/2018 11:41:01
* ROUTE A   9/3/2018 12:41:01
* ROUTE A   9/3/2018 13:41:01
* ROUTE A   9/3/2018 14:41:01
* ROUTE A   9/3/2018 15:41:01

Results on Route B:

* ROUTE_NAME    OUTPUT_MOMENT
* ROUTE B   9/3/2018 5:42:34
* ROUTE B   9/3/2018 6:12:34
* ROUTE B   9/3/2018 6:42:34
* ROUTE B   9/3/2018 7:12:34
* ROUTE B   9/3/2018 7:42:34
* ROUTE B   9/3/2018 8:12:34
* ROUTE B   9/3/2018 8:42:34
* ROUTE B   9/3/2018 9:12:34
* ROUTE B   9/3/2018 9:42:34
* ROUTE B   9/3/2018 10:12:34
* ROUTE B   9/3/2018 10:42:34
* ROUTE B   9/3/2018 11:12:34
* ROUTE B   9/3/2018 11:42:34
* ROUTE B   9/3/2018 12:12:34
* ROUTE B   9/3/2018 12:42:34
* ROUTE B   9/3/2018 13:12:34
* ROUTE B   9/3/2018 13:42:34
* ROUTE B   9/3/2018 14:12:34
* ROUTE B   9/3/2018 14:42:34
* ROUTE B   9/3/2018 15:12:34
* ROUTE B   9/3/2018 15:42:34
* ROUTE B   9/3/2018 16:12:34
* ROUTE B   9/3/2018 16:42:34

Results on both (More rows exist):

* ROUTE_NAME    OUTPUT_MOMENT
* ROUTE A   9/3/2018 5:43:21
* ROUTE A   9/3/2018 6:43:21
* ROUTE A   9/3/2018 7:43:21
* ROUTE A   9/3/2018 8:43:21
* ROUTE A   9/3/2018 9:43:21
* ROUTE A   9/3/2018 10:43:21
* ROUTE A   9/3/2018 11:43:21
* ROUTE A   9/3/2018 12:43:21
* ROUTE A   9/3/2018 13:43:21
* ROUTE A   9/3/2018 14:43:21
* ROUTE A   9/3/2018 15:43:21
* ROUTE B   9/3/2018 11:13:21
* ROUTE B   9/3/2018 11:43:21
* ROUTE B   9/3/2018 12:13:21
* ROUTE B   9/3/2018 12:43:21
* ROUTE B   9/3/2018 13:13:21
* ROUTE B   9/3/2018 13:43:21
* ROUTE B   9/3/2018 14:13:21
* ROUTE B   9/3/2018 14:43:21
* ROUTE B   9/3/2018 15:13:21
* ROUTE B   9/3/2018 15:43:21
* ROUTE B   9/3/2018 16:13:21
* ROUTE B   9/3/2018 16:43:21
* ROUTE B   9/3/2018 10:43:21
* ROUTE B   9/3/2018 11:13:21
* ROUTE B   9/3/2018 11:43:21
* ROUTE B   9/3/2018 12:13:21
* ROUTE B   9/3/2018 12:43:21
* ROUTE B   9/3/2018 13:13:21
* ROUTE B   9/3/2018 13:43:21
* ROUTE B   9/3/2018 14:13:21
* ROUTE B   9/3/2018 14:43:21
* ROUTE B   9/3/2018 15:13:21
* ROUTE B   9/3/2018 15:43:21
* ROUTE B   9/3/2018 16:13:21
* ROUTE B   9/3/2018 16:43:21
* ROUTE B   9/3/2018 10:13:21
* ROUTE A   9/3/2018 15:43:21
* ROUTE B   9/3/2018 11:13:21
* ROUTE B   9/3/2018 11:43:21
* ROUTE B   9/3/2018 12:13:21
* ROUTE B   9/3/2018 12:43:21
* ROUTE B   9/3/2018 13:13:21
* ROUTE B   9/3/2018 13:43:21
* ROUTE B   9/3/2018 14:13:21
* ROUTE B   9/3/2018 14:43:21
* ROUTE B   9/3/2018 15:13:21
* ROUTE B   9/3/2018 15:43:21
* ROUTE B   9/3/2018 16:13:21
* ROUTE B   9/3/2018 16:43:21
* ROUTE B   9/3/2018 10:43:21
* ROUTE B   9/3/2018 11:13:21
* ROUTE B   9/3/2018 11:43:21
* ROUTE B   9/3/2018 12:13:21
* ROUTE B   9/3/2018 12:43:21
* ROUTE B   9/3/2018 13:13:21
* ROUTE B   9/3/2018 13:43:21
* ROUTE B   9/3/2018 14:13:21
* ROUTE B   9/3/2018 14:43:21
* ROUTE B   9/3/2018 15:13:21
* ROUTE B   9/3/2018 15:43:21
* ROUTE B   9/3/2018 16:13:21
* ROUTE B   9/3/2018 16:43:21
* ROUTE B   9/3/2018 9:43:21
* ROUTE A   9/3/2018 14:43:21
* ROUTE A   9/3/2018 15:43:21
* ROUTE B   9/3/2018 11:13:21
* ROUTE B   9/3/2018 11:43:21
* ROUTE B   9/3/2018 12:13:21
* ROUTE B   9/3/2018 12:43:21
* ROUTE B   9/3/2018 13:13:21
* ROUTE B   9/3/2018 13:43:21
* ROUTE B   9/3/2018 14:13:21
* ROUTE B   9/3/2018 14:43:21
* ROUTE B   9/3/2018 15:13:21
* ROUTE B   9/3/2018 15:43:21
* ROUTE B   9/3/2018 16:13:21
* ROUTE B   9/3/2018 16:43:21
* ROUTE B   9/3/2018 10:43:21
* ROUTE B   9/3/2018 11:13:21
* ROUTE B   9/3/2018 11:43:21
* ROUTE B   9/3/2018 12:13:21
* ROUTE B   9/3/2018 12:43:21

Upvotes: 0

Views: 6374

Answers (2)

wolfrevokcats
wolfrevokcats

Reputation: 2100

What you get here is doubling the rows with every level of the hierarchical query.
Consider a simple example:

WITH dbl AS (
    SELECT * FROM dual  UNION ALL SELECT *FROM dual
)
SELECT *FROM dbl CONNECT BY LEVEL <= N

For N=2 the query returns 6 rows, for N=6 it returns 126 rows, for N=10 -- 2046 rows. So we see that the row number grows exponentially.

Your query works in a similar fashion. To fix this you can either move union all to the outer level, running separate hierarchical queries for each route:

SELECT ROUTE_NAME, START_TIME + (lvl - 1) * TIME_PERIOD
FROM (
    SELECT LEVEL AS lvl, ROUTE_NAME AS "ROUTE A"... 
    ...
    FROM DUAL CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / TIME_PERIOD
)
UNION ALL 
SELECT ROUTE_NAME, START_TIME + (lvl - 1) * TIME_PERIOD
FROM (
    SELECT LEVEL AS lvl, ROUTE_NAME AS "ROUTE B", ... 
    ...
    FROM DUAL CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / TIME_PERIOD
)

Or prevent one route from following the other with ROUTE_NAME = PRIOR ROUTE_NAME, as was suggested in another answer.

Upvotes: 1

Alex Poole
Alex Poole

Reputation: 191235

The connect by is only based on the time, so you're connecting every time for route A with route B and vice versa.

The simple fix seems to be to make it:

CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / TIME_PERIOD
AND ROUTE_NAME = PRIOR ROUTE_NAME

to restrict it to the single route at a time; but that then forms a loop, so you need to add in a non-deterministc function call too to prevent that; for example:

CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / TIME_PERIOD
AND ROUTE_NAME = PRIOR ROUTE_NAME
AND PRIOR DBMS_RANDOM.VALUE() IS NOT NULL

which gets:

ROUTE_NAME OUTPUT_MOMENT      
---------- -------------------
ROUTE A    2018-03-09 05:00:08
ROUTE A    2018-03-09 06:00:08
ROUTE A    2018-03-09 07:00:08
ROUTE A    2018-03-09 08:00:08
ROUTE A    2018-03-09 09:00:08
ROUTE A    2018-03-09 10:00:08
ROUTE A    2018-03-09 11:00:08
ROUTE A    2018-03-09 12:00:08
ROUTE A    2018-03-09 13:00:08
ROUTE A    2018-03-09 14:00:08
ROUTE A    2018-03-09 15:00:08
ROUTE B    2018-03-09 05:00:08
ROUTE B    2018-03-09 05:30:08
ROUTE B    2018-03-09 06:00:08
ROUTE B    2018-03-09 06:30:08
ROUTE B    2018-03-09 07:00:08
ROUTE B    2018-03-09 07:30:08
ROUTE B    2018-03-09 08:00:08
ROUTE B    2018-03-09 08:30:08
ROUTE B    2018-03-09 09:00:08
ROUTE B    2018-03-09 09:30:08
ROUTE B    2018-03-09 10:00:08
ROUTE B    2018-03-09 10:30:08
ROUTE B    2018-03-09 11:00:08
ROUTE B    2018-03-09 11:30:08
ROUTE B    2018-03-09 12:00:08
ROUTE B    2018-03-09 12:30:08
ROUTE B    2018-03-09 13:00:08
ROUTE B    2018-03-09 13:30:08
ROUTE B    2018-03-09 14:00:08
ROUTE B    2018-03-09 14:30:08
ROUTE B    2018-03-09 15:00:08
ROUTE B    2018-03-09 15:30:08
ROUTE B    2018-03-09 16:00:08

34 rows selected. 

You could also do two connect by queries and union the results together, possibly pulling the time range into a CTE to avoid duplicating that:

WITH START_END AS (
  SELECT SYSDATE - 8 / 24 AS START_TIME,
         SYSDATE + 3 / 24 AS END_TIME
  FROM DUAL
)
SELECT 'ROUTE A' ROUTE_NAME,
       START_TIME + (LEVEL - 1) / 24 AS OUTPUT_MOMENT
  FROM START_END
CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / (1 / 24)
UNION ALL
SELECT 'ROUTE B' ROUTE_NAME,
       START_TIME + (LEVEL - 1) / 48 AS OUTPUT_MOMENT
FROM START_END
CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / (1 / 48)

Using / ( 1 / 24) looks odd when you could instead do * 24, but you actually get a slightly different result because of rounding errors; with the latter you get an extra row for route A. You could rearrange the logic further to avoid that confusion though.

Upvotes: 2

Related Questions