user15746603
user15746603

Reputation:

Struggle to understand SQL query

I'm struggling to understand what the below 2 chunks of code do. Some comments would be really appreciated.

Chunk 1:

    SELECT
        l
    FROM
        (
            SELECT
                level l
            FROM
                dual
            CONNECT BY
                level <= 1000
        ),
        (
            SELECT
                level m
            FROM
                dual
            CONNECT BY
                level <= 1000
        )

Chunk 2:

    HAVING
        COUNT(
            CASE l / m
                WHEN trunc(l / m) THEN
                    'Y'
            END
        ) = 2

PD: I will post below the entire SQL query to provide more background. The query just gets the prime numbers 1 to 1000 and listaggs them with ampersands.

SELECT
    LISTAGG(l, '&') WITHIN GROUP(
        ORDER BY
            l
    )
FROM
    (
        SELECT
            l
        FROM
            (
                SELECT
                    level l
                FROM
                    dual
                CONNECT BY
                    level <= 1000
            ),
            (
                SELECT
                    level m
                FROM
                    dual
                CONNECT BY
                    level <= 1000
            )
        WHERE
            m <= l
        GROUP BY
            l
        HAVING
            COUNT(
                CASE l / m
                    WHEN trunc(l / m) THEN
                        'Y'
                END
            ) = 2
        ORDER BY
            l
    );

Upvotes: 1

Views: 85

Answers (1)

MT0
MT0

Reputation: 167867

[TL;DR] It is a prime number generator.


SELECT level l FROM dual CONNECT BY level <= 1000

Generates 1000 rows containing incrementally increasing numbers from 1 to 1000 and aliases the column to l.

SELECT level m FROM dual CONNECT BY level <= 1000

Does exactly the same thing but aliases the column to m.

SELECT l
FROM   ( SELECT level l FROM dual CONNECT BY level <= 1000 ),
       ( SELECT level m FROM dual CONNECT BY level <= 1000 )

Applies a CROSS JOIN (using the legacy comma-join syntax) between the two tables and returns only the l column; so you will have 1000 copies of each of the numbers from 1 to 1000 (for a total of 1000000 rows)... or would do if there wasn't the filter condition:

WHERE m <= l

Which filters those rows so you don't get 1000 rows of each number but you will get 1 row with the l value of 1, 2 rows with the l value of 2 (with m values of 1 and 2), 3 rows with the value of 3 (with m values of 1, 2, and 3), ... etc. up to 1000 rows with the value of 1000 (with m values of 1, 2, 3, ..., 999, 1000).

GROUP BY l
HAVING COUNT( CASE l / m WHEN trunc(l / m) THEN 'Y' END ) = 2

Counts the number of factors of l (where the value m divides exactly into l) and filters to only have those rows where there are exactly two factors. Those factors must be the values when m equals 1 and l; which means that there are no other factors of l between the values of 1 and l and so l is a prime number.

ORDER BY l

Then orders the values in ascending order.


A more efficient generator would be:

SELECT 2 AS l FROM DUAL UNION ALL
SELECT 3 AS l FROM DUAL UNION ALL
SELECT l
FROM   ( SELECT 2 * level + 3 l FROM dual CONNECT BY 2 * level + 3 <= 1000 )
       CROSS APPLY
       ( SELECT 2 * level + 1 m FROM dual CONNECT BY POWER(2 * level + 1, 2) <= l )
GROUP BY l
HAVING COUNT( CASE l / m WHEN trunc(l / m) THEN 1 END ) = 0
ORDER BY l

Which correlates the row generators to only generate odd numbers and so that the second generator stops generating at the square root of the value of the first generator.

Upvotes: 4

Related Questions