Don Ellis
Don Ellis

Reputation: 91

Recursive parent child problem in MariaDB

I have run into this a couple of times where a client is able to import data into a catalog with parent child relationships and I run into problems with said relationships. I need to find a way to prevent the following:

This throws the server into an infinite recursive loop and ultimately brings it to its knees. I can't seem to wrap my head around a SQL query that I could use to detect such recursive madness. The problem is prevalent enough that I need to find some solution. I've tried queries using CTE, nested selects/sub-selects and just can't seem to write one that will solve this issue. Any help would be greatly appreciated.

with recursive parents as (
    select
        s.id,
        s.parent_id,
        1 as depth
    from categories s
    where s.id = <passed in id>
    union all
    select
        t.id,
        t.parent_id,
        c.depth + 1 as depth
    from categories t
        inner join parents c
            on t.id = c.parent_id
    where t.id <> t.parent_id)
select distinct parent_id from parents where parent_id <> 0 order by depth desc

This is what I finally came up with to "detect" a cycle condition

with recursive find_cycle as (
    select
        categories_id,
        parent_id,
        0 depth 
    from
        categories
    where categories_id = <passed in id>
    union all
    select
        f.categories_id,
        c.parent_id,
        f.depth + 1 
    from
        categories c
            inner join find_cycle f
                ON f.parent_id = c.categories_id 
    where c.parent_id <> c.categories_id 
        and f.parent_id <> f.categories_id 
    )
select
    f.parent_id as categories_id,
    c.parent_id
from find_cycle f
    inner join categories c
        on f.parent_id = c.categories_id
where exists (
    select
        1
    from find_cycle f
        inner join categories c
            on f.parent_id = c.categories_id
    where f.parent_id = <passed in id>)
order by depth desc;

It will return rows with the offending path and no rows if no cycle detected. Thanks for all the tips folks.

Upvotes: 2

Views: 1715

Answers (3)

Don Ellis
Don Ellis

Reputation: 91

Here is the MariaDB function I came up with that will return 0 if there is not a cycle and 1 if there is a cycle for the id passed in to the function.

create function `detect_cycle`(id int, max_depth int) RETURNS tinyint(1)
begin
    declare cycle_exists int default 0;
    
    select (case when count(*) = 1 then 0 else 1 end) into cycle_exists
    from
    (
        with recursive find_cycle as (
            select
                categories_id,
                parent_id,
                0 depth 
            from
                categories
            where categories_id = id
            union all
            select
                f.categories_id,
                c.parent_id,
                f.depth + 1 
            from
                categories c
                    inner join find_cycle f
                        ON f.parent_id = c.categories_id 
            where 
                c.parent_id <> c.categories_id 
                    and f.parent_id <> f.categories_id
                    and f.depth < max_depth
            )
        select
            c.parent_id
        from find_cycle f
            inner join categories c
                on f.parent_id = c.categories_id
        order by depth desc
        limit 1
    ) __temp
    where parent_id = 0;

    return cycle_exists;
end;

This can then be called by executing

select categories_id, detect_cycle(categories_id, 5) as cycle_exists
from categories
where categories_id = <whatever id you want to check for a cycle condition>;

Here is a stored procedure that will accomplish the same thing but is generic enough to handle any table, id column, parent column combination.

CREATE PROCEDURE `detect_cycle`(table_name varchar(64), id_column varchar(32), parent_id_column varchar(32), max_depth int)
BEGIN
    declare id int default 0;
    declare sql_query text default '';
    declare where_clause text default '';
    declare done bool default false;
    declare id_cursor cursor for select root_id from __temp_ids;
    declare continue handler for not found set done = true;

    drop temporary table if exists __temp_ids;
    create temporary table __temp_ids(root_id int not null primary key);
    
    set sql_query = concat('
        insert into __temp_ids
        select
            `',id_column,'`
        from ',table_name);
    
    prepare statement from sql_query;
    execute statement;
    
    drop temporary table if exists __temp_cycle;
    create temporary table __temp_cycle (id int not null, parent_id int not null);
    
    open id_cursor;
    id_loop: loop
        fetch from id_cursor into id;
        if done then
            leave id_loop;
        end if;
        set where_clause = concat('where `',id_column,'` = ',id);
        set sql_query = concat('
            insert into __temp_cycle
            select
                t.`',id_column,'`,
                t.`',parent_id_column,'`
            from
            (
                    with recursive find_cycle as (
                            select
                                    `',id_column,'`,
                                    `',parent_id_column,'`,
                                    0 depth 
                            from
                                    `',table_name,'`
                            ',where_clause,'
                            union all
                            select
                                    f.`',id_column,'`,
                                    c.`',parent_id_column,'`,
                                    f.depth + 1 
                            from
                                    `',table_name,'` c
                                            inner join find_cycle f
                                                    ON f.`',parent_id_column,'` = c.`',id_column,'` 
                            where 
                                    c.`',parent_id_column,'` <> c.`',id_column,'` 
                                            and f.`',parent_id_column,'` <> f.`',id_column,'`
                                            and f.depth < ',max_depth,'
                            )
                    select
                            c.`',id_column,'`,
                            c.`',parent_id_column,'`
                    from find_cycle f
                            inner join `',table_name,'` c
                                    on f.`',parent_id_column,'` = c.`',id_column,'`
                    order by depth desc
                    limit 1
            ) t
            where t.`',parent_id_column,'` > 0');
        prepare statement from sql_query;
        execute statement;
    end loop;
    close id_cursor;
    
    deallocate prepare statement;
    
    select distinct
        *
    from __temp_cycle;
    
    drop temporary table if exists __temp_ids;
    drop temporary table if exists __temp_cycle;
END

usage:

call detect_cycle(table_name, id_column, parent_id_column, max_depth);

This will return a result set of all cycle conditions within the given table.

Upvotes: 2

The Impaler
The Impaler

Reputation: 48875

If the depth of the resulting tree is not extremely deep then you can detect cycles by storing the bread crumbs that the recursive CTE is walking. Knowing the bread crumbs you can detect cycles easily.

For example:

with recursive
n as (
  select id, parent_id, concat('/', id, '/') as path 
  from categories where id = 2
 union all
  select c.id, c.parent_id, concat(n.path, c.id, '/')
  from n
  join categories c on c.parent_id = n.id
  where n.path not like concat('%/', c.id, '/%') -- cycle pruning here!
)
select * from n;

Result:

 id  parent_id  path    
 --- ---------- ------- 
 2   1          /2/     
 3   2          /2/3/   
 1   3          /2/3/1/ 

See running example at DB Fiddle.

Upvotes: 0

JNevill
JNevill

Reputation: 50308

Looks like you have this figured out to stop a cycling event but are looking for ways to identify a cycle. In that case, consider using a path:

with recursive parents as (
    select
        s.id,
        s.parent_id,
        1 as depth,
        CONCAT(s.id,'>',s.parent_id) as path,
        NULL as cycle_detection
    from categories s
    where s.id = <passed in id>
    union all
    select
        t.id,
        t.parent_id,
        c.depth + 1 as depth,
        CONCAT(c.path, '>', t.parent_id),
        CASE WHEN c.path LIKE CONCAT('%',t.parent_id,'>%') THEN 'cycle' END
    from categories t
        inner join parents c
            on t.id = c.parent_id
    where t.id <> t.parent_id)
select distinct parent_id, cycle_detection from parents where parent_id <> 0 order by depth desc

I may be a bit off my syntax since it's been forever since I wrote mysql/mariadb syntax, but this is the basic idea. Capture the path that the recursion took and then see if your current item is already in the path.

Upvotes: 0

Related Questions