Reputation: 91
I have run into this a couple of times where a client is able to import data into a catalog with parent child relationships and I run into problems with said relationships. I need to find a way to prevent the following:
This throws the server into an infinite recursive loop and ultimately brings it to its knees. I can't seem to wrap my head around a SQL query that I could use to detect such recursive madness. The problem is prevalent enough that I need to find some solution. I've tried queries using CTE, nested selects/sub-selects and just can't seem to write one that will solve this issue. Any help would be greatly appreciated.
with recursive parents as (
select
s.id,
s.parent_id,
1 as depth
from categories s
where s.id = <passed in id>
union all
select
t.id,
t.parent_id,
c.depth + 1 as depth
from categories t
inner join parents c
on t.id = c.parent_id
where t.id <> t.parent_id)
select distinct parent_id from parents where parent_id <> 0 order by depth desc
This is what I finally came up with to "detect" a cycle condition
with recursive find_cycle as (
select
categories_id,
parent_id,
0 depth
from
categories
where categories_id = <passed in id>
union all
select
f.categories_id,
c.parent_id,
f.depth + 1
from
categories c
inner join find_cycle f
ON f.parent_id = c.categories_id
where c.parent_id <> c.categories_id
and f.parent_id <> f.categories_id
)
select
f.parent_id as categories_id,
c.parent_id
from find_cycle f
inner join categories c
on f.parent_id = c.categories_id
where exists (
select
1
from find_cycle f
inner join categories c
on f.parent_id = c.categories_id
where f.parent_id = <passed in id>)
order by depth desc;
It will return rows with the offending path and no rows if no cycle detected. Thanks for all the tips folks.
Upvotes: 2
Views: 1715
Reputation: 91
Here is the MariaDB function I came up with that will return 0 if there is not a cycle and 1 if there is a cycle for the id passed in to the function.
create function `detect_cycle`(id int, max_depth int) RETURNS tinyint(1)
begin
declare cycle_exists int default 0;
select (case when count(*) = 1 then 0 else 1 end) into cycle_exists
from
(
with recursive find_cycle as (
select
categories_id,
parent_id,
0 depth
from
categories
where categories_id = id
union all
select
f.categories_id,
c.parent_id,
f.depth + 1
from
categories c
inner join find_cycle f
ON f.parent_id = c.categories_id
where
c.parent_id <> c.categories_id
and f.parent_id <> f.categories_id
and f.depth < max_depth
)
select
c.parent_id
from find_cycle f
inner join categories c
on f.parent_id = c.categories_id
order by depth desc
limit 1
) __temp
where parent_id = 0;
return cycle_exists;
end;
This can then be called by executing
select categories_id, detect_cycle(categories_id, 5) as cycle_exists
from categories
where categories_id = <whatever id you want to check for a cycle condition>;
Here is a stored procedure that will accomplish the same thing but is generic enough to handle any table, id column, parent column combination.
CREATE PROCEDURE `detect_cycle`(table_name varchar(64), id_column varchar(32), parent_id_column varchar(32), max_depth int)
BEGIN
declare id int default 0;
declare sql_query text default '';
declare where_clause text default '';
declare done bool default false;
declare id_cursor cursor for select root_id from __temp_ids;
declare continue handler for not found set done = true;
drop temporary table if exists __temp_ids;
create temporary table __temp_ids(root_id int not null primary key);
set sql_query = concat('
insert into __temp_ids
select
`',id_column,'`
from ',table_name);
prepare statement from sql_query;
execute statement;
drop temporary table if exists __temp_cycle;
create temporary table __temp_cycle (id int not null, parent_id int not null);
open id_cursor;
id_loop: loop
fetch from id_cursor into id;
if done then
leave id_loop;
end if;
set where_clause = concat('where `',id_column,'` = ',id);
set sql_query = concat('
insert into __temp_cycle
select
t.`',id_column,'`,
t.`',parent_id_column,'`
from
(
with recursive find_cycle as (
select
`',id_column,'`,
`',parent_id_column,'`,
0 depth
from
`',table_name,'`
',where_clause,'
union all
select
f.`',id_column,'`,
c.`',parent_id_column,'`,
f.depth + 1
from
`',table_name,'` c
inner join find_cycle f
ON f.`',parent_id_column,'` = c.`',id_column,'`
where
c.`',parent_id_column,'` <> c.`',id_column,'`
and f.`',parent_id_column,'` <> f.`',id_column,'`
and f.depth < ',max_depth,'
)
select
c.`',id_column,'`,
c.`',parent_id_column,'`
from find_cycle f
inner join `',table_name,'` c
on f.`',parent_id_column,'` = c.`',id_column,'`
order by depth desc
limit 1
) t
where t.`',parent_id_column,'` > 0');
prepare statement from sql_query;
execute statement;
end loop;
close id_cursor;
deallocate prepare statement;
select distinct
*
from __temp_cycle;
drop temporary table if exists __temp_ids;
drop temporary table if exists __temp_cycle;
END
usage:
call detect_cycle(table_name, id_column, parent_id_column, max_depth);
This will return a result set of all cycle conditions within the given table.
Upvotes: 2
Reputation: 48875
If the depth of the resulting tree is not extremely deep then you can detect cycles by storing the bread crumbs that the recursive CTE is walking. Knowing the bread crumbs you can detect cycles easily.
For example:
with recursive
n as (
select id, parent_id, concat('/', id, '/') as path
from categories where id = 2
union all
select c.id, c.parent_id, concat(n.path, c.id, '/')
from n
join categories c on c.parent_id = n.id
where n.path not like concat('%/', c.id, '/%') -- cycle pruning here!
)
select * from n;
Result:
id parent_id path
--- ---------- -------
2 1 /2/
3 2 /2/3/
1 3 /2/3/1/
See running example at DB Fiddle.
Upvotes: 0
Reputation: 50308
Looks like you have this figured out to stop a cycling event but are looking for ways to identify a cycle. In that case, consider using a path
:
with recursive parents as (
select
s.id,
s.parent_id,
1 as depth,
CONCAT(s.id,'>',s.parent_id) as path,
NULL as cycle_detection
from categories s
where s.id = <passed in id>
union all
select
t.id,
t.parent_id,
c.depth + 1 as depth,
CONCAT(c.path, '>', t.parent_id),
CASE WHEN c.path LIKE CONCAT('%',t.parent_id,'>%') THEN 'cycle' END
from categories t
inner join parents c
on t.id = c.parent_id
where t.id <> t.parent_id)
select distinct parent_id, cycle_detection from parents where parent_id <> 0 order by depth desc
I may be a bit off my syntax since it's been forever since I wrote mysql/mariadb syntax, but this is the basic idea. Capture the path that the recursion took and then see if your current item is already in the path.
Upvotes: 0