To find infinite recursive loop in CTE

Interstellar picture Interstellar · Jul 31, 2015 · Viewed 16.8k times · Source

I'm not a SQL expert, but if anybody can help me.

I use a recursive CTE to get the values as below.

Child1 --> Parent 1

Parent1 --> Parent 2

Parent2 --> NULL

If data population has gone wrong, then I'll have something like below, because of which CTE may go to infinite recursive loop and gives max recursive error. Since the data is huge, I cannot check this bad data manually. Please let me know if there is a way to find it out.

Child1 --> Parent 1

Parent1 --> Child1

or

Child1 --> Parent 1

Parent1 --> Parent2

Parent2 --> Child1

Answer

a_horse_with_no_name picture a_horse_with_no_name · Jul 31, 2015

With Postgres it's quite easy to prevent this by collecting all visited nodes in an array.

Setup:

create table hierarchy (id integer, parent_id integer);

insert into hierarchy
values
(1, null), -- root element
(2, 1), -- first child
(3, 1), -- second child
(4, 3), 
(5, 4), 
(3, 5); -- endless loop

Recursive query:

with recursive tree as (
  select id, 
         parent_id, 
         array[id] as all_parents
  from hierarchy
  where parent_id is null

  union all

  select c.id, 
         c.parent_id,
         p.all_parents||c.id
  from hierarchy c
     join tree p
      on c.parent_id = p.id 
     and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
)
select *
from tree;

To do this for multiple trees at the same time, you need to carry over the ID of the root node to the children:

with recursive tree as (
  select id, 
         parent_id, 
         array[id] as all_parents, 
         id as root_id
  from hierarchy
  where parent_id is null

  union all

  select c.id, 
         c.parent_id,
         p.all_parents||c.id, 
         p.root_id
  from hierarchy c
     join tree p
      on c.parent_id = p.id 
     and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
     and c.root_id = p.root_id
)
select *
from tree;