Why does CONNECT BY LEVEL on a table return extra rows?

Ben · Nov 24, 2012 · Viewed 75.9k times

Using CONNECT BY LEVEL seems to return too many rows when performed on a table. What is the logic behind what's happening?

Assuming the following table:

create table a ( id number );

insert into a values (1);
insert into a values (2);
insert into a values (3);

This query returns 12 rows (SQL Fiddle).

 select id, level as lvl
   from a
connect by level <= 2
  order by id, level

For each row in table A, it returns one row where LVL is 1 and three rows where LVL is 2, i.e.:

ID | LVL 
---+-----
 1 |  1 
 1 |  2 
 1 |  2 
 1 |  2 
 2 |  1 
 2 |  2 
 2 |  2 
 2 |  2 
 3 |  1 
 3 |  2 
 3 |  2 
 3 |  2 

It is equivalent to this query, which returns the same results.

 select id, level as lvl
   from dual
  cross join a
connect by level <= 2
  order by id, level

I don't understand why these queries return 12 rows or why there are three rows where LVL is 2 and only one where LVL is 1 for each value of the ID column.

Increasing the number of levels that are "connected" to 3 returns 13 rows for each value of ID: 1 where LVL is 1, 3 where LVL is 2 and 9 where LVL is 3. This seems to suggest that, for each value of ID, the number of rows returned at a given level is the number of rows in table A to the power of (LVL - 1).

I would have thought that these queries would be the same as the following, which returns 6 rows:

select id, lvl
  from ( select level as lvl
           from dual
        connect by level <= 2
       )
 cross join a
 order by id, lvl

The documentation isn't particularly clear, to me, in explaining what should occur. What's happening with these powers and why aren't the first two queries the same as the third?

Answer

GolezTrol · Nov 24, 2012

In the first query, you connect by just the level, with no condition relating a row to its parent. So if level <= 1, you get each of the records 1 time. If level <= 2, you get each record 1 time (for level 1) + N times for level 2 (where N is the number of records in the table), because every record is treated as a child of every level 1 record. It is like cross joining the table with itself: you keep picking all records from the table until the level limit is reached, without any other condition to limit the result. For level <= 3, this is done again for each of those results.

So for 3 records:

  • Lvl 1: 3 records (all having level 1)
  • Lvl 2: 3 records having level 1 + 3*3 records having level 2 = 12
  • Lvl 3: 3 + 3*3 + 3*3*3 = 39 (indeed, 13 records each).
  • Lvl 4: starting to see a pattern? :)
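
The counts above can be checked by grouping on LEVEL. This is only a small sketch against the table a from the question, and row_count is just an illustrative alias:

```sql
-- Count how many rows the hierarchical query produces at each level.
-- With no START WITH clause every row of A is a root, and with no PRIOR
-- condition every row of A is a child of every row of the previous level.
select level    as lvl,
       count(*) as row_count
  from a
connect by level <= 3
 group by level
 order by level;
```

With the three rows inserted in the question, this should give 3, 9 and 27 rows for levels 1, 2 and 3, which add up to the 39 mentioned above.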

It's not really a cross join, though. A cross join would only return the records that have level 2 in this query result, while with this connect by you get the records having level 1 as well as the records having level 2, resulting in 3 + 3*3 records instead of just 3*3.
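
One way to see where the repeated rows come from is to print the chain of parent rows that leads to each result row. The sketch below uses Oracle's SYS_CONNECT_BY_PATH function on the table a from the question. The id_path alias and the '/' separator are arbitrary choices made for illustration:

```sql
-- Show the path of IDs from the root row down to each returned row.
-- The last element of the path is the row's own ID; everything before it
-- is the parent the row was "connected" to.
select id,
       level as lvl,
       sys_connect_by_path(id, '/') as id_path
  from a
connect by level <= 2
 order by id, level, id_path;
```

For ID 1 this should list the paths /1 (level 1) and /1/1, /2/1 and /3/1 (level 2). In other words, the row with ID 1 appears once as a root and once as a child of each of the three rows, which is why there are three level 2 rows per ID.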