Why should I avoid loops when designing relationships for a database?

pgpb.padilla picture pgpb.padilla · Nov 14, 2011 · Viewed 19k times · Source

Someone told me that it was bad design to have loops in the datamodel. I have heard this before a couple of times but didn't pay much attention. For example you have entities User, Project, Activity. A project is owned by a User, so we have a one-to-many relationship from user to Project. An activity can be assigned to a single User, another one-to-many relationship from User to Activity. Of course a project is defined by a set of activities, another one-to-many relationship from Project to Activity. Thus a loop is formed.

I asked this guy why is it bad design but he told me he didn't know either, he was told so too, monkey learning at it's best.

I tried searching but I guess I didn't use the proper words, however this seems to me something that should be fundamental for someone trying to design a DB.

So, can anyone point me to some useful info about loops/cycles in er/db diagrams, should they be avoided?

Answer

sfinnie picture sfinnie · Nov 14, 2011

There's a really good treatment of relationship loops in chapter 3 of this paper (archive.org).

Generally however, the most common issue with loops is consistency of redundant information.

Consider the case (from the paper) where a parent has many children; each child attends a school. There is a third relationship between parent & school ('parent has child at school'). However: you don't want to model the 3rd relationships explicitly; it's completely derivable from the other two. If you did capture it explicitly, you'd need to ensure the loop was always consistent.

So in that case you'd want to avoid the loop. However: loops are not universally bad. Taking the above example again, consider modelling the case where a parent is a governor at a school. That would also create a loop. In this case though it's valid: it's not possible to derive the 'parent is governor at school' relationship from the other two relationships.

So in summary: don't model loops when one relationship is completely derivable from the others combined. But it's OK to create loops when they're not derivable.

Would recommend the paper though; it gives a much better description than I can give here.