Lossless Join and Decomposition From Functional Dependencies

DjMix picture DjMix · May 11, 2014 · Viewed 17.1k times · Source

Suppose the relation R( K, L, M, N, P), and the functional dependencies that hold on R are:

 - L  -> P
 - MP -> K
 - KM -> P
 - LM -> N

Suppose we decompose it into 3 relations as follows:

 - R1(K, L, M)
 - R2(L, M, N)
 - R3(K, M, P)

How can we tell whether this decomposition is lossless? I used this example

R1 ∩ R2 = {L, M}, R2 ∩ R3 = {M}, R1 ∩ R3 = {K,M} we use functional dependencies, and this is not lossless in my opinion, but a little bit confused.

Answer

SáT picture SáT · May 13, 2014

It helps if we demystify the concept of lossless decomposition a bit: it really just means that joining R1, R2 and R3 should yield the original R.

Do you know the chase test a.k.a "the tableau method"? It's a cool algorithm to test for losslessness. It's easy to program, and it's actually used in the industry when reasoning about data consistency.

We start with the so-called "tableau of the decomposition", an n*m matrix where n is the number of relations and m is the number of attributes. For every field, if the relation n contains attribute m we write the attribute name; otherwise we write the attribute name subscripted with the number of the relation.

   |  K   L   M   N   P
 -----------------------
 1 |  K   L   M   n1  p1
 2 |  k2  L   M   N   p2
 3 |  K   l3  M   n3  P

Now the tableau will be "chased" (hence the name of the algorithm). We notice that the first and second rows agree on their L and M values. The dependency LM->N implies that their N values should agree too. Let's change the first row's n1 into the second row's N:

   |  K   L   M   N   P
 -----------------------
 1 |  K   L   M   N   p1
 2 |  k2  L   M   N   p2
 3 |  K   l3  M   n3  P

Now the first and third rows agree on their K and M values. We have a KM->P dependency, so they should agree on their P value also.

   |  K   L   M   N   P
 -----------------------
 1 |  K   L   M   N   P
 2 |  k2  L   M   N   p2
 3 |  K   l3  M   n3  P

And we're done! As soon as any of the rows has all of the proper attributes (as the first row does), the algorithm terminates and proves the decomposition was indeed lossless.

Note how the repeated applications of the dependencies represent joining the relations on the keys they share. So what we did was joining R1 and R2 on L and M (netting us (K, L M, N)), then joining the result with R3 on K and M (that yields R).