Candidate and Superkey

chosenman picture chosenman · Mar 20, 2016 · Viewed 9.1k times · Source

Given a relation schema R with n attributes R(A1, A2, ..., An). What’s the maximum number of possible super-keys for R? Please justify your answer.

Given a relation schema R with n attributes R(A1, A2, ..., An). What’s the maximum number of possible candidate keys for R? Please justify your answer.

I am still wondering on how to answer both of these questions. What I have thought as answer for the first question would be (2^n) - 1 because empty set is not included.

As for the second question. My answer would be n atrributes.

What do you guys think?

Answer

Mahesha999 picture Mahesha999 · Aug 1, 2016

Maximum number of superkeys on relation with n attributes would be number of all possible combinations of attributes. This turns out to be (2^n)-1.

This is nothing but taking

1 attribute from n (nC1) + 2 attributes from n (nC2) + ... + nCn = (2^n)-1

Or we can simply think it as follows: we have each of n attributes represented as a bit. We can put 1 when an attribute has to be a part of superkey or 0 otherwise. So this will be (2^n), because we have two choices (1 or 0) for each of the n bits/attributes. We subtract 1 to avoid all 0's, that is considering 'no-attribute' as a superkey. So (2^n)-1.

This situation can occur when all attributes can functionally determine all other attributes. This occurs when there is a cycle of functional dependencies among attributes. For example if there is a relation R(A,B,C,D), then the FD cycle would be:

A->B
B->C
C->D
D->A

The superkeys would are A,B,C,D,(AB),(AC),(AD),(BC),(BD),(CD),(ABC),(ACD),(ABD),(BCD),(ABCD), total (2^4)-1=15

The maximum possible number of candidate keys will occur for size-r keys where nCr is biggest. Or in other words, when all size-r combinations of attributes are candidate keys, maximum number of candidate keys occur.

This can be seen from above example. Above A,B,C,D are all candidate keys, so none of their superkeys (say (AB), or (BCD) or (ABCD)) are candidate keys. Similarly if, in any relation (AB) is a candidate key, then none of its superkey (say ABC or ABD) can be a candidate key.

In general, nCfloor(n/2) is the maximum number of possible candidate key for relation on n attributes.

PS: this considers the definition that candidate key is a minimal superkey (one from which no attribute can be removed while still leaving it capable to uniquely identify / functionally determine all other attributes)