I have read the quote : data depends on the key [1NF], the whole key [2NF] and nothing but the key [3NF].
However, I am having trouble understanding 3.5NF or BCNF as it's called. Here is what I understand :
So why is it then, that some 3NF tables are not in BCNF? I mean, the 3NF quote explicitly says "nothing but the key" meaning that all attributes depend solely on the primary key. The primary key is, after all, a candidate key until it is chosen to be our primary key.
If anything is amiss regarding my understanding so far, please correct me and thanks for any help you can provide.
Your pizza can have exactly three topping types:
So we order two pizzas and choose the following toppings:
Pizza Topping Topping Type
-------- ---------- -------------
1 mozzarella cheese
1 pepperoni meat
1 olives vegetable
2 mozzarella meat
2 sausage cheese
2 peppers vegetable
Wait a second, mozzarella can't be both a cheese and a meat! And sausage isn't a cheese!
We need to prevent these sorts of mistakes, to make mozzarella always be cheese. We should use a separate table for this, so we write down that fact in only one place.
Pizza Topping
-------- ----------
1 mozzarella
1 pepperoni
1 olives
2 mozzarella
2 sausage
2 peppers
Topping Topping Type
---------- -------------
mozzarella cheese
pepperoni meat
olives vegetable
sausage meat
peppers vegetable
That was the explanation that an 8 year-old might understand. Here is the more technical version.
BCNF acts differently from 3NF only when there are multiple overlapping candidate keys.
The reason is that the functional dependency X -> Y
is of course true if Y
is a subset of X
. So in any table that has only one candidate key and is in 3NF, it is already in BCNF because there is no column (either key or non-key) that is functionally dependent on anything besides that key.
Because each pizza must have exactly one of each topping type, we know that (Pizza, Topping Type) is a candidate key. We also know intuitively that a given topping cannot belong to different types simultaneously. So (Pizza, Topping) must be unique and therefore is also a candidate key. So we have two overlapping candidate keys.
I showed an anomaly where we marked mozarella as the wrong topping type. We know this is wrong, but the rule that makes it wrong is a dependency Topping -> Topping Type
which is not a valid dependency for BCNF for this table. It's a dependency on something other than a whole candidate key.
So to solve this, we take Topping Type out of the Pizzas table and make it a non-key attribute in a Toppings table.