Difference between 3NF and BCNF in simple terms (must be able to explain to an 8-year old)

Arnab Datta picture Arnab Datta · Dec 8, 2011 · Viewed 192.2k times · Source

I have read the quote : data depends on the key [1NF], the whole key [2NF] and nothing but the key [3NF].

However, I am having trouble understanding 3.5NF or BCNF as it's called. Here is what I understand :

  • BCNF is stricter than 3NF
  • left side of any FD in the table must be a superkey (or at least a candidate key)

So why is it then, that some 3NF tables are not in BCNF? I mean, the 3NF quote explicitly says "nothing but the key" meaning that all attributes depend solely on the primary key. The primary key is, after all, a candidate key until it is chosen to be our primary key.

If anything is amiss regarding my understanding so far, please correct me and thanks for any help you can provide.

Answer

Bill Karwin picture Bill Karwin · Dec 8, 2011

Your pizza can have exactly three topping types:

  • one type of cheese
  • one type of meat
  • one type of vegetable

So we order two pizzas and choose the following toppings:

Pizza    Topping     Topping Type
-------- ----------  -------------
1        mozzarella  cheese
1        pepperoni   meat
1        olives      vegetable
2        mozzarella  meat
2        sausage     cheese
2        peppers     vegetable

Wait a second, mozzarella can't be both a cheese and a meat! And sausage isn't a cheese!

We need to prevent these sorts of mistakes, to make mozzarella always be cheese. We should use a separate table for this, so we write down that fact in only one place.

Pizza    Topping
-------- ----------
1        mozzarella
1        pepperoni
1        olives
2        mozzarella 
2        sausage
2        peppers

Topping     Topping Type
----------  -------------
mozzarella  cheese
pepperoni   meat
olives      vegetable
sausage     meat
peppers     vegetable

That was the explanation that an 8 year-old might understand. Here is the more technical version.

BCNF acts differently from 3NF only when there are multiple overlapping candidate keys.

The reason is that the functional dependency X -> Y is of course true if Y is a subset of X. So in any table that has only one candidate key and is in 3NF, it is already in BCNF because there is no column (either key or non-key) that is functionally dependent on anything besides that key.

Because each pizza must have exactly one of each topping type, we know that (Pizza, Topping Type) is a candidate key. We also know intuitively that a given topping cannot belong to different types simultaneously. So (Pizza, Topping) must be unique and therefore is also a candidate key. So we have two overlapping candidate keys.

I showed an anomaly where we marked mozarella as the wrong topping type. We know this is wrong, but the rule that makes it wrong is a dependency Topping -> Topping Type which is not a valid dependency for BCNF for this table. It's a dependency on something other than a whole candidate key.

So to solve this, we take Topping Type out of the Pizzas table and make it a non-key attribute in a Toppings table.