I want to handle a date dimension in a MySQL datawarehouse. (I m a newbie in the DW world)
I made some searches with google and saw a lot of table structures (most of) date dimension where the Primary Key is a simple UNSIGNED INTEGER
.
Why don't use a DATE
field as primary key since with MySQL it is 3 Bytes VS 4 Bytes for INTEGER
?
Ex:
CREATE TABLE dimDate
id INTEGER UNSIGNED NOT NULL PRIMARY AUTOI_NCREMENT,
date DATE NOT NULL,
dayOfWeek
...
VS
CREATE TABLE dimDate
date DATE NOT NULL PRIMARY,
dayOfWeek
...
Date dimension is kind of special -- having date (2011-12-07) or date-related integer (20111207) for a primary key is actually preferred. This allows for nice partitioning (by date) of fact tables.
For other type of dimensions, surrogate (integer) keys are recommended.
As a template, each dimension usually has entries for unknown, not entered, error, ...
which are often matched with keys 0, -1, -2, ...
Due to this, it is more common to find integer-formatted date (20111207) as a primary key instead of date -- it is a bit messy to represent unknown, not entered, error, ...
with date-type key.