The database development standards in our organization state the varchar fields should not allow null values. They should have a default value of an empty string (""). I know this makes querying and concatenation easier, but today, one of my coworkers questioned me about why that standard only existed for varchar types an not other datatypes (int, datetime, etc). I would like to know if others consider this to be a valid, defensible standard, or if varchar should be treated the same as fields of other data types?
I believe this standard is valid for the following reason:
I believe that an empty string and null values, though technically different, are conceptually the same. An empty, zero length string is a string that does not exist. It has no value. However, a numeric value of 0 is not the same as NULL.
For example, if a field called OutstandingBalance has a value of 0, it means there are $0.00 remaining. However, if the same field is NULL, that means the value is unknown. On the other hand, a field called CustomerName with a value of "" is basically the same as a value of NULL because both represent the non-existence of the name.
I read somewhere that an analogy for an empty string vs. NULL is that of a blank CD vs. no CD. However, I believe this to be a false analogy because a blank CD still phyically exists and still has physical data space that does not have any meaningful data written to it. Basically, I believe a blank CD is the equivalent of a string of blank spaces (" "), not an empty string. Therefore, I believe a string of blank spaces to be an actual value separate from NULL, but an empty string to be the absense of value conceptually equivalent to NULL.
Please let me know if my beliefs regarding variable length strings are valid, or please enlighten me if they are not. I have read several blogs / arguments regarding this subject, but still do not see a true conceptual difference between NULLs and empty strings.
It pretty much boils down to this - in your application, for a specific string, is there a difference between having an empty string to having no string at all?
If there is no distinction, then the standard your are following is fine.
If you find that there is a difference, then the null
has a distinct meaning and should be allowed.
In my experience, null
is normally modelled to mean unknown
.
Here is a more concrete example - middle names of people:
null
may be more appropriateAgain, if your application treats people with no middle name and those where this is unknown identically, then using an empty string for both makes sense (even if it does mean losing some information).