What should every developer know about databases?

Aaronaught picture Aaronaught · Dec 30, 2009 · Viewed 25.5k times · Source

Whether we like it or not, many if not most of us developers either regularly work with databases or may have to work with one someday. And considering the amount of misuse and abuse in the wild, and the volume of database-related questions that come up every day, it's fair to say that there are certain concepts that developers should know - even if they don't design or work with databases today. So:



What are the important concepts that developers and other software professionals ought to know about databases?


Guidelines for Responses:


Keep your list short.
One concept per answer is best.

Be specific.
"Data modelling" may be an important skill, but what does that mean precisely?

Explain your rationale.
Why is your concept important? Don't just say "use indexes." Don't fall into "best practices." Convince your audience to go learn more.

Upvote answers you agree with.
Read other people's answers first. One high-ranked answer is a more effective statement than two low-ranked ones. If you have more to add, either add a comment or reference the original.

Don't downvote something just because it doesn't apply to you personally.
We all work in different domains. The objective here is to provide direction for database novices to gain a well-founded, well-rounded understanding of database design and database-driven development, not to compete for the title of most-important.

Answer

Walter Mitty picture Walter Mitty · Dec 30, 2009

The very first thing developers should know about databases is this: what are databases for? Not how do they work, nor how do you build one, nor even how do you write code to retrieve or update the data in a database. But what are they for?

Unfortunately, the answer to this one is a moving target. In the heydey of databases, the 1970s through the early 1990s, databases were for the sharing of data. If you were using a database, and you weren't sharing data you were either involved in an academic project or you were wasting resources, including yourself. Setting up a database and taming a DBMS were such monumental tasks that the payback, in terms of data exploited multiple times, had to be huge to match the investment.

Over the last 15 years, databases have come to be used for storing the persistent data associated with just one application. Building a database for MySQL, or Access, or SQL Server has become so routine that databases have become almost a routine part of an ordinary application. Sometimes, that initial limited mission gets pushed upward by mission creep, as the real value of the data becomes apparent. Unfortunately, databases that were designed with a single purpose in mind often fail dramatically when they begin to be pushed into a role that's enterprise wide and mission critical.

The second thing developers need to learn about databases is the whole data centric view of the world. The data centric world view is more different from the process centric world view than anything most developers have ever learned. Compared to this gap, the gap between structured programming and object oriented programming is relatively small.

The third thing developers need to learn, at least in an overview, is data modeling, including conceptual data modeling, logical data modeling, and physical data modeling.

Conceptual data modeling is really requirements analysis from a data centric point of view.

Logical data modeling is generally the application of a specific data model to the requirements discovered in conceptual data modeling. The relational model is used far more than any other specific model, and developers need to learn the relational model for sure. Designing a powerful and relevant relational model for a nontrivial requirement is not a trivial task. You can't build good SQL tables if you misunderstand the relational model.

Physical data modeling is generally DBMS specific, and doesn't need to be learned in much detail, unless the developer is also the database builder or the DBA. What developers do need to understand is the extent to which physical database design can be separated from logical database design, and the extent to which producing a high speed database can be accomplished just by tweaking the physical design.

The next thing developers need to learn is that while speed (performance) is important, other measures of design goodness are even more important, such as the ability to revise and extend the scope of the database down the road, or simplicity of programming.

Finally, anybody who messes with databases needs to understand that the value of data often outlasts the system that captured it.

Whew!