Database schema normalization checker?

Greg Mattes picture Greg Mattes · Feb 12, 2009 · Viewed 6.9k times · Source

I'm interested in learning about tools that operate like this:

Given a database schema and some data, suggest whether the schema is likely structured in any particular normal form, and then tell how the schema might be factored to produce further normalization.

Basically, a static analysis tool for database schema design.

Like other static analysis tools, such a database tool need not generate perfect results (I doubt such a "perfect" tool is computationally feasible), or applicable to all database systems, or free/open source, or anything else. The tool need not be stand-alone; it could be bundled as some sophisticated IDE feature. I'm just wondering what's out there.

Answer

Bill Karwin picture Bill Karwin · Feb 12, 2009

A tool like you describe, that tried to analyze your data and metadata and advise you of possible non-normalized structure, would give wrong suggestions so frequently that it would make Clippy, the Microsoft assistant seem like an indispensable writer's aid.

enter image description here

The process of normalization involves mapping software requirements to a logical data model. An analysis tool cannot know your data requirements any better than you do. So it can't infer from an incorrect database design which parts of it are wrong.

I understand you have qualified the question and you have limited expectations for the tool's capability. But you would want it to be useful for ordinary, everyday tasks -- but it would not be reliable even for the most simple cases.

Compare to static code analysis tools. Suppose you write an application and deliver it to your client, and the client says "why can't I send email from this app?" How would a static code analysis tool tell you that you omitted a desired feature? It can't know those requirements.

Likewise, how would a database normalization analysis tool know whether it's appropriate for the UserAccount table to have a single MobilePhoneNumber attribute, or if it would be more appropriate to separate phone numbers into another table so a single user could have multiple phones listed?