I received the following question and thought I’d answer it on the blog:
Question: I work in a data-modeling environment and am responsible for logically modeling (LM) data for mainframe and server databases. LM for the OLTP systems is done to third normal form (3NF). Recently, the DBA recommended LM for localized data in a much less rigid format, i.e., with less or potentially no normalization.
Are you aware of any discipline of LM for localized data? If so, can you recommend some reading on this topic?
Here is my answer:
Not only am I not aware of any resource that would advocate logical modeling of data in an unnormalized fashion, I would be against anyone reading it if it existed. The logical data model should be normalized. Normalization is the process of identifying the one best place each fact belongs.
Normalization is a design approach that minimizes data redundancy and optimizes data structures by systematically and properly placing data elements into the appropriate groupings. A normalized data model can be translated into a physical database that is organized correctly.
So the goal of normalization is to eliminate redundancy from data. An entity is in third normal form (3NF) if and only if all non-key columns are (a) mutually independent and (b) fully dependent upon the primary key. Mutual independence means that no non-key column is dependent upon any combination of the other non-key columns. I won’t go into a full explanation of normalization here, though. Suffice it to say that normalization was introduced by E.F. Codd in the early 1970s. Like the relational model of data, normalization is based on the mathematical principles of set theory. Although normalization evolved from relational theory, the process of normalizing data is applicable generally, to any type of data.
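To make the 3NF definition concrete, here is a minimal Python sketch. The employee/department data, the column names, and the helper functions fd_holds and project are all hypothetical, invented for illustration. Note also that a dependency that happens to hold in sample rows only hints at a design problem; real functional dependencies come from the meaning of the data, not from a sample.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values always imply equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # setdefault stores the first rhs value seen for this lhs value;
        # any later mismatch means lhs does not determine rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


def project(rows, cols):
    """Keep only the given columns and drop duplicate rows."""
    result = []
    for row in rows:
        item = {c: row[c] for c in cols}
        if item not in result:
            result.append(item)
    return result


# Hypothetical unnormalized entity; emp_id is the primary key.
employees = [
    {"emp_id": 1, "emp_name": "Ann", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "emp_name": "Bob", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "emp_name": "Cy", "dept_id": 20, "dept_name": "Audit"},
]

# (b) Every non-key column depends on the primary key.
key_determines_all = fd_holds(
    employees, ["emp_id"], ["emp_name", "dept_id", "dept_name"]
)

# (a) is violated: dept_name depends on dept_id, another non-key column.
non_key_dependency = fd_holds(employees, ["dept_id"], ["dept_name"])

# Moving the dependent fact into its own entity removes the violation.
employee_3nf = project(employees, ["emp_id", "emp_name", "dept_id"])
department_3nf = project(employees, ["dept_id", "dept_name"])
```

After the split, the department name is recorded once per department instead of once per employee, which is the "one best place" for that fact.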
It is important to remember that normalization is a logical process and does not necessarily dictate physical database design. A normalized data model will ensure that each entity is well formed and that each attribute is assigned to the proper entity. Of course, the best situation is when a normalized logical data model can be physically implemented without major (or, indeed, any) modifications. However, there are times when the physical database must differ from the logical data model due to physical implementation requirements, hardware/budget constraints, and deficiencies in DBMS products.
Take the proper steps to ensure performance in the physical database implementation for the type of applications you will need to create, the service level agreements you will need to support, and the DBMS that you will use. This may mean denormalizing by combining tables or carrying redundant data (and so on), but this should be undertaken for performance reasons only. The logical model should not have any of these “processing” artifacts in it.
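As a rough illustration of keeping denormalization out of the logical model, the sketch below uses Python's built-in sqlite3 module. The table and column names (department, employee, employee_report) are hypothetical, and SQLite stands in for whatever DBMS you actually use. The first two tables mirror a normalized logical model; the third is a physical-only table that carries redundant data, assuming a read-heavy workload justifies it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Tables that follow the normalized logical model.
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Sales'), (20, 'Audit');
    INSERT INTO employee VALUES (1, 'Ann', 10), (2, 'Bob', 10), (3, 'Cy', 20);

    -- Physical-only, denormalized table: dept_name is repeated per employee
    -- so that reads avoid the join. It has no counterpart in the logical model.
    CREATE TABLE employee_report AS
        SELECT e.emp_id, e.emp_name, e.dept_id, d.dept_name
        FROM employee e
        JOIN department d ON d.dept_id = e.dept_id;
""")

report = conn.execute(
    "SELECT emp_id, emp_name, dept_name FROM employee_report ORDER BY emp_id"
).fetchall()

# The redundancy is visible: 'Sales' is now stored once per Sales employee.
sales_copies = conn.execute(
    "SELECT COUNT(*) FROM employee_report WHERE dept_name = 'Sales'"
).fetchone()[0]
```

The cost of this choice is that a department rename now has to be applied to the redundant copies as well, which is why it belongs in the physical design, made for measured performance reasons, and not in the logical model.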
For more details on normalization, check out this Wikipedia entry.