The Difference Between Logical and Physical Modeling

From WikiContent

Revision as of 06:17, 12 January 2010 by Drsql (Talk | contribs)
Jump to: navigation, search

A data model is a representation of data in a structured format. Everything about the data should be covered, from documentation to structure. A common misnomer is that the model is only a graphic, though this is not the case. A model usually has an associated picture, which is there for the picture is worth a thousand words effect.

When creating a data model, there are usually two types of models created, a logical model and physical model. The logical model usually comes first and represents the implementation nonspecific data requirements, while the physical model covers the implementation specific details used to implement the database.

The term “physical”’s meaning in Codd’s Twelve Rules for a Relational Database Management System in Rule 8: Physical Data Independence. It basically states that the RDBMS can store the data in any way without changing the meaning or programmatic access. Physical has evolved to encompass the SQL objects that are used to implement the system as well.


The initial implementation nonspecific requirements are commonly documented here. The logical model is a highly structured document of what the database is attempting to do for the client. The goal is to get the requirements from the user/artifacts, and present a near perfection representation of the necessary data to solve the client’s needs.

Ideally the design will be void of details such as how SQL Server or Oracle would perform some task. Even things that cannot (or simply will not) be implemented in the data tier should be represented in the logical model.

Naming is a big deal, and when naming tables and columns, the names are fully spelled out with little, if any, abbreviations and no concessions to the max length of names in the RDBMS. Tables, relationships, columns, etc should be defined as to their purposes.

There are constructs that exist in the logical model in a manner that cannot be implemented directly. For example, a many-to-many relationship between two tables may be modeled, but cannot be implemented in a relational database without adding a table.


There are two sections of the physical model, the RDBMS specific model and the on-disk/performance tuning aspects. In the RDBMS section of the model, you adapt the logical model for implementation in the host RDBMS without dropping expected meaning. Exact types, constraints, triggers, and anything needed to implement the logical model. This model will contain everything that the programmer needs to build a database.

The second part of the model is about adapting the implementation to a combination of users and hardware. In production, the DBA can change the on-disk structures, and even moving the database to a new server, os, platform, should be a completely transparent action as long as the RDBMS stays the same.

By maintaining a logical, idealized model of what the system should represent, separate from the implementation details, the entire process of developing, testing, and maintaining databases throughout the entire lifecycle can be greatly improved.

Personal tools