The Difference Between Logical and Physical Modeling

From WikiContent

Revision as of 04:33, 13 January 2010 by Drsql (Talk | contribs)
Jump to: navigation, search

A data model is a abstract representation of data requirements, from documentation to structure. A common misconception is that the model is only a graphic, though in reality it a complete language of how to communicate the structure. A model usually has an associated picture, giving a powerful overview of the database, but which lacks the fidelity of the complete model.

When creating a data model, there are usually two forms of the model, a logical model and physical model. The logical model usually comes first and represents the implementation nonspecific data requirements, while the physical model covers the implementation specific details used to implement the database.

The term “physical”’s meaning in Codd’s Twelve Rules for a Relational Database Management System in Rule 8: Physical Data Independence. It basically states that the RDBMS can store the data in any way without changing the meaning or programmatic access. Physical has evolved to encompass the SQL objects that are used to implement the system as well.

Logical

The implementation nonspecific requirements are documented here. The logical model is a highly structured document of what the database is attempting to do for the client. The goal is to get the requirements from the user/artifacts, and present a near perfection representation of the necessary data to solve the client’s needs.

Ideally the design will be void of details such as how SQL Server or Oracle would perform some task. Even things that cannot (or simply will not) be implemented in the data tier should be represented in the logical model.

Naming is a big deal, and when naming tables and columns, the names are fully spelled out with little, if any, abbreviations and no concessions to the max length of names in the RDBMS. Tables, relationships, columns, etc should also be defined as to their purpose.

There are constructs that exist in the logical model in a manner that cannot be implemented directly. For example, a many-to-many relationship between two tables may be modeled, but cannot be implemented in a relational database without adding a table.

Physical

There are two sections of the physical model, the RDBMS specific model and the on-disk/performance tuning aspects. In the RDBMS section of the model, you adapt the logical model for implementation in the host RDBMS without dropping expected meaning. Exact types, constraints, triggers, and anything needed to implement the logical model. This model will contain everything that the programmer needs to build a database.

The second part of the model is about adapting the implementation to a combination of users and hardware. In production, the DBA can change the on-disk structures, and even moving the database to a new server, os, platform, etc should be a completely transparent action as long as the RDBMS stays the same.

By maintaining a logical, idealized model describing what the system should represent, seperate from the implementation details, the entire process of developing, testing, and maintaining databases throughout the entire lifecycle can be greatly improved.

Personal tools