The Difference Between Logical and Physical Modeling

From WikiContent

(Difference between revisions)
Jump to: navigation, search
(There are many differences between the logical and physical model, and even another model to consider between the two.)
Line 1: Line 1:
-
The term “physical” has two meanings that cause a lot of confusion when DBA’s are talking to each other. The first encompasses SQL objects: tables, columns, constraints, etc. The second uses the term as referring to data is stored in the hardware.
+
A data model is a representation of data in a structured format. Everything about the data should be covered, from documentation to structure. A common misnomer is that the model is only a graphic, though this is not the case. A model usually has an associated picture, which is there for the picture is worth a thousand words effect.
-
The latter usage fits better with the original meaning that was coined in Codd’s Twelve Rules for a Relational Database Management System in Rule 8: Physical Data Independence. It basically states that the RDBMS should be allowed to store the data in any way or anywhere it pleases without changing the meaning or access.
+
-
While my RDBMS experience is specifically with Microsoft SQL Server, from what I have gathered, nearly all of the popular RDBMS tools allow data to be arranged physically in different manners, like sorted on disk, partitioned across different disk drives, non-unique indexes applied, etc.
+
When creating a data model, there are usually two types of models created, a '''logical''' model and '''physical''' model. The logical model usually comes first and represents the implementation nonspecific data requirements, while the physical model covers the implementation specific details used to implement the database.
-
Realistically, no matter what term is used, there are three distinct models that you ought to consider:
+
-
• '''Logical''': Implementation nonspecific requirements for data storage
+
The term “physical”’s meaning in Codd’s Twelve Rules for a Relational Database Management System in Rule 8: Physical Data Independence. It basically states that the RDBMS can store the data in any way without changing the meaning or programmatic access. Physical has evolved to encompass the SQL objects that are used to implement the system as well.
-
'''Implementation''': RDBMS specific details needed to create a database that can actually store data
+
'''Logical'''
-
• '''Physical''': Hardware model has no client meaning, only where/how data is stored
+
The initial implementation nonspecific requirements are commonly documented here. The logical model is a highly structured document of what the database is attempting to do for the client. The goal is to get the requirements from the user/artifacts, and present a near perfection representation of the necessary data to solve the client’s needs.
-
'''Logical'''
+
Ideally the design will be void of details such as how SQL Server or Oracle would perform some task. Even things that cannot (or simply will not) be implemented in the data tier should be represented in the logical model.
-
The initial phase of implementation nonspecific requirements are commonly called the logical model. The goal is to get the requirements from the user/documentation, and end up with a relational database design that will represent near perfection. More or less the logical model is a highly structured document of what the database is attempting to do for the client. Ideally the design will be normalized, be void of implementation specific knowledge such as how SQL Server or Oracle would perform some task. The logical model is more or less documentation to the meaning of the client’s data structure needs.
+
-
'''Implementation'''
+
Naming is a big deal, and when naming tables and columns, the names are fully spelled out with little, if any, abbreviations and no concessions to the max length of names in the RDBMS. Tables, relationships, columns, etc should be defined as to their purposes.
-
The logical model directs programmers to create an RDBMS specific model. In this model, you adapt the logical model for implementation in the host RDBMS without dropping expected meaning. Exact datatypes, constraints, triggers, and whatever else may be needed to implement the logical model. This model should be reasonably void of performance details, like indexes to tune a query. This will be the job of the physical model.
+
 +
There are constructs that exist in the logical model in a manner that cannot be implemented directly. For example, a many-to-many relationship between two tables may be modeled, but cannot be implemented in a relational database without adding a table.
 +
'''Physical'''
'''Physical'''
-
Once you have the implementation model complete, data is being stored correctly, you will have a set of objects sitting in an RDBMS with little or no data. The physical model is all about adapting the implementation to a combination of users and hardware. This model will start to form by developers as they tune their programs, but in production, the DBA can change the physical model and no client will be affected in any manner other than performance. In an ideal situation, even moving a database to a new server, os, platform, could be a completely transparent action as long as the RDBMS stays the same.
 
-
In the end, having three distinct models for the “meaning”, the “implementation”, and the "physical” structures make the reality of the data architect, the programmers, and the job of the DBA much easier.
+
There are two sections of the physical model, the RDBMS specific model and the on-disk/performance tuning aspects. In the RDBMS section of the model, you adapt the logical model for implementation in the host RDBMS without dropping expected meaning. Exact types, constraints, triggers, and anything needed to implement the logical model. This model will contain everything that the programmer needs to build a database.
 +
 
 +
The second part of the model is about adapting the implementation to a combination of users and hardware. In production, the DBA can change the on-disk structures, and even moving the database to a new server, os, platform, should be a completely transparent action as long as the RDBMS stays the same.
 +
 
 +
By maintaining a logical, idealized model of what the system should represent, separate from the implementation details, the entire process of developing, testing, and maintaining databases throughout the entire lifecycle can be greatly improved.

Revision as of 06:17, 12 January 2010

A data model is a representation of data in a structured format. Everything about the data should be covered, from documentation to structure. A common misnomer is that the model is only a graphic, though this is not the case. A model usually has an associated picture, which is there for the picture is worth a thousand words effect.

When creating a data model, there are usually two types of models created, a logical model and physical model. The logical model usually comes first and represents the implementation nonspecific data requirements, while the physical model covers the implementation specific details used to implement the database.

The term “physical”’s meaning in Codd’s Twelve Rules for a Relational Database Management System in Rule 8: Physical Data Independence. It basically states that the RDBMS can store the data in any way without changing the meaning or programmatic access. Physical has evolved to encompass the SQL objects that are used to implement the system as well.

Logical

The initial implementation nonspecific requirements are commonly documented here. The logical model is a highly structured document of what the database is attempting to do for the client. The goal is to get the requirements from the user/artifacts, and present a near perfection representation of the necessary data to solve the client’s needs.

Ideally the design will be void of details such as how SQL Server or Oracle would perform some task. Even things that cannot (or simply will not) be implemented in the data tier should be represented in the logical model.

Naming is a big deal, and when naming tables and columns, the names are fully spelled out with little, if any, abbreviations and no concessions to the max length of names in the RDBMS. Tables, relationships, columns, etc should be defined as to their purposes.

There are constructs that exist in the logical model in a manner that cannot be implemented directly. For example, a many-to-many relationship between two tables may be modeled, but cannot be implemented in a relational database without adding a table.

Physical

There are two sections of the physical model, the RDBMS specific model and the on-disk/performance tuning aspects. In the RDBMS section of the model, you adapt the logical model for implementation in the host RDBMS without dropping expected meaning. Exact types, constraints, triggers, and anything needed to implement the logical model. This model will contain everything that the programmer needs to build a database.

The second part of the model is about adapting the implementation to a combination of users and hardware. In production, the DBA can change the on-disk structures, and even moving the database to a new server, os, platform, should be a completely transparent action as long as the RDBMS stays the same.

By maintaining a logical, idealized model of what the system should represent, separate from the implementation details, the entire process of developing, testing, and maintaining databases throughout the entire lifecycle can be greatly improved.

Personal tools