Following my User Group Presentation I was asked by fellow MySQLer Kim about Logical Data Modelling (LDM), in relation to Physical Data Modelling.
Well, first the brain had to work overtime to remember when was the last time I worked on a Logical Data Model. The answer to that is 1996 doing R&D work for Oracle Corporation with their CASE repository tool, Oracle Designer, about version 1.3/1.3.2. I’ve learnt in the past 10 years to purge technical stuff from my brain, leading from the capacity in be able to remember in detail data models, data migration and data cleansing issues of projects even after leaving them 3 years eariler.
As Kim pointed out, he thinks physically, actually directly at the SQL level, then working backwards to produce an appropiate physical model. To think logically is to consider the entities and attributes and relations before considering the physical tables, columns and relationships. So how do you program somebody to think logically? In the case with Kim, he is undertaking formal studies after already grasping the concepts of software development. Generally today more people don’t undertake the formal education and we end up with The Hobbist and the Professional Syndrome.
I guess in summary I’d argue why bother. Does anybody still do traditional logical models? Feedback/comments are welcome. (Professionally I believe there is a place for LDM).
However for the purpose of the exercise lets start with a Physical Representation and present a Logical Representation of that to see the differences.
From this you can see a classic Library example, showing a table of Books and a Table of Authors, and an intersection table (3NF) to indicate that a Book can have one to many Authors, and likewise an Author can have one to many Books.
So how would this have been represented in a Logical Model.
Some things to consider in Logical Modelling.
- Attributes really only require a name, and perhaps just a datatype using number, string, date (Jim Starkey would be happy), but nothing specific like SMALLINT, or VARCHAR(50).
- An Attributes Mandatory/Optional state may or may not be known
- Information is not in 3NF. i.e. relationships can be many to many (in our example, there is no concept of the intersection table
I’m going from memory here, so there is probably more points to consider.
So how do you teach this when you are trying to work backwards, when I learnt this back in 1988 (some 18 years ago), I’d never created a database or used SQL so I didn’t have the history. Arjen had a brilliant idea to consider Logical Modelling after the fact. Views. Consider an end user requirement for Reporting, and how you would represent your model better to an end user (effectively de-normalising these views so users don’t have to know about joins).
In this case you would create a view of Books and a View of Authors. Details such as mandatory/optional isn’t important to the end user report (ie, it’s not like it’s needed to be enforced), and specific datatype details again are not that important. The basics to know how to format a number or a date works.
An interesting approach that worked well in our explaining.
The Text Book
So it was interesting to go back to the text book using C.J. Date’s An Introduction to Database Systems to review definitions.
Logically isn’t really referenced, the term used is conceptual. A conceptual definition is “The right way to do database design is to get the logical design right first, without paying any attention whatsoever at that stage to physical – that is, performance – considerations”
Also for example Third Normal Form we get from Section 12.3 First, Second and Third Normal Forms.
Third normal form (very informal definition): A relvar is in 3NF if and only if the monkey attributes (if any) are both:
- a. Mutually independent
- b. Irreducibly dependent on the primary key
We explain the terms nonkey attribute and mutually independend (loosely) as follows:
- A nonkey attribute is any attribute that does not participate in the primary key of the relvar concerned.
- Two or more attributes are mutually independent if none of them is functionally dependent on any combination of the others. Such independence implies that each attributes can be updated independently of the rest.
Man, no wonder many years of experience and having generally seen most cases, enables me to forget this and not feel like I’ve fogotten something.