Opinions, Expertise, Passion.

Information in black and white, and sometimes some color.

Sep
17

Logical Data Modelling (LDM)

Link to this post

Following my User Group Presentation I was asked by fellow MySQLer Kim about Logical Data Modelling (LDM), in relation to Physical Data Modelling.

Well, first the brain had to work overtime to remember when was the last time I worked on a Logical Data Model. The answer to that is 1996 doing R&D work for Oracle Corporation with their CASE repository tool, Oracle Designer, about version 1.3/1.3.2. I’ve learnt in the past 10 years to purge technical stuff from my brain, leading from the capacity in be able to remember in detail data models, data migration and data cleansing issues of projects even after leaving them 3 years eariler.

As Kim pointed out, he thinks physically, actually directly at the SQL level, then working backwards to produce an appropiate physical model. To think logically is to consider the entities and attributes and relations before considering the physical tables, columns and relationships. So how do you program somebody to think logically? In the case with Kim, he is undertaking formal studies after already grasping the concepts of software development. Generally today more people don’t undertake the formal education and we end up with The Hobbist and the Professional Syndrome.

I guess in summary I’d argue why bother. Does anybody still do traditional logical models? Feedback/comments are welcome. (Professionally I believe there is a place for LDM).
However for the purpose of the exercise lets start with a Physical Representation and present a Logical Representation of that to see the differences.



From this you can see a classic Library example, showing a table of Books and a Table of Authors, and an intersection table (3NF) to indicate that a Book can have one to many Authors, and likewise an Author can have one to many Books.

So how would this have been represented in a Logical Model.



Some things to consider in Logical Modelling.

  • Attributes really only require a name, and perhaps just a datatype using number, string, date (Jim Starkey would be happy), but nothing specific like SMALLINT, or VARCHAR(50).
  • An Attributes Mandatory/Optional state may or may not be known
  • Information is not in 3NF. i.e. relationships can be many to many (in our example, there is no concept of the intersection table

I’m going from memory here, so there is probably more points to consider.

So how do you teach this when you are trying to work backwards, when I learnt this back in 1988 (some 18 years ago), I’d never created a database or used SQL so I didn’t have the history. Arjen had a brilliant idea to consider Logical Modelling after the fact. Views. Consider an end user requirement for Reporting, and how you would represent your model better to an end user (effectively de-normalising these views so users don’t have to know about joins).

In this case you would create a view of Books and a View of Authors. Details such as mandatory/optional isn’t important to the end user report (ie, it’s not like it’s needed to be enforced), and specific datatype details again are not that important. The basics to know how to format a number or a date works.

An interesting approach that worked well in our explaining.

The Text Book

So it was interesting to go back to the text book using C.J. Date’s An Introduction to Database Systems to review definitions.

Logically isn’t really referenced, the term used is conceptual. A conceptual definition is “The right way to do database design is to get the logical design right first, without paying any attention whatsoever at that stage to physical - that is, performance - considerations”

Also for example Third Normal Form we get from Section 12.3 First, Second and Third Normal Forms.

Third normal form (very informal definition): A relvar is in 3NF if and only if the monkey attributes (if any) are both:

  • a. Mutually independent
  • b. Irreducibly dependent on the primary key

We explain the terms nonkey attribute and mutually independend (loosely) as follows:

  • A nonkey attribute is any attribute that does not participate in the primary key of the relvar concerned.
  • Two or more attributes are mutually independent if none of them is functionally dependent on any combination of the others. Such independence implies that each attributes can be updated independently of the rest.

Man, no wonder many years of experience and having generally seen most cases, enables me to forget this and not feel like I’ve fogotten something.

Posted under Databases, General, MySQL on 17 Sep 2006

5 Comments »

  1. Hi!

    Great stuff! Care to add this content to the MySQL Forge for Getting Started in Database Design:

    http://forge.mysql.com/wiki/GettingStartedRelationalDatabaseConcepts

    Thanks!

    Jay

    Comment by Jay Pipes — September 19, 2006 @ 4:44 am

  2. Hi Ronald,

    Great to read see you bring this stuff to the attention.

    “I guess in summary I’d argue why bother. Does anybody still too traditional logical models?”

    Well, I do. I mean, in my mind at least. I almost never get to the stage of drawing an actually full blown diagram, but when I’m analyzing a problem, I think about the problem in these terms.

    To me, the essential part about logical modelling is that it is the phase where you make a (mental) inventory of what things the system is about. That inventory needs to do two things primarily:
    1) identify which classes of objects (entities) there are - at least principal types need to be established here.
    2) identify which kinds of relationships tie the entities together.

    It might include an account of which attributes there are, but at this point, you need not be complete. There are lot’s of things you don’t know -or care- about in this stage. Keys are another thing I don’t care about in this stage, although you will probably have some kind of idea of the properties that could be used to identify instances of the entity.

    Now one of the things I remember from the tool you describe, Oracle Designer, is that it was/is capable of transforming a logical datamodel into a physical one. That is IMO where things could go terribly wrong.

    It is sort of possible to do transformation from a logical model to a physical model, and most people will intuitively feel how this is implemented. Entities become tables, attributes become columns (and are assigned a datatype), the putative notion of identifying properties are transformed into (primary) keys, one to many relationships are translated to foreign keys and corresponding constraints and finally many to many relationships are used to generate intersection tables.

    Although such a generated physical model might give you a kind of headstart, you should probably not use it as is for the actual implementation. Entities might not map to tables: you might want to split up a table (vertical partitioning) for performance reasons, or because your analysis of the problem has changed while you’re developing (you might discover you need subtypes for example). Attributes don’t need to map one-to-one to columns: you might think of an entity with a timeperiod attribute, without already knowing the best way to implement it (date + int? from date to date? int and int?).

    Of course, a failure of such a translation does not devaluate logical modelling. IMO, you should use logical modelling primarily as a means to get a grasp on the problem, and also to communicate your information analysis to other people involved in the project. A physical model might take too much time to prepare in the early stages of development, or it might not serve the pupose of communication well enough. Of course, you migh even decide to let the choice for a particular rdbms be guided by the initial information analysis, in which case you simply can’t design the physical model at this stage. So, lot’s of reasons why to bother with logical models.

    So, just my thoughts on this interesting subject.

    Comment by Roland Bouman — September 19, 2006 @ 11:16 pm

  3. I make entity-relationship diagrams. I’m a big ol’ database geek. I find it’s much easier to start thinking of the real-world things I want to store, and their relationships to each other, and then work out how the db will do it. It keeps things much truer to reality.

    Comment by Sheeri — September 22, 2006 @ 11:52 am

  4. Some interesting info at Logical Data Modeling: A Key to Successful Enterprise Data Warehouse Implementations.

    Comment by Ronald — September 23, 2006 @ 11:24 am

  5. “A relvar is in 3NF if and only if the **monkey** attributes (if any) are both…”

    I’ve worked on a lot of systems in the last 25+ years and I must say a whole lot of them had MONKEY attributes. Also monkey project managers, come to that….

    Comment by Nigel — September 28, 2006 @ 5:31 pm

RSS feed for comments on this post.

Leave a comment

Home
Professional Blog RSS Feed of Professional Blog
Consulting
Presentations
About Ronald
Related Links
Contact Ronald
  • « Aug spinner iCalendar Oct »
    September 2006
    M T W T F S S
     123
    45678910
    11121314151617
    18192021222324
    252627282930 
  • Categories:
    • Professional
      • 42SQL
      • Apple
        • iPhone
        • MacBook
        • OS/X
      • Clever Design
      • Cloud Computing
        • 10gen
        • AppNexus
        • Kaavo
        • Kloudshare
      • Databases
        • Drizzle
        • Ingres
        • MySQL
          • Compiling
          • GUI Products
          • MySQL Events
            • mysqlcamp01
            • mysqlcamp02
          • MySQL Proxy
          • MySQL User Conferences
            • mysqluc06
            • mysqluc07
            • mysqluc08
          • Storage Engines
            • Non Transactional
              • Infobright
              • KickFire
              • Maria
              • Nitro
            • Transactional
              • Blob Streaming
              • Falcon
              • InnoDB
              • PBXT
              • Solid
        • Oracle
      • Extreme Programming (XP)
      • General
      • Java
        • Tomcat
      • Linux
        • One Liners
      • Microsoft
      • Open Source
        • Buildbot
        • Ubuntu
        • UltimateLAMP
        • Virtual Box
      • OSCON 2008
      • Packet General
      • PrimeBase Technologies
      • Solid State Drives
      • Sun
      • The Daily WTF
      • Web 2.0 NY
      • Windoze
      • Yahoo
    • Web
      • Google
        • App Engine
        • Summer of Code
      • SEO
        • Brand Identity
      • Web Development
        • Amazon
          • EC2
          • S3
          • SimpleDB
        • CSS
        • HTML
        • PHP
        • Web 2.0
      • Web Sites
        • Application Software
        • Content
        • Cool Tools
        • Linux Stuff
        • MySQL Related
        • Show Your Stuff
        • Twitter
        • Unype
      • WordPress
  • Pages:
    • Best Of PlanetMySQL Articles
    • Interesting Articles
    • MediaWiki Restyling (1)

  • Archives:
    • November 2008
    • October 2008
    • September 2008
    • August 2008
    • July 2008
    • June 2008
    • May 2008
    • April 2008
    • March 2008
    • February 2008
    • January 2008
    • December 2007
    • November 2007
    • October 2007
    • September 2007
    • August 2007
    • July 2007
    • June 2007
    • May 2007
    • April 2007
    • March 2007
    • February 2007
    • January 2007
    • December 2006
    • November 2006
    • October 2006
    • September 2006
    • August 2006
    • July 2006
    • June 2006
    • May 2006
    • April 2006
    • March 2006
    • February 2006
    • January 2006
    • December 2005
    • November 2005
    • October 2005
    • September 2005
    • July 2005
    • June 2005
    • February 2005
    • October 2004
    • September 2004
    • July 2004
    • June 2004