Friday, November 19, 2010

Prepare Now For Possible Future Head Transplant

This is the Database Programmer blog, for anybody who wants practical advice on database use.

There are links to other essays at the bottom of this post.

This blog has two tables of contents, the Topical Table of Contents and the list of Database Skills.

Planning For The Unlikely

We programmers love to plan for things that will hardly ever happen, like coding the system's upgrade engine to handle spontaneous human combustion, making sure the SQL scrubbing layer can also launch a rocket into space, and, well, trying to work out ahead of time what to do if we ever need a head transplant.

The boss comes over and says, "can you toss a simple plot onto the sales' staff home page that shows sales by day? Use that 'jquicky' or whatever you call it. Should take a couple of hours, right?" And three days later we're working on the world's greatest plotting system that can report everything except what the boss actually asked for because we haven't gotten around to that part of it yet. (Really, can he expect me to just bang this out without the required Seven Holy Layers of Abstraction and Five Ritual Forms of Parameterization, and the Just and Wholesome Mobile Support, or the features Not Yet Required but visible to the Far Seeing Wise and Learned Men?)

Abstraction Contraptions

So what I am getting at is that programmers of all stripes are addicted to abstraction, it gives us goosebumps and makes us feel warm and tingly, and so we do it even when we do not need to. We build abstraction contraptions.

When it comes to designing a database, this unhealthy proclivity can seriously slow you down, because of what I call:

Ken's Law

Everybody wants to be remembered for something. If I could write my own epitaph, it might be:

Table-based datastores are optimally abstract

This law is not about database access, it is about database design. It can be expressed informally as:

People Understand Tables Just Fine

Or more rigorously as:

Table-based datastores are optimally abstract; they require no additional abstraction when requirements are converted to desgin; they cannot be reduced to a less abstract form.

Structured Atomic Values

I should point out that this essay deals with structured atomic values, who live in the Kingdom of The Relational Database. The concepts discussed here do not apply to free-text documents or images, or sound files, or any other media.

No Additional Abstraction Required

My basic claim here is that you cannot create an abstraction of data schemas that will pay for itself. At best you will create a description of a database where everything has been given a different name, where tables have been designated 'jingdabs' and columns have been designated 'floopies' and in the end all of your jingdab floopies will become columns in tables. Oh, and I suppose I should mention the Kwengars will be foreign keys and the Slerzies will be primary keys.

After that it goes downhill, because if we generate an abstraction that is not a simple one-to-one mapping, we actually obscure the end goal. Consider an example so simple as to border on the trivial or absured. Why would we ever use the terms "One-to-Many" or "Many-to-Many" when the more precise terms "child table" and "cross-reference table" convey the same idea without the noise? I said above that this would sound trivial, and you can accuse me of nit-picking, but this is one of those camel's nose things, or perhaps a slippery slope. When technical folk get together to design a system, we should call things what they are, and not make up new words to make ourselves sound smarter.

No de-Abstraction is Possible

The second half of Ken's law says that you cannot de-Abstract a table schema into some simpler form. This should be very easy to understand, because relational databases deal with atomic values, that is, values which cannot themselves be decomposed. If you cannot decompose something, then it cannot be an abstraction of something more specific.

Going further, if the schema has been normalized, then every fact is stored in exactly one place, so no further simplification is possible. If you cannot simplify it or resolve it into something more specific, then it is not an abstraction of something else.

But Does it Work?

I originally began to suspect the existence of what I call in all humility "Ken's Law" when I was sitting at a large conference table with my boss, her boss, a couple of peers, and 3 or 4 reps from a Fortune 5 company. My job was basically to be C3PO, human-cyborg relations. Some people at the table protested loudly to being non-technical, while others had technical roles. But everybody at the table spent all day discussing table design.

Later on, when at a different position, the programmers received their instructions from Project Managers. The best Project Managers worked with their customers to figure out what they were trying to keep track of, and handed us specs that were basically table layouts. The customers loved these guys because they felt they could "understand what the project manager was talking about", and the project managers, who swore they were not technical, were respected because they handed us requirements we could actually understand and implement.

Since that time I have further learned that it is very easy for anybody who deals with non-technical people to bring them directly to table design without telling them you are doing it. All you have to do is listen to what words they use and adopt your conversation accordingly. If they say things like "I need a screen that shows me orders by customer types" they have told you there will be a table of customer types. Talk to them in terms of screens. If they say, "Our catalog has 3 different price list and four discount schemes" then you know that there will be a PRICELIST table, a DISCOUNTS table, and likely some cross-references and parent-child relationships going on here.

So How Does ORM Come Into This?

One of the greatest abstraction contraptions of this century (so far), is ORM, or Object-Relational Mapping, which I do not use precisely because it is an abstraction contraption.

To be clear, the mistake that ORM makes is not at the design phase, but at the access phase. The ORM meme complex instructs its victims that it is ok to put structured atomic values into a Relational Database, but when it comes time to access and use that data we will pretend we did not put it into tables and we will pretend that the data is in objects. In this sense the term Object- Relational Mapping is a complete misnomer, because the point is not to map data to objects but to create the illusion that the tables do not even exist.

Perhaps ORM should stand for Obscuring Reality Machine.

Getting Back to That Head Transplant

So what about that weird title involving head transplants? Obviously a head transplant is impossible, making it also very unlikely, besides being silly and non-sensical. It came to mind as a kind of aggregrate of all of the bizarre and unrealistic ideas about abstracting data designs that I have heard over the years.

One of these ideas is that it is possible and beneficial to create a design that is abstract so that it can be implemented in any model: relational, hierarchical, or network. I'm not saying such a thing is impossible, it is likely just a small matter of programming, but for heaven's sake, what's the point?

Conclusion

So don't waste time creating abstractions that add steps, possibly obscure the goal, and add no value. Don't plan for things that are not likely to happen, and avoid abstraction contraptions.

Related Essays

This blog has two tables of contents, the Topical Table of Contents and the list of Database Skills.

Other philosophy essays are:

2 comments:

dogus said...

hello, i am quiet confused about this, what exactly does this whole article state out?? can you please reply back to me.

KenDowns said...

dogus: um, I realized after posting it that it does not make its case very well. I'm going to rewrite it sometime in the next couple of weeks.

But to answer your question directly: people work very well with tabular data, it seems to be a natural for the human mind. Very often we reduce our own productivity by trying to discuss database systems while avoiding direct discussion of tables. I find you can easily discuss tables with non-technical people, and perform analysis and design at the table level.