Update: the Second Post of this series is now available.
Update: the Third Post of this series is now available.
The Wikipedia entry on "Business Logic" has a wonderfully honest opening sentence stating that "Business logic, or domain logic, is a non-technical term... (emphasis mine)". If this is true, that the term is non-technical, or if you like, non-rigorous, then most of us spend the better part of our efforts working on something that does not even have a definition. Kind of scary.
Is it possible to come up with a decent working definition of business logic? It is certainly worth a try. This post is the first in a four part series. The second post is about a more rigorous definition of Business Logic.
This blog has two tables of contents, the Complete Table of Contents and the list of Database Skills.
The Method
In this essay we will pursue a method of finding operations that we can define as business logic with a minimum of controversey, and identify those that can likely be excluded with a minimum of controversey. This may leave a bit of gray area that can be taken up in a later post.
An Easy Exclusion: Presentation Requirements
If we define Presentation Requirements as all requirements about "how it looks" as opposed to "what it is", then we can rule these out. But if we want to be rigorous we have to be clear, Presentation Requirements has to mean things like branding, skinning, accessibility, any and all formatting, and anything else that is about the appearance and not about the actual values fetched from somewhere.
Tables as the Foundation Layer
Database veterans are likely to agree that your table schema constitutes the foundation layer of all business rules. The schema, being the tables, columns, and keys, determines what must be provided and what must be excluded. If these are not business logic, I guess I don't know what is.
What about CouchDB and MongoDB and others that do not require a predefined schema? These systems give up the advantages of a fixed schema for scalability and simplicity. I would argue here that the schema has not disappeared, it has simply moved into the code that writes documents to the database. Unless the programmer wants a nightmare of chaos and confusion, he will enforce some document structure in the code, and so I still think it safe to say that even for these databases there is a schema somewhere that governs what must be stored and what must be excluded.
So we have at least a foundation for a rigorous definition of business rules: the schema, be it enforced by the database itself or by the code, forms the bottom layer of the business logic.
Processes are Business Logic
The next easy addition to our definition of business logic would be processes, where a process can be defined loosely as anything involving multiple statements, can run without user interaction, may depend on parameters tables, and may take longer than a user is willing to wait, requiring background processing.
I am sure we can all agree this is business logic, but as long as we are trying to be rigorous, we might say it is business logic because:
- It must be coded
- The algorithm(s) must be inferred from the requirements
- It is entirely independent of Presentation Requirements
Calculations are Business Logic
We also should be able to agree that calculated values like an order total, and the total after tax and freight, are business logic. These are things we must code for to take user-supplied values and complete some picture.
The reasons are the same as for processes, they must be coded, the formulas must often be inferred from requirements (or forced out of The Explainer at gunpoint), and the formulas are completely independent of Presentation Requirements.
The Score So Far
So far we have excluded "mere" Presentation Requirements, and included three entries I hope so far are non-controversial:
- Schema
- Processes
- Calculations
These are three things that some programmer must design and code. The schema, either in a conventional relational database or in application code. Processes, which definitely must be coded, and calculations, which also have to be coded.
What Have We Left Out?
Plenty. At very least security and notifications. But let's put those off for another day and see how we might handle what we have so far.
For the Schema, I have already mentioned that you can either put it into a Relational database or manage it in application code when using a "NoSQL" database. More than that will have to wait for 2011, when I am hoping to run a series detailing different ways to implement schemas. I'm kind of excited to play around with CouchDB or MongoDB.
For processes, I have a separate post that examines the implications of the stored procedure route, the embedded SQL route, and the ORM route.
This leaves calculations. Let us now see how we might handle calculations.
Mixing CRUD and Processes
But before we get to CRUD, I should state that if your CRUD code involves processes, seek professional help immediately. Mixing processes into CRUD is an extremely common design error, and it can be devastating. It can be recognized when somebody says, "Yes, but when the salesman closes the sale we have to pick this up and move it over there, and then we have to...."
Alas, this post is running long already and so I cannot go into exactly how to solve these, but the solution will always be one of these:
- Spawning a background job to run the process asynchronously. Easy because you don't have to recode much, but highly suspicous.
- Examining why it seems necessary to do so much work on what ought to be a single INSERT into a sales table, with perhaps a few extra rows with some details. Much the better solution, but often very hard to see without a second pair of eyes to help you out.
So now we can move on to pure CRUD operations.
Let The Arguments Begin: Outbound CRUD
Outbound CRUD is any application code that grabs data from the database and passes it up to the Presentation layer.
A fully normalized database will, in appropriate cases, require business logic of the calculations variety, otherwise the display is not complete and meaningful to the user. There is really no getting around it in those cases.
However, a database Denormalized With Calculated Values requires no business logic for outbound CRUD, it only has to pick up what is asked for and pass it up. This is the route I prefer myself.
Deciding whether or not to include denormalized calculated values has heavy implications for the architecture of your system, but before we see why, we have to look at inbound CRUD.
Inbound CRUD
Inbound CRUD, in terms of business logic, is the mirror image of outbound. If your database is fully normalized, inbound CRUD should be free of business logic, since it is simply taking requests and pushing them to the database. However, if you are denormalizing by adding derived values, then it has to be done on the way in, so inbound CRUD code must contain business logic code of the calculations variety.
Now let us examine how the normalization question affects system architecture and application code.
Broken Symmetry
As stated above, denormalizing by including derived values forces calculated business logic on the inbound path, but frees your outbound path to be the "fast lane". The opposite decision, not storing calculated values, allows the inbound path to be the "fast lane" and forces the calculations into the outbound path.
The important conclusion is: if you have business logic of the calculation variety in both lanes then you may have some inconsistent practices, and there may be some gain involved in sorting those out.
But the two paths are not perfectly symmetric. Even a fully normalized database will often, sooner or later, commit those calculated values to columns. This usually happens when some definition of finality is met. Therefore, since the inbound path is more likely to contain calculations in any case, the two options are not really balanced. This is one reason why I prefer to store the calculated values and get them right on the way in.
One Final Option
When people ask me if I prefer to put business logic in the server, it is hard to answer without a lot of information about context. But when calculations are involved the answer is yes.
The reason is that calculations are incredibly easy to fit into patterns. The patterns themselves (almost) all follow foreign keys, since the foreign key is the only way to correctly relate data between tables. So you have the "FETCH" pattern, where a price is fetched from the items table to the cart, the "EXTEND" pattern, where qty * price = extended_Price, and various "AGGREGATE" patterns, where totals are summed up to the invoice. There are others, but it is surprising how many calculations fall into these patterns.
Because these patterns are so easy to identify, it is actually conceivable to code triggers by hand to do them, but being an incurable toolmaker, I prefer to have a code generator put them together out of a data dictionary. More on that around the first of the year.
Updates
Update 1: I realize I never made it quite clear that this is part 1, as the discussion so far seems reasonable but is hardly rigorous (yet). Part 2 will be on the way after I've fattened up for the holidays.
Update 2: It is well worth following the link Mr. Koppelaars has put in the comments: http://thehelsinkideclaration.blogspot.com/2009/03/window-on-data-applications.html