Welcome to the Database Programmer! Every Monday morning this blog contains a new essay. This blog is for people who want to learn the practical realities of databases. Topics range from simple to advanced.
The main theme of these essays is that your applications will be leaner, faster, and easier to write and maintain if you understand how databases work.
The Resolution Pattern
A resolution is an interesting database pattern because it involves both table design and a very specific form of SELECT statement. Just getting the tables down is not enough, you have to know how to write the SELECT that will return the correct value.
Basic Description and Example
A resolution pattern occurs when you need a value and there is more than one place where it might be. As an example, consider the case of a computer services shop that provides complete IT services, including programming. In their billing system, they have a simple table that lists the rates for their various activities.
ACTIVITY | RATE ----------+------- ITGENERAL | 100 PROJMGT | 200 SOFTWARE | 150
This is simple enough, but now suppose that you have a particular employee that you bill out at $175.00/hour for software development. This makes the picture a little more complicated. But suppose that it gets more complicated, suppose that you enter into an arrangement with a particular customer to do volume software development for them for $135.00. And just to make it interesting, suppose you have a very specific arrangement with a particular customer to provide the services of a particular employee for $185.00 for project management.
With this many possible billing arrangements, your super-simple invoicing program is suddenly not so simple. On any particular invoice line, you must resolve the actual hourly billing rate out of several possibilities. Because you must resolve the value, this pattern is called a resolution.
Precise Description of the Resolution Pattern
A resolution pattern has these characteristics:
- The goal of a resolution is to find a particular value. In our example this is a billing rate.
- Resolutions examine multiple possible values and pick the first match according to precedence.
- Precedence usually begins with the most specific and falls back to the most general. In our example the most specific possible rate is defined for a customer-activity-employee, while the most general is the default rate for an activity.
Resolutions are not always easy to recognize. Mostly this is because customers do not tell you "we have a resolution." Instead they tell you they have a billing rate. The explanation of the special overrides for employees comes in a different conversation, and perhaps to a different member of your team. Then later comes the explanation of the other overrides. The resolution only becomes apparent when the various requirements are all sorted out and put next to each other. Then somebody says, "Hey, there are four different formulas for the billing rate!" Then you know you have a resolution.
The Table Design
A resolution requires one table for each possible level of detail where a value might be supplied. In our example there will be a table for:
- Rates by activity-customer-employee
- Rates by activity-customer
- Rates by activity-employee
- Final default values by activity
These table only contain values when they are relevant. The table of activity-customer does not contain a row for every possible combination of activities and customers, it only contains a row when there has been some agreement to provide an activity to a specific customer for a special rate.
Here are the tables:
ACTIVITY | CUSTOMER | RATE ---------+----------+------ PROJMGT | PRAXIS | 225 SOFTWARE | PRAXIS | 235 ACTIVITY | EMPLOYEE | RATE ---------+----------+------ PROJMGT | SRUSSEL | 225 ACTIVITY | EMPLOYEE | CUSTOMER | RATE ---------+----------+----------+------ PROJMGT | HIROKO | PRAXIS | 250
Resolving In Client Code Will Kill Performance
Now consider that the there is a table somewhere that is used to drive billing. Maybe the employees themselves record their time in this table, or maybe some clerical staff member is entering them. Whoever puts them in, each record has an activity, an employee, and a customer (and of course hours). You need to write a program that finds the correct billing rate for each row.
A die-hard code grinder will do all of this in the client. He will write a query to pull all of the rows from the time entry table. Then he will loop through these rows. For each line he will query the server for a the most detailed value, activity-employee-customer. If it is not found he will do a second query for the next table in line, and so forth. This will be a performance disaster because his program will be making a huge number of round trips to the server. If he understood the LEFT JOIN he would need only one trip to the server.
First Stab with A LEFT JOIN
Here is a query that does most of what we need for the resolution:
SELECT ol.activity,ol.employee,ol.customer ,aec.rate as aec_rate ,ac.rate as ac_rate ,ae.rate as ae_rate ,a.rate FROM orderlines ol LEFT JOIN act_emp_cust_rates aec ON ol.activity = aec.activity AND ol.customer = aec.customer AND ol.employee = aec.employee LEFT JOIN act_cust_rates ac ON ol.activity = ae.activity AND ol.customer = ae.customer LEFT JOIN act_emp_rates ae ON ol.activity = aec.activity AND ol.employee = aec.employee JOIN activities a ON ol.activity = a.activity WHERE (....relevant search conditions....)
The LEFT JOIN tells the server to return all matching rows from the orderlines table, even if there is no match in the various override tables. The above query will return something like this:
ACTIVITY | EMPLOYEE | CUSTOMER | AEC_RATE | AC_RATE | AE_RATE | RATE ---------+----------+----------+----------+---------+---------+------ PROJMGT | HIROKO | PRAXIS | 250 | null | null | 200 PROJMGT | NIRGAL | PRAXIS | null | 225 | null | 200 SOFTWARE | SRUSSEL | PRAXIS | null | 235 | null | 150 PROJMGT | SRUSSEL | GE | null | null | 225 | 200 PROJMGT | SRUSSEL | NASA | null | null | 225 | 200 SOFTWARE | HIROKO | PRAXIS | null | 235 | null | 150 SOFTWARE | HIROKO | GE | null | null | null | 150
The Final Form of the Query
The first form of the query returns all four possible rates, and the effect of a LEFT JOIN is to have a NULL value where there was no match on the right side.
We can do better than this and return the actual rate by using a COALESCE function. A COALESCE allows us to list two or more values, and the function returns the first one that is not null. This lets us return the actual resolved value from the server:
SELECT ol.activity,ol.employee,ol.customer ,COALESCE(aec.rate,ac.rate,ae.rate,a.rate) as rate FROM orderlines ol LEFT JOIN act_emp_cust_rates aec ON ol.activity = aec.activity AND ol.customer = aec.customer AND ol.employee = aec.employee LEFT JOIN act_cust_rates ac ON ol.activity = ae.activity AND ol.customer = ae.customer LEFT JOIN act_emp_rates ae ON ol.activity = aec.activity AND ol.employee = aec.employee JOIN activities a ON ol.activity = a.activity WHERE (....relevant search conditions....)
...which gives us the complete answer:
ACTIVITY | EMPLOYEE | CUSTOMER | RATE ---------+----------+----------+------ PROJMGT | HIROKO | PRAXIS | 250 PROJMGT | NIRGAL | PRAXIS | 225 SOFTWARE | SRUSSEL | PRAXIS | 235 PROJMGT | SRUSSEL | GE | 225 PROJMGT | SRUSSEL | NASA | 225 SOFTWARE | HIROKO | PRAXIS | 235 SOFTWARE | HIROKO | GE | 150
Conclusion: Tables and Queries Go Together
We have seen this week our first table design pattern that requires a certain form of query. This shows us as well that queries themselves will fall into patterns, and we will definitely see more of these patterns in future essays.