Saturday, February 14, 2009

A Comprehensive Database Security Model

This week I am taking a bit of a departure. Normally I write about things I have already done, but this week I want to speculate a bit on a security model I am thinking of coding up. Basically I have been asking myself how to create a security model for database apps that never requires elevated privileges for code, but still allows for hosts sharing multiple applications, full table security including row level and column level security, and structural immunity to SQL injection.

The Functional Requirements

Let's consider a developer who will be hosting multiple database applications on a server, sometimes instances of the same application for different customers. The applications themselves will have different needs, but they all boil down to this:

  • Some applications will allow surfers to join the site and create accounts for themselves, while others will be private sites where an administrator must make user accounts.
  • Some applications will not contain sensitive data, and so the site owner wants to send forgotten passwords in email -- which means the passwords must be stored in plaintext. Other site owners will need heightened security that disallows storing of passwords in plaintext.
  • In both cases, administrators must of course be able to manage accounts themselves.
  • The system should be structurally immune to SQL injection.
  • It must be possible to have users with the same user id ("Sheilia", "John", etc.) on multiple applications who are actually totally different people.
  • The application code must never need to run at an elevated privelege level for any reason -- not even to create accounts on public sites where users can join up and conduct transactions.
  • It must be possible for the site owners or their agents to directly connect to the database at very least for querying and possibly to do database writes without going through our application.
  • Users with accounts on one app must never be able to sign on to another app on the same server.

These requirements represent the most flexible possible combination of demands that I have so far seen in real life. The question is, can they be met while still providing security? The model I'd like to speculate on today says yes.

Informed Paranoia Versus Frightened Ignorance

Even the most naive programmer knows that the internet is not a safe place, but all too often a lot of security advice you find is based on frightened ignorance and takes the form, "never do x, you don't know what might happen." If we are to create a strong security model, we have to do better than this.

Much better is to strive to be like a strong system architect, whose approach is based on informed paranoia. This hypothetical architect knows everybody is out to compromise his system, but he seeks a thorough knowledge of the inner workings of his tools so that he can engineer the vulnerabilities out as much as possible. He is not looking to write rules for the programmer that say "never do this", he is rather looking to make it impossible for the user or programmer to compromise the system.

Two Examples

Let us consider a server hosting two applications, which are called "social" and "finance".

The "social" application is a social networking site with minimal security needs. Most important is that the site owners want members of the general public to sign up, and they want to be able to email forgotten passwords (and we can't talk them out of it) -- so we have to store passwords in plaintext.

The "finance" application is a private site used by employees of a corporation around the world. The general public is absolutely not welcome. To make matters worse however, the corporation's IT department demands to be able to directly connect to the database and write to the database without going through the web app. This means the server will have an open port to the database. Sure it will be protected with SSL and passwords, but we must make sure that only users of "finance" can connect, and only to their own application.

Dispensing With Single Sign-On

There are two ways to handle connections to a database. One model is to give users real database accounts, the other is to use a single account to sign on to the database. Prior to the web coming along, there were proponents of both models in the client/server world, but amongst web developers the single sign-on method is so prevalent that I often wonder if they know there is any other way to do it.

Nevertheless, we must dispense with the single sign-on method at the start, regardless of how many people think that Moses carved it on the third tablet, because it just has too many problems:

  • Single Sign-on is the primary architectural flaw that makes SQL injection possible. As we will see later, using real database accounts makes your site (almost) completely immune to SQL injection.
  • Single Sign-on requires a connection at the maximum privilege level that any system user might have, where the code then decides what it will let a particular user do. This is a complete violation of the requirement that code always run at the lowest possible privilege level.
  • Single Sign-on totally prevents the requirement that authorized agents be allowed to connect to the database and directly read and write values.

So single sign-on just won't work with the requirements listed. This leads us to creating real accounts on the database server.

Real Accounts and Basic Security

When you use a real database account, your code connects to the database using the username and password provided by the user. Anything he is allowed to do your code will be allowed to do, and anything he is not allowed to do will throw and error if your code tries to do it.

This approach meets quite a few of our requirements nicely. A site owner's IT department can connect with the same accounts they use on the web interface -- they have the same privileges in both cases. Also, there is no need to ever have application code elevate its privilege level during normal operations, since no regular users should ever be doing that. This still leaves the issue of how to create accounts, but we will see that below.

A programmer who thinks of security in terms of what code can run will have a very hard time wrapping his head around using real database accounts for public users. The trick to understanding this approach is to forget about code for a minute and to think about tables. The basic fact of database application security is that all security resolves to table permissions. In other words, our security model is all about who can read or write to what tables, it is not about who can run which program.

If we grant public users real database accounts, and they connect with those accounts, the security must be handled within the database itself, and it comes down to:

  • Defining "groups" as collections of users who share permissions at the table level.
  • Deciding which groups are allowed select, insert, update, and delete privileges on which tables.
  • Granting and revoking those privileges on the server itself when the database is built.
  • At very least row-level security will be required, wherein a user can only see and manipulate certain rows in a table. This is how you keep users from using SQL Injection to mess with each other's order history or member profiles.
  • Column security is also very nice to finish off the picture, but we will not be talking about that today as it does not play into the requirements.

Now we can spend a moment and see why this approach eliminates most SQL Injection vulnerabilities. We will imagine a table of important information called SUPERSECRETS. If somebody could slip in a SQL injection exploit and wipe out this table we'd all go to jail, so we absolutely cannot allow this. Naturally, most users would have no privileges on this table -- even though they are directly connected to the database they cannot even see the table exists, let alone delete from it. So if our hypothetical black hat somehow slips in ";delete from supersecrets" and our code fails to trap for it, nothing happens. They have no privlege on that table. On the other side of things, consider the user who is privileged to delete from that table. If this user slips in a ";delete from supersecrets" he is only going to the trouble with SQL Injection to do something he is perfectly welcome to do anyway through the user interface. So much for SQL injection.

To repeat a point made above: row-level security is a must. If you grant members of a social site global UPDATE privileges on the PROFILES table, and you fail to prevent a SQL Injection, all hell could break loose. Much better is the ability to limit the user to seeing only his own row in the PROFILE table, so that once again you have created a structural immunity to SQL injection.

Anonymous Access

Many public sites allow users to see all kinds of information when they are not logged on. The most obvious example would be an eCommerce site that needs read access to the ITEMS table, among others. Some type of anonymous access must be allowed by our hypothetical framework.

For our two examples, the "social" site might allow limited viewing of member profiles, while the "finance" application must show absolutely nothing to the general public.

If we want a general solution that fits both cases, we opt for a deny-by-default model and allow each application to optionally have an anonymous account.

First we consider deny-by-default. This means simply that our databases are always built so that no group has any permissions on any tables. The programmer of the "social" site now has to grant certain permissions to the anonymous account, while the programmer of the "finance" application does nothing - he already has a secure system.

But still the "finance" site is not quite so simple. An anonymous user account with no privileges can still log in, and that should make any informed paranoid architect nervous. We should extend the deny-by-default philosophy so the framework will not create an anonymous account unless requested. This way the programmer of the "finance" application still basically does nothing, while the programmer of the "social" must flip a flag to create the anonymous account.

Virtualizing Users

If we are having real database accounts, there is one small detail that has to be addressed. If the "social" site has a user "johnsmith" and the finance application has a user of the same name, but they are totally different people, we have to let both accounts exist but be totally separate.

The answer here is to alias the accounts. The database server would actually have accounts "finance_johnsmith" and "social_johnsmith". Our login process would simply take the username provided and append the code in front of it when authenticating on the server. 'nuf said on that.

Allowing Public Users To Join

The "social" site allows anybody to join up and create an account. This means that somehow the web application must be able to create accounts on the database server. Yet it must do this without allowing the web code to elevate its privileges, and while preventing the disaster that would ensue if a user on the "social" site somehow got himself an account on the "finance" site.

Believe it or not, this is the easy part! Here is how it works for the "social" site:

  • Create a table of users. The primary key is the user_id which prevents duplication.
  • For the social site, there is a column called PASSWORD that stores the password in plaintext.
  • Allow the anonymous account to INSERT into this table! (Remember though that deny-by-default means that so far this account has no other privileges).
  • Put an INSERT trigger on the table that automatically creates an aliased user account, so that "johnsmith" becomes "social_johnsmith". The trigger also sets the password.
  • A DELETE trigger on the table would delete users if the row is deleted.
  • An UPDATE trigger on the table would update the password if the user UPDATES the table.
  • Row level security is an absolute must. Users must be able to SELECT and UPDATE table, but only their own row. If your database server or framework cannot support row-level security, it's all out the window.

This gives us a system that almost gets us where we need to be: the general public can create acounts, the web application does not need to elevate its privileges, users can set and change their passwords, and no user can see or set anything for any other user. However, this leaves the issue of password recovery.

In order to recover passwords and email them to members of the "social" site, it is tempting to think that the anonymous account must be able to somehow read the users table, but that is no good because then we have a structural flaw where a successful SQL injection would expose user accounts. However, this also turns out to be easy. There are two options:

  • Write a stored procedure that the anonymous user is free to execute, which does not return a password but actually emails it directly from within the database server. This requires your database server be able to send emails. (Postgres can, and I assume SQL Server can, and I don't really know about mySql).
  • Create a table for password requests, allow inserts to it but nothing else. A trigger sends the email. In this approach you can track email recovery requests.

For the "finance" application we cannot allow any of this to happen, so again we go to the deny-by-default idea. All of the behaviors above will not happen unless the programmer sets a flag to turn them on when the database is built.

This does leave the detail of how users of the "finance" application will reset their passwords. For details on how a secure app can still allow password resets, see my posting of Sept 7 2008 Secure Password Resets.

One More Detail on Public Users

We still have one more detail to handle for public users. Presumably a user, having joined up, has more privileges than the anonymous account. So the web application must be able to join them into a group without elevating its privileges. The solution here is the same as for creating the account: there will be a table that the anonymous user can make inserts into (but nothing else), and a trigger will join the user to whatever group is named.

Except for one more detail. We cannot let the user join whatever group they want, only the special group for members. This requirement can be met by defining the idea of a "freejoin" group and also a "solo" group. If the anonymous user inserts into a user-group table, and the requested group is flagged as allowing anybody to join, the trigger will allow it, but for any other group the trigger will reject the insert. The "solo" idea is similar, it means that if a user is in the "members" group, and that group is a "solo" group, they may not join any other groups. This further jails in members of the general public.

Almost Done: User Administration

In the last two sections we saw the idea of a table of users and a cross-reference of users to groups. This turns out to solve another issue we will have: letting administrators manage groups. If we define a group called "user_administrators" and give them total power on these tables, and also give them CRUD screens for them, then we have a user administrator system. This works for both the "social" and the "finance" application.

The triggers on the table have to be slightly different for the two cases, but that is a small exercise to code them up accordingly.

Cross-Database Access

Believe it or not, the system outlined above has met all of our requirements except one. So far we have a system that never requires the web server to have any elevated priveleges within the database, allows members of the public to join some sites while barring them from others, is structurally immune from SQL injection, allows different people on different sites to have the same user id, and allows administrators of both sites to directly manage accounts. Moreover, we can handle both plaintext passwords and more serious reset-only situations.

This leaves only one very thorny issue: cross-database access. The specific database server I use most is PostgreSQL, and this server has a problem (for this scenario) anyway, which is that out-of-the-box, a database account can connect to any database. This does not mean the account has any priveleges on the database, but we very seriously do not want this to happen at all. If a member of the "social" site can connect to the "finance" app, we have a potential vulnerability even if he has zero privileges in that database. We would be much happier if he could not connect at all.

In Postgres there is a solution to this, but I've grown to not like it. In Postgres you can specify that a user can only connect to a database if they are in a group that has the same name as the database. This is easy to set up, but it requires changing the default configuration of Postgres. However, for the sheer challenge of it I'd like to work out how to do it without requiring that change. So far I'm still puzzling this out. I'd also like to know that the approach would work at very least on MS SQL Server and mySql.

Conclusion

Most of what is in this week's essay is not that radical to any informed database veteran. But to web programmers who were unfortunate enough to grow up in the world of relational-databases-must-die nonsense, it is probably hard or impossible to imagine a system where users are connecting with real database accounts. The ironic thing is that the approached described here is far more secure than any single sign-on system, but it requires the programmer to shift thinking away from action-based code-centric models to what is really going on: table-based privileges. Once that hurdle is past, the rest of it comes easy.

Sunday, February 1, 2009

This Application Has Unique Business Rule Needs

No it does not. If it did, then your customer/employer would be doing something no other human being has ever done, which is unlikely in the extreme. The application may be unique in its particulars, but it is almost certainly extremely common in its patterns. This week we will see how "unique" needs are in fact nothing but common ordinary development projects.

Beginning With the Conclusion

I have had this conversation with many programmers over the past few years, and it always follows the same patterns. The easy part of the argument is showing the programmer that what he thinks is special or unique is in fact common. The much harder part, because it involves the delicate human ego, is showing the programmer that he has not seen this because he is ignorant. This is not fun to do and I myself usually skip it, it's usually not worth the trouble.

Path 1: Details

Occasionally I speak to a programmer who thinks he has a unique situation. His claim begins with the mountain of details he must handle, details which appear to be contradictory, subtle, and overall perplexing. He wonders if some new approach is required to handle them.

In answering this claim, we begin with the easy part, showing that the situation is itself not unique. In short, all worthwhile projects involve mountains of detail, so there is nothing special there. When it comes to the subtleties and the maze of exceptions and special cases, these are common in mature businesses that have evolved this complexity in response to business needs over the years. So again there is nothing unique here, the programmer's situation is again common.

At this point we have to ask how the programmer will deal with this perplexing mountain of detail. If he knows what he is doing, he will give the general answer that he is going to break it down as much as possible into independent smaller problems that can be solved on their own. Since this is nothing more than how all programmers solve complex problems, the entire "uniqueness" claim has completely collapsed. His project is utterly common.

The much harder part of the conversation comes if the programmer does not know how to break down the problem. For instance, if the problem is all about a fiendishly complex pricing system with lots of discounts and pricing levels, and the programmer does not know that he needs to begin with the database, and he further does not want to hear that, well, there is not much I can do for him. He will end up working a lot harder than he needs to, and will probably remain convinced he is dealing with something "unique".

But let's go a little deeper into that example of the complicated pricing system. Why do I claim that he must start with the tables, and that is he is wasting time if he does not? Well, a complete answer is much more than will fit here, and in fact I hit that theme over and over in these essays, but it comes down to:

  • He must have an accurate and precise description of the details that govern the pricing scheme. That is what tables are for.
  • In working out the mechanics of the tables, particularly their primary and foreign keys, he will come to a his most complete understanding of the mechanisms involved.
  • When the tables completely reflect the details he must work with, the code will just about write itself.
  • Lastly, but probably most importantly, the customer will expect to control the pricing system by adjusting the parameters at all levels. Again, that is what tables are for. The user is in control of the pricing system if he can edit the tables (because of course he cannot edit the code).

Path 2: Combinations

Once upon a time we had simple desktop business applications, games, and then this weird new thing, "the web". Now they are all mixed together, as we play games on the internet that are tied into huge databases. Modern applications often combine technologies that used to be comfortably separate. On any particular project, some of the requirements look like they can be met with an RDBMS, some require management and delivery of media such as MP3 or video, and he is told as well he must provide RSS feeds and import data coming in XML format. Perhaps as well there will be stone tablets and papyrus scrolls.

This programmer may believe he is in a unique situation because of this combination of needs. Because no single toolset out there can meet the entire project, perhaps this is something never before seen? But this does not hold up. Just like the argument about complexity, he must break the problem up correctly, and when he has done so he will have a perfectly ordinary project. Though I might add it will also be a very interesting project and probably a lot of fun.

In The End It Is All About Patterns

I have given two examples above taken from my own experience where programmers have claimed to me that they faced some unique situation. There are many other cases, and they always make perfect sense to the person who thinks he has discovered something new. The biggest flaw in the programmer's thinking is failing to distinguish between particulars and patterns.

My claim in this essay is that the patterns of all problems are the same. Somebody has seen it before, somebody has done it before, the answer is out there. The process of analysis and programming is about slotting your particulars in the patterns that have already been established.

In the broadest sense all programs process data, and particular programs break down into broad patterns of data access and manipulation. Sometimes you have a broad range of users putting in data with very little massaging (think twitter) and sometimes you have one group controlling much of the data while others make use of it (think Amazon), and sometimes your data is mostly relational and table based (think any ecommerce or biz app) and sometimes its mostly media (think youtube).

Once you have these broad patterns identified, you can then proceed to make use of established practices within each particular area. What is the best way to provide sensitive data on the web and protect it from unauthorized eyes? Somebody has done it before. What is the best way to track large amounts of media? Somebody has done it before. What is the best way to set up a complex pricing system with lots of discounts and pricing levels? Somebody has done it before. In all cases, your particulars may be different, but the patterns will be the same.

Conclusion: Find the Patterns

Whenever I find myself looking at a situation that appears to be new, I try to tell myself that it may be new to me, but it is not likely to be new to the human race. If it does not appear to follow a well-known pattern then I proceed as if I have not yet recognized the pattern and continue to analyze and break it apart until the pattern emerges. So far it always has.