Sunday, May 11, 2008

Introducing Database Security

For a database application, all security in the end becomes a question of who can read and write to what tables. While this is obvious to veteran database programmers, it is not always so obvious to relative newcomers. A programmer who thinks primarily in terms of code will be used to conceptualizing actions, steps, procedures and so forth. This programmer will therefore think of security in terms of who is allowed to perform what actions. However, when you examine the actions performed by database application code, you find that, no surprise, it is a lot of database reads and writes.

This week in The Database Programmer we are going to see the basics of table-based security. This is a blog for anybody who wants to learn the practical realities of working with databases. A new essay is published every Monday morning, and the Complete Table of Contents is here.

Table Permissions 101

The most basic unit of storage for a database is the table, and so security begins with deciding who can do what to a table. The SQL language has one read command and three write commands (INSERT, UPDATE, and DELETE), so there are four possible permissions that can be granted to a user. Most databases out there provide syntax that is pretty close to this, which makes some direct permission assignments to a group named "ARSTAFF"

CREATE GROUP ARSTAFF
GRANT SELECT ON RECEIVABLES TO ARSTAFF
GRANT INSERT ON RECEIVABLES TO ARSTAFF
GRANT UPDATE ON RECEIVABLES TO ARSTAFF
GRANT DELETE ON RECEIVABLES TO ARSTAFF

These commands should be run at the same time that your tables are built, they are part of the basic database build.

A Basic Security Dictionary

A very basic security dictionary will list all tables and show who can do what to what. The dictionary begins with a single table, which lists tables and groups and permissions, something like this:

TABLE        | GROUP     | SELECT | INSERT | UPDATE | DELETE
-------------+-----------+--------+--------+--------+--------
AR_INVOICES  | AR_STAFF  | 1      | 1      | 1      | 0
AR_PAYMENTS  | AR_STAFF  | 1      | 1      | 1      | O
CUSTOMERS    | AR_STAFF  | 1      | 1      | 1      | 0
CUST_TYPES   | AR_STAFF  | 1      | 0      | 0      | 0
CUST_TYPES   | AR_ADMIN  | 1      | 1      | 1      | 1

Security permissions should always be defined in terms of groups, not individual users. This is important because it lets you make the group definitions part of the basic definition of the system, leaving the system administrator free to add users to groups at his own discretion.

Deny By Default And Determining a User's Permissions

Most security systems make use of the idea of deny by default. This means you do not have to specify who cannot see a table, the system starts out by assuming that nobody can see anything. Users can then only work with a table if they are in a group that can work with that table.

Another key idea is this: if a user is in even one group that allows a permission, the user gets the permission, even if his other groups are not allowed in.

With these two ideas, deny-by-default, and any-group-gets-you-in, then assuming you have a table that lists what groups each user is in, you can find a particular user's permissions on a table with this query:

SELECT MAX(select) as select
      ,MAX(insert) as insert
      ,MAX(update) as update
      ,MAX(delete) as delete
 FROM secure_dictionary
WHERE table = 'table name'
  AND group IN (list,of,groups,user,is,in)

If the query comes back empty, the user has no rights, otherwise you will get the user's permissions for that table.

Enhancing Our Security Dictionary

The security dictionary portrayed above can become tedious to work with because it often happens that you want to make similar assignments to groups of tables. For instance, in an accounting application you may an "APSTAFF" group that has basic permissions on Accounts Payable tables, and an "APADMIN" group that can do anything to any table in Accounts Payable. However, those groups have no permissions on the Accounts Receivable tables, so you have two new groups, "ARSTAFF" and "ARADMIN". Likewise for inventory, general ledger, and so forth.

This can all be made much easier by organizing your tables into modules and defining default permissions within a module to each group. The expanded dictionary would have these tables:

THE TABLE OF TABLES

TABLE        | MODULE
-------------+---------
AP_INVOICES  | AP
AP_CHECKS    | AP
AR_INVOICES  | AR
AR_PAYMENTS  | AR
GL_ACCOUNTS  | GL
GL_LEDGER    | GL

MODULE PERMISSION

MODULE  | GROUP     | SELECT | INSERT | UPDATE | DELETE
--------+-----------+--------+--------+--------+--------
AP      | AP_STAFF  | 1      | 0      | 0      | 0
AP      | AP_ADMIN  | 1      | 1      | 1      | 1
AR      | AR_STAFF  | 1      | 0      | 0      | 0
AR      | AR_ADMIN  | 1      | 1      | 1      | 1
GL      | GL_STAFF  | 1      | 0      | 0      | 0
GL      | GL_ADMIN  | 1      | 1      | 1      | 1

The permissions above basically let the regular in each group see anything, but write nothing. The admin users in each group can do anything. Specific table assignments to particular tables can then override these defaults to give the *STAFF members increased access on a table-by-table basis.

Our First Secure Table Pattern: The Read-Only Lookup Table

We are now going to see the first table design pattern that is based on security: the read-only lookup table. Consider a table of customer types and discounts. This table does not change that often, and regular staff are not permitted to make changes to the table, only managers can do that.

I call this pattern the "Read-Only Lookup Table." The table holds values that everybody needs to see, but only a few people can modify. The SQL to create a table might look like this:

CREATE GROUP AR_STAFF
CREATE GROUP AR_ADMIN
CREATE TABLE CUSTTYPES (
   custtype char(10) primary key
  ,description char(25)
  ,discount numeric(4,2) 
)
DENY ALL     ON CUSTTYPES TO AR_STAFF
GRANT SELECT ON CUSTTYPES TO AR_STAFF
GRANT ALL    ON CUSTTYPES TO AR_ADMIN

By the way, this is yet another argument against the deep tendency that code grinders have to combine tables that have similar structures, especially lookup tables. We know in theory that combining lookup tables is bad because like things should be stored together in a table and unlike thing should be stored in separate tables. Now we see the practical impact, which is that you cannot implement security on tables that have been combined (just like you can't implement constraints, foreign keys, calculations...) The crucial point here is that table design is not so much about identifying column names and widths, at its heart it is about organizing data. When you mistake a superficial resemblance (similar column structure) for an actual similarity, you confound your own ability to manage either entity correctly.

The Connection, Users, and Groups

You can certainly implement security in your framework by doing on-the-fly lookups to see who is allowed to do what. However, your run-time framework code will be much simpler if the security definitions are built into the server when the tables are built. The drawback is that it requires a dramatic change to how you connect to the database.

Many programmers use frameworks (or write their own) that make only one connection to the database. This connection has the maximum allowed permissions that any user of the system has, and the framework determines what particular users can actually do. This system is so common that many programmers probably assume it is the 11th commandment, and these same programmers are unaware that there is any other way to do it.

But in fact databases are able to create users and groups just the same way that operating systems are, and then to allow you to connect as these various users. Making all connections as a super-user and trusting your code to get it right puts a huge burden on the programmer, and we have a ready example of what can go wrong by looking at a very popular operating system produced in Redmond, Washington. By contrast, a simple table-based system reduces the task to careful review of the much simpler security dictionary.

The decision on which way to go is not necessarily cut and dried. If you have a site with a very simple database and very simple security, and thousands or millions of users, there is much to be gained by caching connections and arbitrating security in code. However, if you are producing line of business programs like accounting, ERP, medical billing or any other app where your users are limited, there is nothing to be gained by caching connections and everything to be gained by simplifying the system and making it easier to code and maintain.

Beyond Table Security

While table security can be vastly simpler than trying to secure programs one by one, it cannot in fact handle all of the cases that might come up. In later essays we will examine column-level security and row-level security. These are not well supported in the popular products available today, so you may have to work it into your framework (which is what I did for my Andromeda framework).

Also, now that we have seen an introduction to database security, we will be seeing more secure table design patterns in the future.

Next Week: Database Triggers, Encapsulation and Composition.

6 comments:

zippy1981 said...

Ken,

A few comments here. First of all, a very good article. Secondly, I understand its meant to be an introduction and I am bringing up things that can be classified as advanced topics.

In general the database server super user and the database owner of a particular database tend to have full access to a database from the get go. Everyone else has default deny.

One case you need to make mention of is stored procedures. Now I've seen two basic forms of the stored proc security model. One is the model used by postgres. The other is the version used by sybase and Microsoft.

In the postgres model, stored procs are executable by anyone by default, but you need permissions to perform all the SQL statement inside the stored proc for it to successfully execute. There is an option for a "sudo" like feature when a stored proc alwyas executes as a certain user, at which point you usually restrict who has access to the stored proc.

In the microsoft/sybase model, or as I like to call it the "wrong way model", no one has access to the stored proc by defaults except database owners, but once you have access to a stored proc, no other security checks are performed.

Finally, you present the two estremes of database connections for applications, each application user gets a database user, or all database apps connect as super user. I would like to present my approach that lies in the middle. I will say the andromeda approach is ideal, but I often code in situations where I can't or don't use andromeda. When I do I use the following approach.

There is still one user per app. However, I break the app up into different parts, and those users have minimal permissions for their app.

For example, I was working on a website a while ago. There were two sites, the user site and the admin site. It was one user for the admin app and one for the user app. Both had distinct and limited permissions. Granted if there was a sql injection vulnerability in my app than a website user could set any users profile to anything they want. However, you could not drop you user or any other user.

Another more complex example involves SOA. I won't describe the apps functionality, but I will say that it was written in .NET and had the follwing architectore:

Web App
|
Web Service 1
|
Web Service 2

Basically, a web app would call one web service, which usually called another. Webserve2 sent data back to webservice1 and webservice 1 returned it to the web app.

Both web services talked to the same database. The web app had no access. Each web service had to read and write to different tables in the database. Therefore, each web service had a different database user. I achieved far superior security to having the apps connect as a super user, significantly better security than my example preceding this one, and significantly inferior security to the realized ideal of a properly configured andromeda database.

I didn't mean to rip into the article. I just feel that stored procs need to be mentioned in a introductory article, even if in a "I will explain those later" way. I also feel you put database security practices into two boxes, when there is a significant middle ground.

zippy1981 said...

One quick note that I think doesn't get mentioned ever in terms of database security.

One of the advantages of Microsoft SQL server is that in addition to creating database users that are stored in the baster database, you can also assign real windows users from active directory or the local computer accounts to a sql server. This eliminates a seperate user store and password to maintain.

I've not seen this feature duplicated elsewhere and I do not understand why. In the land of unix, I would very much like postgres to allow users to be referenced from the pam source, and use pam to perform databae authentication. This way I can assign users from my password file, nis server, or active directory via pam_ldap database permissions without making a seperate user.

KenDowns said...

Zippy: you have some good details there. The stored procs I definitely had in mind for their own essay.

The middle ground you mention also raises the issue for web sites of anonymous access, and how to do privileged things like create new users.

Anonymous said...

@Zippy1981: PostgreSQL does allow you to authenticate users from a plethora of userstores, see http://www.postgresql.org/docs/8.3/interactive/auth-methods.html

zippy1981 said...

@Alex: Thanks for pointing it out. That was something I was made aware of later. I also realize that this includes LDAP access which in turn means Active directory access.

Security Tools said...

Even when encryption is used, threats to confidentiality still exist. Two such threats are cryptographic attacks, or attempts to break the encryption code, and the loss of a private key in a public key cryptography system.