Database schema for ACL

Xeoncross picture Xeoncross · May 3, 2011 · Viewed 31.1k times · Source

I want to create a schema for a ACL; however, I'm torn between a couple of ways of implementing it.

I am pretty sure I don't want to deal with cascading permissions as that leads to a lot of confusion on the backend and for site administrators.

I think I can also live with users only being in one role at a time. A setup like this will allow roles and permissions to be added as needed as the site grows without affecting existing roles/rules.

At first I was going to normalize the data and have three tables to represent the relations.

ROLES { id, name }
RESOURCES { id, name }
PERMISSIONS { id, role_id, resource_id }

A query to figure out whether a user was allowed somewhere would look like this:

SELECT id FROM resources WHERE name = ?
SELECT * FROM permissions WHERE role_id = ? AND resource_id = ? ($user_role_id, $resource->id)

Then I realized that I will only have about 20 resources, each with up to 5 actions (create, update, view, etc..) and perhaps another 8 roles. This means that I can exercise blatant disregard for data normalization as I will never have more than a couple of hundred possible records.

So perhaps a schema like this would make more sense.

ROLES { id, name }
PERMISSIONS { id, role_id, resource_name }

which would allow me to lookup records in a single query

SELECT * FROM permissions WHERE role_id = ? AND permission  = ? ($user_role_id, 'post.update')

So which of these is more correct? Are there other schema layouts for ACL?

Answer

Denis de Bernardy picture Denis de Bernardy · May 10, 2011

In my experience, the real question mostly breaks down to whether or not any amount of user-specific access-restriction is going to occur.

Suppose, for instance, that you're designing the schema of a community and that you allow users to toggle the visibility of their profile.

One option is to stick to a public/private profile flag and stick to broad, pre-emptive permission checks: 'users.view' (views public users) vs, say, 'users.view_all' (views all users, for moderators).

Another involves more refined permissions, you might want them to be able to configure things so they can make themselves (a) viewable by all, (b) viewable by their hand-picked buddies, (c) kept private entirely, and perhaps (d) viewable by all except their hand-picked bozos. In this case you need to store owner/access-related data for individual rows, and you'll need to heavily abstract some of these things in order to avoid materializing the transitive closure of a dense, oriented graph.

With either approach, I've found that added complexity in role editing/assignment is offset by the resulting ease/flexibility in assigning permissions to individual pieces of data, and that the following to worked best:

  1. Users can have multiple roles
  2. Roles and permissions merged in the same table with a flag to distinguish the two (useful when editing roles/perms)
  3. Roles can assign other roles, and roles and perms can assign permissions (but permissions cannot assign roles), from within the same table.

The resulting oriented graph can then be pulled in two queries, built once and for all in a reasonable amount of time using whichever language you're using, and cached into Memcache or similar for subsequent use.

From there, pulling a user's permissions is a matter of checking which roles he has, and processing them using the permission graph to get the final permissions. Check permissions by verifying that a user has the specified role/permission or not. And then run your query/issue an error based on that permission check.

You can extend the check for individual nodes (i.e. check_perms($user, 'users.edit', $node) for "can edit this node" vs check_perms($user, 'users.edit') for "may edit a node") if you need to, and you'll have something very flexible/easy to use for end users.

As the opening example should illustrate, be wary of steering too much towards row-level permissions. The performance bottleneck is less in checking an individual node's permissions than it is in pulling a list of valid nodes (i.e. only those that the user can view or edit). I'd advise against anything beyond flags and user_id fields within the rows themselves if you're not (very) well versed in query optimization.