Optimal architecture for a multitenant application on Django

Oliver Rehburg · Aug 25, 2011 · Viewed 7.4k times

I've been brooding over the right/optimal way to build a multitenant application based on Django.

Some explanation:

  • The application can be used by several tenants (tenant1, tenant2, ...).

  • All tenant-specific data has to be secured against access by other tenants (and their users).

  • Optionally, tenants can create additional custom fields for application objects.

  • Of course, the underlying hardware limits the number of tenants on one "system".

I see two possible approaches:

1) Separating each tenant by e.g. sub-domain and using tenant-specific databases in the underlying layer

2) Using some tenant-ID in the model to separate the tenant-data in the database

I am thinking about deployment processes and the performance of the system parts (web servers, database servers, worker nodes, ...).

What would be the best setup? What are the pros and cons?

What do you think?

Answer

Reto Aebersold · Aug 25, 2011

We built a multitenancy platform using the following architecture. I hope you can find some useful hints.

  • Each tenant gets a sub-domain (t1.example.com)
  • Using URL rewriting, requests for the Django application are rewritten to something like example.com/t1
  • All url definitions are prefixed with something like (r'^(?P<tenant_id>[\w\-]+)
  • A middleware processes and consumes the tenant_id and adds it to the request (e.g. request.tenant = 't1')
  • Now the current tenant is available in each view without having to specify the tenant_id argument in every view
  • In some cases you don't have the request available. I solved this by binding the tenant_id to the current thread (similar to the current language, using threading.local)
  • Create decorators (e.g. a tenant-aware login_required), middleware or factories to protect views and select the right models
  • Regarding the databases, I used two different scenarios:
    • Setup multiple databases and configure a routing according to current tenant. I used this first but switched to one database after about one year. The reasons were the following:
      • We didn't need a highly secure solution to separate the data
      • The different tenants used almost all the same models
      • We had to manage a lot of databases (and hadn't built an easy update/migration process)
    • Use one database with some simple mapping tables, e.g. for users and different models. To add tenant-specific model fields we use model inheritance.
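
The middleware, thread-local binding and per-tenant database routing described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original implementation: the names TenantMiddleware, TenantRouter, set_current_tenant and get_current_tenant are hypothetical, the URL patterns are assumed to capture a tenant_id keyword argument, and each tenant's database alias in DATABASES is assumed to equal its tenant id. It also relies on Django handing the same kwargs dict to process_view and then to the view, so popping the key there "consumes" it. The classes follow Django's middleware and router conventions but import nothing from Django, so the demo at the bottom runs on its own.

```python
import threading
from types import SimpleNamespace

# Thread-local storage for the current tenant, so that code without access to
# the request (routers, managers, signal handlers) can still look it up.
_state = threading.local()


def set_current_tenant(tenant_id):
    _state.tenant_id = tenant_id


def get_current_tenant():
    return getattr(_state, "tenant_id", None)


class TenantMiddleware:
    """Django-style middleware (hypothetical name) that consumes tenant_id."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        try:
            return self.get_response(request)
        finally:
            # Reset so a reused worker thread never leaks a tenant.
            set_current_tenant(None)

    def process_view(self, request, view_func, view_args, view_kwargs):
        # Assumption: the URL pattern captured a "tenant_id" keyword argument.
        tenant_id = view_kwargs.pop("tenant_id", None)
        request.tenant = tenant_id
        set_current_tenant(tenant_id)
        return None  # continue with normal view processing


class TenantRouter:
    """Database router sketch: one database alias per tenant (assumed naming)."""

    def db_for_read(self, model, **hints):
        # Returning None lets Django fall back to the default database.
        return get_current_tenant()

    def db_for_write(self, model, **hints):
        return get_current_tenant()


# Stand-alone demo with a fake request object (no Django needed).
middleware = TenantMiddleware(lambda request: "response")
request = SimpleNamespace()
view_kwargs = {"tenant_id": "t1", "pk": 5}
middleware.process_view(request, None, (), view_kwargs)
print(request.tenant, view_kwargs)             # t1 {'pk': 5}
print(TenantRouter().db_for_read(model=None))  # t1
```

With the single-database scenario the router is not needed; the thread-local lookup can instead be used wherever querysets have to be filtered by tenant.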

Regarding the environment we use the following setup:

From my point of view this setup has the following pros and cons:

Pro:

  • One application instance knowing the current tenant
  • Most parts of the project don't have to bother with tenant-specific issues
  • Easy solution for sharing entities between all tenants (e.g. messages)

Contra:

  • One quite large database
  • Some very similar tables due to the model inheritance
  • Not secured on the database layer

Of course, the best architecture strongly depends on your requirements, such as the number of tenants, the delta between your models, security requirements and so on.

Update: After reviewing our architecture, I suggest not rewriting the URL as indicated in points 2-3. I think a better solution is to pass the tenant_id as a request header and extract it (point 4) with something like request.META.get('TENANT_ID', None). This way you get neutral URLs, and it's much easier to use Django's built-in functions (e.g. {% url ... %} or reverse()) or external apps.
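
A header-based variant of the middleware might look like the sketch below. The class name HeaderTenantMiddleware and the header name Tenant-Id are assumptions for illustration. Note that Django exposes a plain HTTP header such as Tenant-Id under request.META['HTTP_TENANT_ID']; a bare 'TENANT_ID' key would only be present if the web server passes it as a WSGI environment variable, so the key is left configurable here. The sketch also assumes the front-end web server sets the value from the sub-domain rather than trusting what the client sends.

```python
from types import SimpleNamespace


class HeaderTenantMiddleware:
    """Django-style middleware (hypothetical name) reading the tenant from META."""

    # Assumption: the proxy sends a "Tenant-Id" header, which Django stores
    # as "HTTP_TENANT_ID". Change this if the server sets "TENANT_ID" directly.
    meta_key = "HTTP_TENANT_ID"

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # No URL prefix to strip: URLs stay tenant-neutral.
        request.tenant = request.META.get(self.meta_key)
        return self.get_response(request)


# Stand-alone demo with a fake request object (no Django needed).
request = SimpleNamespace(META={"HTTP_TENANT_ID": "t1"})
response = HeaderTenantMiddleware(lambda r: r.tenant)(request)
print(response)  # t1
```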