Deploying Django with gunicorn and nginx

Robert Smith · Oct 22, 2012 · Viewed 16.3k times

This is a broad question but I'd like to get a canonical answer. I have been trying to deploy a Django site using gunicorn and nginx. After reading tons of tutorials I have been successful, but I can't be sure whether the steps I followed are good enough to run a site without problems, or whether there are better ways to do it. That uncertainty is annoying.

That's why I'm looking for a very detailed and well-explained answer for newbies. I don't want to explain too much about what I know and what I don't know, since that could skew the answers a bit and other people might benefit less from them. However, some things I'd like to see mentioned are:

  • What "setup" have you seen work best? I used virtualenv and moved my Django project inside this environment, however I have seen another setups where there is a folder for virtual environments and other for projects.

  • How can I set things up in a way that allows several sites to be hosted on a single server?

  • Why do some people suggest using gunicorn_django -b 0.0.0.0:8000 and others suggest gunicorn_django -b 127.0.0.1:8000? I tested the latter on an Amazon EC2 instance but it didn't work, while the former worked without problems.

  • What is the logic behind the nginx config file? There are so many tutorials using drastically different configuration files that I'm confused about which one is better. For example, some people use alias /path/to/static/folder and others root /path/to/static/folder. Maybe you can share your preferred configuration file.

  • Why do we create a symlink between sites-available and sites-enabled in /etc/nginx?

  • Some best practices are as always welcomed :-)

Thanks

Answer

Burhan Khalid · Oct 22, 2012

What "setup" have you seen work best? I used virtualenv and moved my django project inside this environment, however I have seen another setups where there is a folder for virtual environments and other for projects.

virtualenv is a way to isolate Python environments; as such, it doesn't have a large part to play in deployment itself - however, during development and testing it is highly recommended, if not a requirement.

The value you get from virtualenv is that it allows you to make sure that the correct versions of libraries are installed for the application. So it doesn't matter where you stick the virtual environment itself; just make sure you don't include it as part of the source code versioning system.

The file system layout is not critical. You will see lots of articles extolling the virtues of directory layouts and even skeleton projects that you can clone as a starting point. I feel this is more a matter of personal preference than a hard requirement. Sure, it's nice to have; but unless you know why, it doesn't add any value to your deployment process - so don't do it because some blog recommends it, unless it makes sense for your scenario. For example, there is no need to create a setup.py file if you don't have a private PyPI server as part of your deployment workflow.
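As a minimal sketch of that workflow (the paths and names here are just placeholders):

    # create an isolated environment; where it lives doesn't matter,
    # just keep it out of version control
    virtualenv ~/envs/mysite
    source ~/envs/mysite/bin/activate

    # install what the application needs and record the exact versions
    pip install Django gunicorn
    pip freeze > requirements.txt   # commit this file, not the environment itself

    # later, on the server, rebuild the same environment
    pip install -r requirements.txt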

How can I set things up in a way that allows several sites to be hosted on a single server?

There are two things you need for a multiple-site setup:

  1. A server that is listening on the public IP on port 80 and/or port 443 if you have SSL.
  2. A bunch of "processes" that are running the actual django source code.

People use nginx for #1 because it's a very fast proxy and it doesn't come with the overhead of a comprehensive server like Apache. You are free to use Apache if you are comfortable with it. There is no requirement that says "for multiple sites, use nginx"; you just need a service that listens on that port and knows how to redirect (proxy) to the processes running the actual django code.

For #2 there are a few ways to start these processes. gevent/uwsgi are the most popular ones. The only thing to remember here is do not use runserver in production.
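For example, taking the gunicorn route from your question (the project name and port are placeholders; note that newer gunicorn releases drop gunicorn_django in favour of pointing at the project's WSGI module):

    # run from the project directory, inside its virtualenv
    gunicorn_django -b 127.0.0.1:8000 --workers 3

    # roughly equivalent invocation on newer gunicorn versions (Django 1.4+ wsgi.py layout)
    gunicorn mysite.wsgi:application -b 127.0.0.1:8000 --workers 3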

Those are the absolute minimum requirements. Typically people add some sort of process manager to control all the "django servers" (#2) that are running. Here you'll see upstart and supervisor mentioned. I prefer supervisor as it doesn't need to take over the entire system (unlike upstart). However, again - this is not a hard requirement. You could perfectly well run a bunch of screen sessions and detach them. The downside is that, should your server restart, you would have to relaunch the screen sessions by hand.

Personally I would recommend:

  1. Nginx for #1
  2. Take your pick between uwsgi and gunicorn - I use uwsgi.
  3. supervisor for managing the backend processes.
  4. Individual system accounts (users) for each application you are hosting.

The reason I recommend #4 is to isolate permissions; again, not a requirement.
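To tie those four points together, a rough supervisor sketch for one site could look like this (every name, path, and port below is a placeholder, not a prescription):

    ; /etc/supervisor/conf.d/mysite.conf
    [program:mysite]
    command=/home/mysite/env/bin/gunicorn_django -b 127.0.0.1:8000 --workers 3
    directory=/home/mysite/project
    user=mysite
    autostart=true
    autorestart=true
    stdout_logfile=/var/log/supervisor/mysite.log
    redirect_stderr=true

A second site simply gets its own [program:...] block, its own system account, and its own loopback port (8001, 8002, ...); the front-end server then decides which backend to proxy to based on the requested host name.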

Why do some people suggest using gunicorn_django -b 0.0.0.0:8000 and others suggest gunicorn_django -b 127.0.0.1:8000? I tested the latter on an Amazon EC2 instance but it didn't work, while the former worked without problems.

0.0.0.0 means "all IP addresses" - it's a meta-address (that is, a placeholder address). 127.0.0.1 is a reserved address that always points to the local machine, which is why it's called "localhost". It is only reachable by processes running on the same system.

Typically you have the front end server (#1 in the list above) listening on the public IP address. You should explicitly bind the server to one IP address.

However, if for some reason you are on DHCP or you don't know what the IP address will be (for example, it's a newly provisioned system), you can tell nginx/apache/any other process to bind to 0.0.0.0. This should be a temporary stop-gap measure.

For production servers you'll have a static IP. If you have a dynamic IP (DHCP), then you can leave it at 0.0.0.0. It is very rare that you'll have DHCP on your production machines, though.

Binding gunicorn/uwsgi to this address is not recommended in production. If you bind your backend process (gunicorn/uwsgi) to 0.0.0.0, it may become accessible "directly", bypassing your front-end proxy (nginx/apache/etc.); someone could simply request http://your.public.ip.address:9000/ and reach your application directly - especially if your front-end server (nginx) and your backend process (django/uwsgi/gevent) are running on the same machine.

You are free to do it if you don't want to have the hassle of running a front-end proxy server though.
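To make the difference concrete (the address and port are examples only):

    # bound to all addresses: reachable from any host that can see the machine,
    # bypassing nginx entirely
    gunicorn_django -b 0.0.0.0:8000
    curl http://your.public.ip.address:8000/   # succeeds from anywhere

    # bound to loopback: only processes on the same machine (such as nginx) can reach it
    gunicorn_django -b 127.0.0.1:8000
    curl http://127.0.0.1:8000/                # succeeds only when run on the server itself

This is also the most likely reason binding to 127.0.0.1 seemed not to work on EC2: nothing is listening on the public address, so requests coming from outside the instance never reach the backend - they have to come in through the front-end proxy instead.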

What is the logic behind the nginx config file? There are so many tutorials using drastically different configuration files that I'm confused about which one is better. For example, some people use "alias /path/to/static/folder" and others "root /path/to/static/folder". Maybe you can share your preferred configuration file.

The first thing you should know about nginx is that it is not a web server like Apache or IIS; it is a proxy. So you'll see different terms like 'upstream'/'downstream' and multiple "servers" being defined. Take some time and go through the nginx manual first.

There are lots of different ways to set up nginx, but here is one answer to your question on alias vs. root: root is an explicit directive that sets the document root (the "home directory") of the server. This is the directory nginx will look in when you make a request without a path, like http://www.example.com/.

alias means "map a name to a directory". An aliased directory need not be a subdirectory of the document root.
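As an illustration only - the paths, server name, and backend port are placeholders, not a definitive setup - a minimal server block for one site could look like this:

    # /etc/nginx/sites-available/mysite
    server {
        listen 80;
        server_name www.example.com;

        # serve collected static files straight from disk
        location /static/ {
            alias /home/mysite/static/;
        }

        # everything else is proxied to the backend process
        location / {
            proxy_pass http://127.0.0.1:8000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }

The practical difference: alias replaces the location prefix, so /static/css/site.css is served from /home/mysite/static/css/site.css. With root /home/mysite; in the same location, nginx appends the full request path instead, which happens to resolve to the same file here - the two only coincide when the directory layout mirrors the URL, and mixing them up usually shows up as puzzling 404s.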

Why do we create a symlink between sites-available and sites-enabled in /etc/nginx?

This is something unique to Debian (and Debian-like systems such as Ubuntu). sites-available holds the configuration files for all the virtual hosts/sites on the system, enabled or not. A symlink in sites-enabled pointing to a file in sites-available "activates" that site or virtual host. It is a way to keep configuration files in one place and easily enable/disable hosts.
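In practice, enabling a site looks roughly like this (the file name is a placeholder):

    # the real file lives in sites-available; the symlink turns it on
    sudo ln -s /etc/nginx/sites-available/mysite /etc/nginx/sites-enabled/mysite

    # check the configuration, then reload
    sudo nginx -t
    sudo service nginx reload

    # disabling the site is just removing the symlink; the config file stays put
    sudo rm /etc/nginx/sites-enabled/mysite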