Should server/database config files, including passwords, be stored in source control?

philfreo · Nov 22, 2010 · Viewed 10.9k times

I'm looking to hear some best practices...

Assuming a web application that interacts with a few different production servers (databases, etc.)... should the configuration files that include database passwords be stored in source control (e.g., git, svn)?

If not, what's the best way to keep track of the server/database (or other related) passwords that your application needs access to?

Edit: added a bounty to encourage more discussion and to hear what more people consider best practice.

Answer

GreyCat · Nov 30, 2010

There's no single "silver bullet" answer here; it all greatly depends on the details.

First of all, I consider it best practice to separate all source code from configuration, in a separate repository. So, source code remains source code, but its installation or deployment (with configuration, passwords, etc.) is a whole other thing. This way you firmly separate developers' tasks from sysadmins' tasks and can ultimately build 2 distinct teams, each doing what they're good at.

Once you have a separate source code repository + deployment repository, your next best bet is considering deployment options. The best way I see here is using the deployment procedures typical for your chosen OS (i.e. building autonomous packages for that OS the way the OS's maintainers do).

For example, Red Hat or Debian packaging procedures usually mean grabbing a tarball of the software from an external site (in your case, exporting the sources from your source code VCS), unpacking it, compiling it and preparing packages ready for deployment. Deployment itself should ideally mean just running one quick & simple command that installs the packages, such as rpm -U package.rpm, dpkg --install package.deb or apt-get dist-upgrade (given that your built packages go to a repository where apt-get can find them).
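
To make this concrete, here's a minimal sketch of that export / build / install cycle, assuming an SVN source repository, an RPM-based target and hypothetical package names and versions (none of which are mandated by the approach itself):

    # export the source at a known tag and roll it into a tarball
    svn export http://svn.example.com/myapp/tags/1.4.2 myapp-1.4.2
    tar czf myapp-1.4.2.tar.gz myapp-1.4.2

    # build binary packages from the tarball (it must contain a .spec file)
    rpmbuild -ta myapp-1.4.2.tar.gz

    # deploy on a target server with a single command (path shortened)
    rpm -U my-application-php-1.4.2-1.noarch.rpm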

Obviously, to get it working this way, you'll have to supply all configuration files for all components of a system in a fully working state, including all addresses and credentials.

To be more concrete, let's consider a typical "small service" situation: one PHP application deployed across n application servers running apache / mod_php, accessing m MySQL servers. All these servers (or virtual containers, it doesn't really matter) reside in a protected private network. To make this example easier, let's assume that all real internet connectivity is fronted by a cluster of k HTTP accelerators / reverse proxies (such as nginx / lighttpd / apache) whose configuration is very simple (just internal IPs to forward to).

What do we need for all of these to be connected and fully working?

  • MySQL servers: set up IPs/hostnames, set up databases, provide logins & passwords
  • PHP application: set up IPs/hostnames, create a configuration file that mentions the MySQL servers' IPs, logins, passwords & databases

Note that there are 2 different "types" of information here: IPs/hostnames are something fixed; you'd likely want to assign them once and for all. Logins & passwords (and even database names), on the other hand, are purely for connectivity purposes here: to assure MySQL that it's really our PHP application connecting to it. So, my recommendation here would be to split these 2 "types":

  • "Permanent" information, such as IPs, should be stored in some VCS (different from source code VCS)
  • "Transient" information, such as passwords between 2 applications, should be never stored, but generated during generation of deployment packages.

The last and toughest question remains: how do you create the deployment packages? There are multiple techniques available; the 2 main ways are:

  • Exported source code from VCS1 + "permanent" configuration from VCS2 + building script from VCS3 = packages
  • Source code is in VCS1; VCS2 is a distributed version control system (like git or hg) which essentially contains "forks" of VCS1 plus the configuration information and the building scripts that generate the packages (see the sketch below). I personally like this approach better: it's much shorter and ultimately easier to use, but the learning curve may be a bit steeper, especially for the admin guys who'll have to master git or hg for it.
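
A rough sketch of that second approach, with hypothetical repository URLs, branch names and file names (the exact layout is entirely up to you):

    # VCS2 starts life as a fork of the application source (VCS1)
    git clone git://git.example.com/myapp.git deploy-myapp
    cd deploy-myapp

    # layer deployment material on top of the source on its own branch
    git checkout -b deploy
    mkdir deploy
    $EDITOR deploy/hosts.conf deploy/build-packages.sh   # "permanent" config + build script
    git add deploy && git commit -m "add deployment config and build scripts"

    # later: pull fresh application code from VCS1 and rebuild the packages
    git pull origin master
    ./deploy/build-packages.sh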

For the example above, I'd create packages like:

  • my-application-php - which would depend on mod_php and apache, and would include a generated file like /etc/my-php-application/config.inc.php containing the MySQL databases' IPs/hostnames and a login / password generated as md5(current source code revision + salt). This package would be installed on each of the n application servers. Ideally, it should be able to install onto a cleanly installed OS and produce a fully working application cluster node without any manual activity.
  • my-application-mysql - which would depend on MySQL-server and would include a post-install script (sketched after this list) that:
    • starts MySQL server and makes sure it will start automatically on OS start
    • connects to MySQL server
    • checks if required database exists
    • if not - creates the database, bootstraps it with contents and creates a login with a password (the same login & password that were generated into /etc/my-php-application/config.inc.php using the md5 algorithm)
    • if yes - connects to the database, applies migrations to bring it up to the new version, kills all older logins / passwords and recreates the login/password pair (again, generated using the md5(revision + salt) method)
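
Here's a minimal post-install sketch for my-application-mysql; the database name, file paths, Red Hat-style service commands and salt handling are assumptions chosen for illustration:

    #!/bin/sh
    # regenerate the same md5(revision + salt) credentials used at build time
    REV="$(cat /usr/share/my-application/revision)"   # baked in by the build script
    SALT="build-time-secret-salt"
    DB_PASS="$(printf '%s%s' "$REV" "$SALT" | md5sum | cut -d' ' -f1)"

    # make sure MySQL runs now and on every boot
    chkconfig mysqld on
    service mysqld start

    if ! mysql -e 'USE myapp' 2>/dev/null; then
        # fresh install: create and bootstrap the database
        mysql -e 'CREATE DATABASE myapp'
        mysql myapp < /usr/share/my-application/bootstrap.sql
    else
        # upgrade: apply migrations shipped inside the package
        mysql myapp < /usr/share/my-application/migrations.sql
    fi

    # (re)create the application login with the freshly generated password
    mysql -e "GRANT ALL ON myapp.* TO 'myapp'@'%' IDENTIFIED BY '$DB_PASS'"
    mysql -e 'FLUSH PRIVILEGES'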

Ultimately, this should bring the benefit of upgrading your whole deployment with a single command like generate-packages && ssh-all apt-get dist-upgrade. Also, you do not store inter-application passwords anywhere, and they get regenerated on every update.
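
generate-packages and ssh-all above are placeholders rather than real tools; a trivial stand-in for the latter could be as simple as:

    # run the upgrade on every host listed in a (hypothetical) inventory file
    for host in $(cat /etc/my-deployment/hosts); do
        ssh root@"$host" 'apt-get update && apt-get -y dist-upgrade'
    done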

This fairly simple example illustrates a lot of the methods you can employ here, but ultimately it's up to you to decide which solution is better and which one is overkill. If you add more details here or in a separate question, I'll gladly go into them.