How to use Consul in leader election?

Alexandre Santos · Dec 28, 2014

How do I use Consul to make sure only one service is performing a task?

I've followed the examples at http://www.consul.io/ but I am not 100% sure which way to go. Should I use KV? Should I use services? Or should I register a service as a health check and have the cluster call it at a given interval?

For example, imagine there are several data centers. Within every data center there are many services running. Every one of these services can send emails. These services have to check whether there are any emails to be sent and, if there are, send them. However, I don't want the same email to be sent more than once.

How would I make sure that all emails are sent and none is sent more than once?

I could do this using other technologies, but I am trying to implement this using Consul.

Answer

jeremyjjbrown · Jun 26, 2015

This is exactly the use case for Consul Distributed Locks.

For example, let's say you have three servers in different AWS availability zones for failover. Each one is launched with:

consul lock -verbose lock-name ./run_server.sh

consul lock runs the ./run_server.sh command only on whichever server acquires the lock first. If ./run_server.sh fails on the server holding the lock, the lock is released and whichever other node acquires it next executes ./run_server.sh. This way you get failover with only one server running at a time. If you have registered your Consul health checks properly, you will be able to see that the server on the first node failed. You can then repair it and restart the consul lock ... command on that node, where it will block until it can acquire the lock.
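
If you would rather take the lock from inside the service (for example, around the "send pending emails" step) instead of wrapping the whole process in consul lock, the same idea can be sketched against Consul's HTTP API using a session and a KV acquire. This is only a minimal sketch under assumptions: the agent address, the key name, the session name and the run_if_leader helper are all made up for illustration, and there is no retry, no blocking wait, and no handling of a session that is invalidated while the task is still running.

```python
import json
import urllib.request
from urllib.parse import quote

CONSUL = "http://127.0.0.1:8500"  # assumption: a local agent on the default HTTP port


def http_put(url, body=None):
    """Send a PUT to the Consul HTTP API and return the decoded JSON reply."""
    data = json.dumps(body).encode() if body is not None else b""
    req = urllib.request.Request(url, data=data, method="PUT")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode())


def run_if_leader(key, task, put=http_put, base=CONSUL):
    """Run task() only if this process acquires the lock stored at `key`.

    Hypothetical helper. `put` is injectable so the logic can be exercised
    without a running Consul agent.
    """
    # 1. Create a session; the lock is held on behalf of this session.
    session = put(base + "/v1/session/create", {"Name": "email-sender"})["ID"]
    try:
        # 2. Try to acquire the key. Consul answers true for exactly one
        #    session at a time and false for everyone else.
        url = "{}/v1/kv/{}?acquire={}".format(base, quote(key), session)
        acquired = bool(put(url, {"holder": "email-sender"}))
        if acquired:
            task()  # only the lock holder gets here
        return acquired
    finally:
        # 3. Destroying the session releases any lock it holds.
        put("{}/v1/session/destroy/{}".format(base, session))
```

A service would call something like run_if_leader("locks/email-sender", send_pending_emails) on each polling interval, where send_pending_emails is your own function. Instances that lose the race get False back and simply skip that round. Note that the lock only guarantees a single sender at a time; to avoid re-sending after a failover, each email still needs to be marked as sent somewhere durable.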

Currently, distributed locking can only happen within a single Consul datacenter. But since it is up to you to decide which Consul servers make up a datacenter, you should be able to solve your issue. If you want locking across federated Consul datacenters, you'll have to wait, since that is a roadmap item.