What is the advantage of using supervisord over monit?

murtaza52 · Aug 28, 2012 · Viewed 21.3k times

We have a custom setup with several daemons (web apps and background tasks) running. I am looking for a service that helps us monitor those daemons and restart them if their resource consumption exceeds a given level.

I would appreciate any insight on when one is better than the other. As I understand it, monit spins up a new process while supervisord starts a subprocess. What are the pros and cons of each approach?

I will also be using Upstart to monitor monit or supervisord itself. The web app deployment will be done using Capistrano.

Thanks

Answer

cressie176 · Dec 31, 2014

I haven't used monit but there are some significant flaws with supervisord.

  1. Programs should run in the foreground

This means you can't just execute /etc/init.d/apache2 start. Most of the time you can write a one-liner, e.g. "source /etc/apache2/envvars && exec /usr/sbin/apache2 -DFOREGROUND", but sometimes you need your own wrapper script. The problem with wrapper scripts is that you end up with two processes, a parent and a child. See the next flaw...
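For illustration, the one-liner above might sit in a supervisord program section roughly like this. The program name and the explicit /bin/bash -c wrapper are my assumptions; supervisord does not run command through a shell, so "source" and "&&" need one spelled out.

```ini
; Sketch only: the program name is an assumption, the paths come from the one-liner above.
[program:apache2]
; supervisord does not pass command through a shell, so invoke one explicitly.
; exec replaces the shell with apache2, leaving a single process for supervisord to manage.
command=/bin/bash -c "source /etc/apache2/envvars && exec /usr/sbin/apache2 -DFOREGROUND"
autorestart=true
```

The exec is the important part: without it the shell would stay around as the parent of apache2, which is the two-process situation described above.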

  2. supervisord does not manage child processes

If your program starts child processes, supervisord won't detect this. If the parent process dies (or is restarted using supervisorctl), the child processes are "adopted" by the init process and keep running. This might prevent future invocations of your program from running, or consume additional resources. The recent config options stopasgroup and killasgroup are supposed to fix this, but they didn't work for me.
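For anyone who wants to try the two group options anyway, this is roughly what the configuration looks like. The program name and wrapper path are hypothetical, and as noted above your mileage may vary.

```ini
; Sketch only: "myapp" and the wrapper path are hypothetical.
[program:myapp]
command=/usr/local/bin/myapp-wrapper.sh
; Send the stop signal to the whole process group, not just the parent.
stopasgroup=true
; If the group has to be killed, send SIGKILL to the whole group as well.
killasgroup=true
```

These only help if the children stay in the parent's process group; a child that moves itself into a new session or group would presumably still be missed.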

  3. supervisord has no dependency management - see #122

I recently set up squid with qlproxy. qlproxyd needs to start first, otherwise squid can fail. Even though both programs were managed by supervisord, there was no way to ensure this. I needed to write a start script for squid that made it wait for the qlproxyd process. Adding the start script resulted in the orphaned process problem described in flaw 2.
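A possible way to get the wait without a separate start script is to inline it in the command and finish with exec. This is a sketch, not what I actually ran: the squid path, the pgrep check and squid's -N (no daemon mode) flag are all assumptions, and checking that a process exists is not the same as checking that it is ready.

```ini
; Sketch only: paths, the pgrep check and the -N flag are assumptions.
[program:squid]
; Poll until a process named qlproxyd exists, then exec squid so no wrapper shell is left behind.
; There is deliberately no space before each ';', since a space followed by ';' starts a comment.
command=/bin/sh -c "until pgrep -x qlproxyd >/dev/null; do sleep 1; done; exec /usr/sbin/squid -N"
```

Because the shell is replaced by squid via exec, supervisord ends up managing squid directly rather than a parent shell with squid as its child.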

  4. supervisord doesn't allow you to control the delay between startretries

Sometimes when a process fails to start (or crashes), it's because it can't get access to another resource, possibly due to a network wobble. supervisord can be set to restart the process a number of times. Between restarts the process enters a "BACKOFF" state, but there's no documentation of, or control over, the duration of the backoff.
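The settings that do exist govern how many attempts are made and what counts as a successful start, not the delay between attempts. A sketch with a hypothetical program name and illustrative values:

```ini
; Sketch only: "worker" and its path are hypothetical; the values are illustrative.
[program:worker]
command=/usr/local/bin/worker
; The process must stay up this many seconds to count as successfully started.
startsecs=5
; Number of start attempts before supervisord gives up and marks the process FATAL.
startretries=10
autorestart=true
```

Raising startretries gives a flaky dependency more time to come back, but it is a blunt substitute for a configurable delay.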

In its defence, supervisord does meet our needs 80% of the time. The configuration is sensible and the documentation is pretty good.