Graceful shutdown of GenServer

Sheharyar picture Sheharyar · Sep 28, 2016 · Viewed 10.9k times · Source

I writing an Elixir app with GenServer that starts an external application on boot and shuts it down and does other clean-up on exit. I've added bootup functionality in the init/1 callback and cleanup code in the terminate/2 callback.

The init code works fine when the GenServer is started, and the terminate method is also called when the :stop signal is manually sent, but in the cases of unexpected shutdowns and interrupts (as in the case of hitting Ctrl+C) in IEx, the terminate code is not called.


Currently, I've gone over tons of forum threads, blog posts and documentation, including:

From Elixir Docs - GenServers:

If the GenServer receives an exit signal (that is not :normal) from any process when it is not trapping exits it will exit abruptly with the same reason and so not call terminate/2. Note that a process does NOT trap exits by default and an exit signal is sent when a linked process exits or its node is disconnected.

Therefore it is not guaranteed that terminate/2 is called when a GenServer exits. For such reasons, we usually recommend important clean-up rules to happen in separated processes either by use of monitoring or by links themselves.

but I have absolutely no idea how to get :init.stop, linked processes or anything else to work with this (since this is my first time with GenServers).


This is my code:

defmodule MyAwesomeApp do
  use GenServer

  def start do
    GenServer.start_link(__MODULE__, nil)
  end

  def init(state) do
    # Do Bootup stuff

    IO.puts "Starting: #{inspect(state)}"
    {:ok, state}
  end

  def terminate(reason, state) do
    # Do Shutdown Stuff

    IO.puts "Going Down: #{inspect(state)}"
    :normal
  end
end

MyAwesomeApp.start

Answer

sasajuric picture sasajuric · Sep 29, 2016

To increase chances of the terminate callback being invoked, the server process should trap exits. However, even with that, the callback might not be invoked in some situations (e.g. when the process is brutally killed, or when it crashes itself). For more details see here.

As mentioned, if you want to politely shutdown your system, you should invoke :init.stop, which will recursively shutdown the supervision tree causing terminate callbacks to be invoked.

As you noticed, there is no way of catching abrupt BEAM OS process exits from within. It's a self-defining property: the BEAM process terminates suddenly, so it can't run any code (since it terminated) 🙂. Hence, if BEAM is brutally terminated, the callback will not be invoked.

If you unconditionally want to do something when BEAM dies, you need to detect this from another OS process. I'm not sure what's your exact use case, but assuming you have some strong needs for this, then running another BEAM node, on the same (or another) machine, could work here. Then you could have one process on one node monitoring another process on another node, so you can react even if BEAM is brutally killed.

However, your life will be simpler if you don't need to unconditionally run some cleanup logic, so consider whether the code in terminate is a must, or rather a nice-to-have.