Unicorn Eating Memory

Krishna Prasad Varma · Nov 29, 2011 · Viewed 8.5k times

I have an m1.small instance on Amazon with 8 GB of hard disk space, on which my Rails application runs. It runs smoothly for 2 weeks, and after that it crashes, saying the memory is full. The app runs on Rails 3.1.1, unicorn and nginx.

I simply don't understand what is taking up 13G.
I killed unicorn, and the 'free' command then showed some free memory, while df still reported 100%.
I rebooted the instance and everything started working fine.

free (before killing unicorn)

             total       used       free     shared    buffers     cached  
Mem:       1705192    1671580      33612          0     321816     405288  
-/+ buffers/cache:     944476     760716   
Swap:       917500      50812     866688 

df -l (before killing unicorn)

Filesystem           1K-blocks      Used Available Use% Mounted on  
/dev/xvda1             8256952   7837520         4 100% /  
none                    847464       120    847344   1% /dev  
none                    852596         0    852596   0% /dev/shm  
none                    852596        56    852540   1% /var/run  
none                    852596         0    852596   0% /var/lock  
/dev/xvda2           153899044    192068 145889352   1% /mnt  
/dev/xvdf             51606140  10276704  38707996  21% /data  

sudo du -hc --max-depth=1 (before killing unicorn)

28K     ./root
6.6M    ./etc
4.0K    ./opt
9.7G    ./data
1.7G    ./usr
4.0K    ./media
du: cannot access `./proc/27220/task/27220/fd/4': No such file or directory
du: cannot access `./proc/27220/task/27220/fdinfo/4': No such file or directory
du: cannot access `./proc/27220/fd/4': No such file or directory
du: cannot access `./proc/27220/fdinfo/4': No such file or directory
0       ./proc
14M     ./boot
120K    ./dev
1.1G    ./home
66M     ./lib
4.0K    ./selinux
6.5M    ./sbin
6.5M    ./bin
4.0K    ./srv
148K    ./tmp
16K     ./lost+found
20K     ./mnt
0       ./sys
253M    ./var
13G     .
13G     total
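
A note on reading this listing: du was run from / without limiting it to one filesystem, so the 13G total includes ./data, which df shows as a separate device (/dev/xvdf). GNU du's -x option would skip other mounts. The Ruby sketch below is only a rough cross-check, not part of the original setup: it adds up the du entries that live on /dev/xvda1 and compares them with the df "Used" figure for /. The names parse_size, ROOT_FS_DU and DF_USED_KB are made up for this illustration, and the small tmpfs mounts under ./var and ./dev are ignored.

```ruby
# Byte multipliers for the suffixes that `du -h` printed in the listing.
UNITS = { "K" => 1024, "M" => 1024**2, "G" => 1024**3 }.freeze

# Convert a human-readable du size such as "6.6M" or "0" into bytes.
def parse_size(text)
  match = /\A(\d+(?:\.\d+)?)([KMG])?\z/.match(text)
  raise ArgumentError, "unrecognised size: #{text}" unless match
  (match[1].to_f * UNITS.fetch(match[2], 1)).round
end

# Top-level directories from the du listing that live on / (/dev/xvda1).
# ./data (/dev/xvdf) and ./mnt (/dev/xvda2) are separate mounts according to df;
# ./dev, ./proc and ./sys are virtual, so they are left out as well.
ROOT_FS_DU = {
  "./root" => "28K", "./etc" => "6.6M", "./opt" => "4.0K", "./usr" => "1.7G",
  "./media" => "4.0K", "./boot" => "14M", "./home" => "1.1G", "./lib" => "66M",
  "./selinux" => "4.0K", "./sbin" => "6.5M", "./bin" => "6.5M", "./srv" => "4.0K",
  "./tmp" => "148K", "./lost+found" => "16K", "./var" => "253M"
}.freeze

# "Used" column for /dev/xvda1 from df -l, in 1K blocks.
DF_USED_KB = 7_837_520

du_bytes  = ROOT_FS_DU.values.sum { |size| parse_size(size) }
df_bytes  = DF_USED_KB * 1024
gap_bytes = df_bytes - du_bytes

gib = 1024.0**3
puts format("du accounts for %.2f GiB on /", du_bytes / gib)
puts format("df reports      %.2f GiB used on /", df_bytes / gib)
puts format("unaccounted     %.2f GiB", gap_bytes / gib)
```

Because du -h rounds each entry, the result is approximate, but it shows that several GiB used on / do not appear anywhere in the directory tree that du walked.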

free (after killing unicorn)

             total       used       free     shared    buffers     cached    
Mem:       1705192     985876     **719316**          0     365536     228576    
-/+ buffers/cache:     391764    1313428    
Swap:       917500      46176     871324  

df -l (after killing unicorn)

Filesystem           1K-blocks      Used Available Use% Mounted on  
/dev/xvda1             8256952   7837516         8 100% /  
none                    847464       120    847344   1% /dev  
none                    852596         0    852596   0% /dev/shm  
none                    852596        56    852540   1% /var/run  
none                    852596         0    852596   0% /var/lock  
/dev/xvda2           153899044    192068 145889352   1% /mnt  
/dev/xvdf             51606140  10276704  38707996  21% /data  
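
One pattern that would fit df staying at 100% while du finds far less, and a reboot clearing it, is files that were deleted while some process still had them open: on Linux the blocks are only released when the last descriptor is closed, and du cannot see such files. Whether that is what happened here is an assumption. A minimal Ruby sketch to check for it, assuming a Linux /proc and enough privileges (typically root) to read other processes' fd directories; deleted_link? and deleted_open_files are hypothetical helper names:

```ruby
# True if a /proc/<pid>/fd symlink target refers to an unlinked file.
# The Linux kernel appends " (deleted)" to the target in that case.
def deleted_link?(target)
  target.end_with?(" (deleted)")
end

# Scan every process's open descriptors and collect the deleted ones.
# Returns an array of hashes: { pid:, path:, bytes: }.
def deleted_open_files(proc_root = "/proc")
  Dir.glob(File.join(proc_root, "[0-9]*", "fd", "*")).map do |fd_path|
    begin
      target = File.readlink(fd_path)
      next unless deleted_link?(target)
      {
        pid:   fd_path.split("/")[-3].to_i,
        path:  target,
        bytes: File.stat(fd_path).size # stat follows the link to the open inode
      }
    rescue SystemCallError
      # The process exited or we lack permission; skip this descriptor.
      nil
    end
  end.compact
end

# Print the ten largest offenders, biggest first.
deleted_open_files.sort_by { |f| -f[:bytes] }.first(10).each do |f|
  puts format("%12d bytes  pid %-6d %s", f[:bytes], f[:pid], f[:path])
end
```

If a large log file shows up in this list, restarting or signalling the process that holds it open releases the space without a reboot.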

unicorn.rb

rails_env = 'production'  

working_directory "/home/user/app_name"  
worker_processes 5  
preload_app true  
timeout 60  

rails_root = "/home/user/app_name"  
listen "#{rails_root}/tmp/sockets/unicorn.sock", :backlog => 2048  
# listen 3000, :tcp_nopush => false  

pid "#{rails_root}/tmp/pids/unicorn.pid"  
stderr_path "#{rails_root}/log/unicorn/unicorn.err.log"  
stdout_path "#{rails_root}/log/unicorn/unicorn.out.log"  

GC.copy_on_write_friendly = true if GC.respond_to?(:copy_on_write_friendly=)  

before_fork do |server, worker|  
  ActiveRecord::Base.connection.disconnect!  

  ##  
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and  
  # immediately start loading up a new version of itself (loaded with a new  
  # version of our app). When this new Unicorn is completely loaded  
  # it will begin spawning workers. The first worker spawned will check to  
  # see if an .oldbin pidfile exists. If so, this means we've just booted up  
  # a new Unicorn and need to tell the old one that it can now die. To do so  
  # we send it a QUIT.  
  #  
  # Using this method we get 0 downtime deploys.  

  old_pid = "#{rails_root}/tmp/pids/unicorn.pid.oldbin"  
  if File.exist?(old_pid) && server.pid != old_pid
    begin  
      Process.kill("QUIT", File.read(old_pid).to_i)  
    rescue Errno::ENOENT, Errno::ESRCH  
      # someone else did our job for us  
    end  
  end  
end  


after_fork do |server, worker|  
  ActiveRecord::Base.establish_connection  
  worker.user('rails', 'rails') if Process.euid == 0 && rails_env == 'production'  
end  

Answer

Kazuki Ohta · Nov 18, 2012

I've just released the 'unicorn-worker-killer' gem. It lets you kill Unicorn workers based on 1) a maximum number of requests and 2) process memory size (RSS), without affecting the request being served.

It's really easy to use, and no external tool is required. First, add this line to your Gemfile.

gem 'unicorn-worker-killer'

Then add the following lines to your config.ru.

# Unicorn self-process killer
require 'unicorn/worker_killer'

# Max requests per worker
use Unicorn::WorkerKiller::MaxRequests, 10240 + Random.rand(10240)

# Max memory size (RSS) per worker
use Unicorn::WorkerKiller::Oom, (96 + Random.rand(32)) * 1024**2

It's highly recommended to randomize the threshold to avoid killing all workers at once.
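
To see what the randomized expressions above produce, here is a small standalone Ruby sketch. It does not use the gem; max_requests_threshold and oom_threshold_bytes are hypothetical helpers that simply repeat the arithmetic from the config.ru lines, and the seeded Random is only there to make the run repeatable.

```ruby
# Same arithmetic as the MaxRequests line: an integer in 10240...20480.
def max_requests_threshold(rng = Random.new)
  10240 + rng.rand(10240)
end

# Same arithmetic as the Oom line: 96 to 127 MiB, expressed in bytes.
def oom_threshold_bytes(rng = Random.new)
  (96 + rng.rand(32)) * 1024**2
end

# Draw one pair of thresholds per worker (5 workers, as in unicorn.rb).
rng = Random.new(2012)
5.times do |worker|
  requests = max_requests_threshold(rng)
  mem_mib  = oom_threshold_bytes(rng) / 1024**2
  puts "worker #{worker}: restart after #{requests} requests or above #{mem_mib} MiB RSS"
end
```

With a fixed threshold, workers that started together and receive similar traffic would tend to hit the limit at about the same time; spreading the limits over a range staggers the restarts.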