I know Rails 5 ships with Puma (which we're using) and will look for RAILS_MAX_THREADS as an environment variable or default to 5 threads, but I'm receiving timeout errors with the default value. I looked at my database and found its max connections is a few thousand.
It may be silly, but is this something Puma will set automatically and scale for, depending on its settings, or do I need to explicitly set this in the environment variables? If it needs to be manually set, what would be a good value for RAILS_MAX_THREADS?
I've found the following helpful, but I'm not fully grasping the scalability part:
https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server https://devcenter.heroku.com/articles/concurrency-and-database-connections
Puma actually has two parameters: the number of threads and the number of workers. If we slightly change the default puma.rb, it will look like this:
# WORKERS_NUM is not a default env variable name
workers Integer(ENV['WORKERS_NUM'] || 1)
max_threads_count = Integer(ENV['RAILS_MAX_THREADS'] || 1)
min_threads_count = max_threads_count
threads min_threads_count, max_threads_count
The number of workers is the number of separate processes that Puma spawns for you. Usually, it is a good idea to set it equal to the number of processor cores you have on your server. You could spawn more of them to allow more requests to be processed simultaneously, but workers create additional memory overhead, since each worker spins up a copy of your Rails app. That is why you would usually use threads to achieve higher throughput.
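As a rough sketch of the "one worker per core" rule, the worker count could fall back to the core count when WORKERS_NUM is not set. The helper name default_worker_count is made up for illustration and is not part of Puma; Etc.nprocessors comes from Ruby's standard library and may not reflect CPU limits inside a container.

```ruby
require 'etc'

# Hypothetical helper (not part of Puma): take the worker count from
# WORKERS_NUM, falling back to the number of CPU cores Ruby can see.
def default_worker_count(env = ENV)
  Integer(env['WORKERS_NUM'] || Etc.nprocessors)
end

# In config/puma.rb this could be used as:
#   workers default_worker_count
```

Passing the environment in as an argument keeps the helper easy to try out in irb with a plain hash.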
RAILS_MAX_THREADS is a way to set the number of threads each of your workers will use under the hood. In the example above, min_threads_count is equal to max_threads_count, so the number of threads is constant. If you set them to different values, the thread count will scale from the min to the max, but I haven't seen that in the wild.
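If you did want separate bounds, a minimal sketch could read them from two variables. RAILS_MIN_THREADS is an assumed variable name here, and thread_bounds is a hypothetical helper, not something Puma provides.

```ruby
# Hypothetical helper: read a [min, max] thread pair from the environment.
# RAILS_MIN_THREADS is an assumed name; when it is unset, min equals max,
# which reproduces the constant thread count from the config above.
def thread_bounds(env = ENV)
  max = Integer(env['RAILS_MAX_THREADS'] || 5)
  min = Integer(env['RAILS_MIN_THREADS'] || max)
  raise ArgumentError, 'min threads must not exceed max threads' if min > max

  [min, max]
end

# In config/puma.rb this could be used as:
#   threads(*thread_bounds)
```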
There are several reasons to limit the number of threads – your interpreter and response times:
There was also an argument that slow IO (e.g. calls to external services, or generating large files on the fly) blocks the Ruby process and doesn't allow context switching, but it turns out not to be true: http://yehudakatz.com/2010/08/14/threads-in-ruby-enough-already/. Still, optimizing your architecture to do as much work as possible in the background is always a good idea.
This answer will help you to find out a perfect combination of the number of threads vs the number of workers given your hardware.
This shows how the benchmarking could be done to find the exact numbers.
To sum up:
WORKERS_NUM multiplied by RAILS_MAX_THREADS gives you the maximum number of simultaneous requests that Puma can process. If the number is too low, your users will see timeouts during load spikes. To achieve the best performance, assuming you use MRI, set WORKERS_NUM to the number of cores and find the optimal RAILS_MAX_THREADS based on the average response time during performance tests.
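To tie this back to the database limit from the question, here is a small sketch of the arithmetic. Both helpers are hypothetical, and the numbers are made up. It assumes each Puma thread may hold one database connection, which matches a generated Rails 5 database.yml, where the per-process pool size defaults to RAILS_MAX_THREADS.

```ruby
# Hypothetical helper: total request threads across all Puma workers.
def puma_concurrency(workers:, threads:)
  workers * threads
end

# Hypothetical helper: true when every thread could hold its own database
# connection without exceeding the database's connection limit.
def fits_database?(workers:, threads:, db_max_connections:)
  puma_concurrency(workers: workers, threads: threads) <= db_max_connections
end

# Example with made-up numbers: 4 workers with 5 threads each.
puts puma_concurrency(workers: 4, threads: 5) # prints 20
```

With a database limit in the thousands, as in the question, a single server is unlikely to hit it. The limit matters more once several servers or background job processes share the same database.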