I have a Flask app running under Gunicorn, using the sync worker type with 20 worker processes. The app reads a lot of data on startup, which takes time and uses memory. Worse, each process loads its own copy, so startup takes even longer and uses 20X the memory. The data is static and doesn't change. I'd like to load it once and have all 20 workers share it.
If I use the preload_app setting, the data only loads once (in the master process, before the workers are forked) and initially only takes 1X memory, but then seems to balloon to 20X once requests start coming in. I need fast random access to the data, so I'd rather not do IPC.
Is there any way to share static data among Gunicorn processes?
Memory-mapped files let you share pages between processes.
https://docs.python.org/2/library/mmap.html
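As a rough sketch of what that could look like (not a drop-in solution): write the static data to a file once, then have each worker map it read-only. The fixed-size int64 record layout, the file name, and the helper names `build_data_file`, `open_shared`, and `read_record` are all made up for illustration; your data will need its own on-disk format. The code is written for Python 3, where `mmap` objects support the buffer protocol.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: one little-endian int64 per record.
RECORD = struct.Struct("<q")


def build_data_file(path, values):
    """Write the static data once, e.g. in a step that runs before Gunicorn starts."""
    with open(path, "wb") as f:
        for value in values:
            f.write(RECORD.pack(value))


def open_shared(path):
    """Map the file read-only; call this in each worker (or once before forking)."""
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def read_record(mm, index):
    """Random access by record index, without reading the whole file."""
    (value,) = RECORD.unpack_from(mm, index * RECORD.size)
    return value


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "static_data.bin")
    build_data_file(path, range(1000))
    mm = open_shared(path)
    print(read_record(mm, 42))
    mm.close()
```

Because the mapping is file-backed and read-only, the OS should be able to serve every worker from the same physical pages, and reads are plain memory accesses rather than IPC. The trade-off is that the data has to live in a flat binary layout instead of ordinary Python objects.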
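Note that per-process memory consumption statistics are usually misleading and unhelpful: pages shared between processes are typically counted once for each process, so adding up the per-process numbers overstates the real usage. It is usually better to look at the output of vmstat and see if you are swapping a lot.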