Handle long-running processes in NodeJS?

U Avalos · Oct 6, 2015 · Viewed 19k times

I've seen some older posts touching on this topic but I wanted to know what the current, modern approach is.

The use case is: (1) assume you want to run a long task, say processing a video file or a `jspm install`, that can take up to 60 seconds. (2) you can NOT subdivide the task.

Other requirements include:

  • need to know when a task finishes
  • nice to be able to stop a running task
  • stability: if one task dies, it doesn't bring down the server
  • needs to be able to handle 100s of simultaneous requests

I've seen several solutions mentioned in older posts, but which is the modern, standards-based approach? Also, if nodejs isn't suited for this type of task, then that's also a valid answer.

Answer

U Avalos · Oct 10, 2015

The short answer is: it depends.

If you mean a plain nodejs server, then the answer is no for this use case: nodejs's single-threaded event loop can't handle CPU-bound tasks, so it makes sense to outsource the work to another process or thread. And since in this use case each CPU-bound task runs for a long time, it also makes sense to queue the tasks rather than launch them all at once... i.e., it makes sense to use a worker queue.

However, for this particular use case of running JS code (the jspm API), it makes sense to use a worker queue that itself runs on nodejs. Hence, the solution is: (1) a nodejs server that does nothing but enqueue tasks in the worker queue, and (2) a nodejs worker queue (like kue) that does the actual work, using cluster to spread the workers across the CPUs. The result is a simple, single server that can handle hundreds of requests without choking. (Well, almost; see the note below...)

Note:

  • the above solution uses processes. I did not investigate thread-based solutions because threads seem to have fallen out of favor for node.
  • the worker queue + cluster gives you the equivalent of a thread pool.
  • yea, in the worst case, the 100th parallel request will take about 25 minutes to complete on a 4-core machine (100 tasks × 60 seconds ÷ 4 workers). The solution is to spin up another worker-queue server (if I'm not mistaken, with a Redis-backed worker queue like kue this is trivial: just make each server point to the same Redis instance).
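If I'm reading kue's docs right, "pointing each server at the same instance" is just a connection option passed to `createQueue` on every box (a hedged config sketch; the host name is made up):

```javascript
const kue = require('kue');

// Every web server and every worker box creates its queue against the
// same Redis instance, so they all share one job list.
const queue = kue.createQueue({
  redis: { host: 'redis.internal.example', port: 6379 }
});
```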