Understanding the Event Loop

Tarik picture Tarik · Feb 6, 2014 · Viewed 50k times · Source

I am thinking about it and this is what I came up with:

Let's see this code below:

console.clear();
console.log("a");
setTimeout(function(){console.log("b");},1000);
console.log("c");
setTimeout(function(){console.log("d");},0);

A request comes in, and JS engine starts executing the code above step by step. The first two calls are sync calls. But when it comes to setTimeout method, it becomes an async execution. But JS immediately returns from it and continue executing, which is called Non-Blocking or Async. And it continues working on other etc.

The results of this execution is the following:

a c d b

So basically the second setTimeout got finished first and its callback function gets executed earlier than the first one and that makes sense.

We are talking about single-threaded application here. JS Engine keeps executing this and unless it finishes the first request, it won't go to second one. But the good thing is that it won't wait for blocking operations like setTimeout to resolve so it will be faster because it accepts the new incoming requests.

But my questions arises around the following items:

#1: If we are talking about a single-threaded application, then what mechanism processes setTimeouts while the JS engine accepts more requests and executes them? How does the single thread continue working on other requests? What works on setTimeout while other requests keep coming in and get executed.

#2: If these setTimeout functions get executed behind the scenes while more requests are coming in and being executed, what carries out the async executions behind the scenes? What is this thing that we talk about called the EventLoop?

#3: But shouldn't the whole method be put in the EventLoop so that the whole thing gets executed and the callback method gets called? This is what I understand when talking about callback functions:

function downloadFile(filePath, callback)
{
   blah.downloadFile(filePath);
   callback();
}

But in this case, how does the JS Engine know if it is an async function so that it can put the callback in the EventLoop? Perhaps something like the async keyword in C# or some sort of an attribute which indicates the method JS Engine will take on is an async method and should be treated accordingly.

#4: But an article says quite contrary to what I was guessing on how things might be working:

The Event Loop is a queue of callback functions. When an async function executes, the callback function is pushed into the queue. The JavaScript engine doesn't start processing the event loop until the code after an async function has executed.

#5: And there is this image here which might be helpful but the first explanation in the image is saying exactly the same thing mentioned in question number 4:

enter image description here

So my question here is to get some clarifications about the items listed above?

Answer

Peter Lyons picture Peter Lyons · Feb 6, 2014

1: If we are talking about a single-threaded application, then what processes setTimeouts while JS engine accepts more requests and executes them? Isn't that single thread will continue working on other requests? Then who is going to keep working on setTimeout while other requests keep coming and get executed.

There's only 1 thread in the node process that will actually execute your program's JavaScript. However, within node itself, there are actually several threads handling operation of the event loop mechanism, and this includes a pool of IO threads and a handful of others. The key is the number of these threads does not correspond to the number of concurrent connections being handled like they would in a thread-per-connection concurrency model.

Now about "executing setTimeouts", when you invoke setTimeout, all node does is basically update a data structure of functions to be executed at a time in the future. It basically has a bunch of queues of stuff that needs doing and every "tick" of the event loop it selects one, removes it from the queue, and runs it.

A key thing to understand is that node relies on the OS for most of the heavy lifting. So incoming network requests are actually tracked by the OS itself and when node is ready to handle one it just uses a system call to ask the OS for a network request with data ready to be processed. So much of the IO "work" node does is either "Hey OS, got a network connection with data ready to read?" or "Hey OS, any of my outstanding filesystem calls have data ready?". Based upon its internal algorithm and event loop engine design, node will select one "tick" of JavaScript to execute, run it, then repeat the process all over again. That's what is meant by the event loop. Node is basically at all times determining "what's the next little bit of JavaScript I should run?", then running it. This factors in which IO the OS has completed, and things that have been queued up in JavaScript via calls to setTimeout or process.nextTick.

2: If these setTimeout will get executed behind the scenes while more requests are coming and in and being executed, the thing carry out the async executions behind the scenes is that the one we are talking about EventLoop?

No JavaScript gets executed behind the scenes. All the JavaScript in your program runs front and center, one at a time. What happens behind the scenes is the OS handles IO and node waits for that to be ready and node manages its queue of javascript waiting to execute.

3: How can JS Engine know if it is an async function so that it can put it in the EventLoop?

There is a fixed set of functions in node core that are async because they make system calls and node knows which these are because they have to call the OS or C++. Basically all network and filesystem IO as well as child process interactions will be asynchronous and the ONLY way JavaScript can get node to run something asynchronously is by invoking one of the async functions provided by the node core library. Even if you are using an npm package that defines it's own API, in order to yield the event loop, eventually that npm package's code will call one of node core's async functions and that's when node knows the tick is complete and it can start the event loop algorithm again.

4 The Event Loop is a queue of callback functions. When an async function executes, the callback function is pushed into the queue. The JavaScript engine doesn't start processing the event loop until the code after an async function has executed.

Yes, this is true, but it's misleading. The key thing is the normal pattern is:

//Let's say this code is running in tick 1
fs.readFile("/home/barney/colors.txt", function (error, data) {
  //The code inside this callback function will absolutely NOT run in tick 1
  //It will run in some tick >= 2
});
//This code will absolutely also run in tick 1
//HOWEVER, typically there's not much else to do here,
//so at some point soon after queueing up some async IO, this tick
//will have nothing useful to do so it will just end because the IO result
//is necessary before anything useful can be done

So yes, you could totally block the event loop by just counting Fibonacci numbers synchronously all in memory all in the same tick, and yes that would totally freeze up your program. It's cooperative concurrency. Every tick of JavaScript must yield the event loop within some reasonable amount of time or the overall architecture fails.