Understanding NodeJS & Non-Blocking IO

Vishwas Shashidhar · Aug 4, 2013

So, I've recently been infected with the Node virus, which is spreading through the programming world very fast.

I am fascinated by its "Non-Blocking IO" approach and have tried out a couple of programs myself.

However, I fail to understand certain concepts at the moment.

I need answers in layman's terms (I'm coming from a Java background).

1. Multithreading & Non-Blocking IO.

Let's consider a practical scenario. Say, we have a website where users can register. Below would be the code.

..
..
   // Read HTTP Parameters
   // Do some Database work
   // Do some file work
   // Return a confirmation message
..
..

In a traditional programming language, the above happens sequentially. If there are multiple requests for registration, the web server creates a new thread for each one, and the rest is history. Of course, programmers can create threads of their own to work on Line 2 and Line 3 simultaneously.

In Node, as I understand it, lines 2 & 3 run in parallel while the rest of the program executes, and the interpreter polls lines 2 & 3 every 'x' ms.

Now, my question is: if Node is single-threaded, what does the work of lines 2 & 3 while the rest of the program is executing?

2. Scalability

I recently read that LinkedIn has adopted Node as the back end for its mobile apps and has seen massive improvements.

Can anyone explain how it has made such a difference?

3. Adoption in other programming languages

If people claim that Node makes a big difference in performance, why haven't other programming languages adopted this Non-Blocking IO paradigm?

I'm sure I'm missing something. It would be helpful if you could explain this and point me to some links.

Thanks.

Answer

Nathaniel Travis · Aug 4, 2013

A similar question was asked and probably contains all the info you're looking for: How the single threaded non blocking IO model works in Node.js

But I'll briefly cover your 3 parts:

1.
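Lines 2 and 3 in a very simple form could look like this (db.query is a stand-in for whatever database client you use; Node callbacks conventionally receive an error as their first argument):
      db.query(..., function(err, query_data) { ... });
      fs.readFile('/path/to/file', function(err, file_data) { ... });

Here function(err, query_data) and function(err, file_data) are callbacks. db.query and fs.readFile send the actual I/O requests and return immediately; the callbacks let the processing of the data from the database or the file be delayed until the responses arrive.

Node doesn't really "poll lines 2 and 3". The callbacks are registered with an event loop and associated with file descriptors for their respective I/O events. The event loop then polls those descriptors to see whether they are ready for I/O, and when they are, it runs the callback functions with the I/O data. Strictly speaking, that describes network sockets; file system calls such as fs.readFile are carried out on a small internal thread pool inside Node (libuv), and the callback still runs on the main thread once the work completes. Either way, the actual work for lines 2 and 3 is done by the operating system and those helper threads, not by your JavaScript.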
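
To see that an I/O call returns before its callback runs, here is a minimal sketch. The demo file name and the readWithoutBlocking helper are made up for illustration; only fs.readFile, fs.writeFileSync, os.tmpdir and path.join are real Node APIs.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical demo file, created here so the example is self-contained.
const demoFile = path.join(os.tmpdir(), 'node-nonblocking-demo.txt');
fs.writeFileSync(demoFile, 'hello from disk');

// Hypothetical helper: records the order in which things happen.
function readWithoutBlocking(filePath, done) {
  const order = [];

  order.push('request sent');
  fs.readFile(filePath, 'utf8', function (err, fileData) {
    // Runs later, from the event loop, once the read has finished.
    if (err) return done(err);
    order.push('callback ran');
    done(null, order, fileData);
  });
  // readFile has already returned; the file contents are not here yet.
  order.push('code after readFile ran');
}

readWithoutBlocking(demoFile, function (err, order, fileData) {
  if (err) throw err;
  console.log(order.join(' -> '));
  console.log('file contents: ' + fileData);
});
```

The first line printed is "request sent -> code after readFile ran -> callback ran": the code after the readFile call always runs before the callback does.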

I think the phrase "Everything runs in parallel except your code" sums it up well. For example, something like "Read HTTP parameters" executes sequentially, but I/O functions like those in lines 2 and 3 are given callbacks that are added to the event loop and executed later. So the whole point is that your code doesn't have to wait for I/O.
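
As a rough sketch of the registration flow from the question, assuming timers as stand-ins for the real database and file system: fakeDbQuery, fakeFileWrite and register are hypothetical names, not part of Node or any library.

```javascript
// Hypothetical stand-ins for real I/O. Each one "responds" after a short
// delay, the way a database or disk answers some time after being asked.
function fakeDbQuery(sql, callback) {
  setTimeout(function () { callback(null, { inserted: 1 }); }, 50);
}
function fakeFileWrite(name, callback) {
  setTimeout(function () { callback(null); }, 50);
}

function register(params, done) {
  // Line 1: ordinary code, runs immediately and sequentially.
  const log = ['read HTTP parameters for ' + params.name];
  let pending = 2;
  let failed = false;

  function finishOne(err, label) {
    if (failed) return;
    if (err) { failed = true; return done(err); }
    log.push(label);
    pending -= 1;
    if (pending === 0) {
      // Line 4: runs only after both responses are back.
      log.push('return confirmation');
      done(null, log);
    }
  }

  // Line 2: send the database request and move on without waiting.
  fakeDbQuery('INSERT ...', function (err) { finishOne(err, 'database work done'); });
  log.push('database request sent');

  // Line 3: send the file request while the database is still busy.
  fakeFileWrite('users.log', function (err) { finishOne(err, 'file work done'); });
  log.push('file request sent');
}

register({ name: 'alice' }, function (err, log) {
  if (err) throw err;
  console.log(log.join('\n'));
});
```

Both requests are sent before either response arrives, so the two waits overlap instead of adding up. Your own JavaScript (the log.push calls and the callbacks) still runs one piece at a time.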

2.
Because of the things explained in 1., Node scales well for I/O-intensive requests and allows many users to be connected simultaneously. Your JavaScript runs on a single thread, so it doesn't necessarily scale well for CPU-intensive tasks: a long computation holds up every other request until it finishes.
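
The CPU-bound caveat can be demonstrated with a small sketch. busyWait and measureTimerDelay are hypothetical helpers; the busy loop stands in for any long computation.

```javascript
// Hypothetical CPU-bound task: spins on the main thread for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* burn CPU */ }
}

// Schedules a 10 ms timer, then hogs the thread for `blockMs`.
// Reports how long the timer actually took to fire.
function measureTimerDelay(blockMs, done) {
  const scheduledAt = Date.now();
  setTimeout(function () {
    done(Date.now() - scheduledAt);
  }, 10);
  busyWait(blockMs); // the event loop cannot run the timer until this returns
}

measureTimerDelay(200, function (delay) {
  console.log('timer asked for 10 ms, fired after ' + delay + ' ms');
});
```

The timer cannot fire until the busy loop gives the thread back, so the reported delay is at least the blocking time. In a server, every other connected user would be waiting for that same loop.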

3.
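This paradigm caught on with JavaScript because the language has first-class functions and closures, and was already built around callbacks and an event loop in the browser, which makes this style easy. Other languages do offer non-blocking I/O (Java NIO, Python's Twisted and Ruby's EventMachine are examples), but it isn't the default there and much of their existing library code blocks.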
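
A small sketch of why closures matter here, assuming setImmediate as a stand-in for an I/O call that completes later; handleRegistration is a hypothetical handler.

```javascript
// Hypothetical request handler.
function handleRegistration(username, respond) {
  const startedAt = Date.now();

  // setImmediate stands in for an I/O call whose result arrives later.
  setImmediate(function () {
    // By now handleRegistration has returned, yet username and startedAt
    // are still reachable here because the callback closes over them.
    const elapsed = Date.now() - startedAt;
    respond('Welcome, ' + username + ' (handled in ' + elapsed + ' ms)');
  });
}

// Two overlapping "requests": each callback keeps its own username.
handleRegistration('alice', function (msg) { console.log(msg); });
handleRegistration('bob', function (msg) { console.log(msg); });
```

Each callback carries its own request state with it, so no per-request thread or explicit context object is needed to remember who the response is for.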

I might be a little off, but this is the gist of what's happening.