numpy-like package for node

Claude · Jul 14, 2015 · Viewed 26.8k times

During my years of Python development, I've always been amazed at how much faster things become if you manage to rewrite code that loops through your ndarray element by element using numpy functions that work on the whole array at once. More recently I'm switching more and more to node, and I'm looking for something similar. So far I have turned up some things, none of which look promising:

  • scikit-node, runs scikit-learn in python, and interfaces with node. I haven't tried it, but I don't expect it gives me the cutting edge speed that I would like.
  • There are several JavaScript matrix libraries, some rather old and some newer (sylvester, gl-matrix, ...). I'm not sure they work well with matrices larger than 4x4 (the size most useful in 3D rendering), and they seem to be written in plain JavaScript (some, though I'm not sure about these, use WebGL acceleration). Great in the browser, not so much on node.

As far as I know, npm packages can be written in C++, so I'm wondering why there are no numpy-like libraries for node. Is there just not enough interest in node yet from the community that needs that kind of power? Is there a hope that ES6 features (list comprehensions) will allow JavaScript compilers to automatically vectorise native JS code to C++ speeds? Am I possibly missing something else?

Edit, in response to close-votes: Note, I'm not asking "what is the best package to do xyz". I'm just wondering whether there is a technical reason there is no package to do this on node, a social reason, or no reason at all and there is just a package I missed. To avoid too much opinionated criticism, here is a concrete question: I have about 10000 matrices that are 100 x 100 each. What's the best (* correction, a reasonably fast) way to add them together?

Edit 2: After some more digging, it turned out I was googling for the wrong thing. Google for "node.js scientific computing" and there are links to some very interesting notes:

Basically, as far as I understand now, no one has bothered so far. Also, since there are some major omissions in the JS TypedArrays (such as 64-bit ints), it might be hard to add good support using npm packages alone, without hacking the engine itself, which would defeat the purpose. Then again, I haven't researched this last statement further.

Answer

kgryte · May 18, 2017

No, there are no technical reasons why a numpy-like package does not exist for Node.js and, more generally, JavaScript.

There are two main obstacles preventing Node.js and JavaScript from achieving more mind share in the data science and numeric computing communities.

The first obstacle is community. While the JavaScript community is huge, the subset of people within that community doing interesting things in numeric computing is small. Hence, if you want to do numeric computing in JavaScript and Node.js, finding resources to help you along the way can be hard, and it may feel like a lonely endeavor.

The second obstacle is the absence of comparable libraries. This is a chicken-and-egg problem: libraries are needed to attract library authors, and authors are needed to write good libraries. There are no technical reasons why libraries cannot be written in JavaScript or leverage Node.js (e.g., via native add-ons). I know, as I have written many numeric computing libraries in JavaScript. So while numeric computing is possible in JavaScript, the problem stems from an inability to attract developers who have sufficient expertise and can put in the time and effort needed to write high-quality numeric computing implementations.
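
To make the "no technical reasons" point concrete, here is a minimal sketch of the workload from the question (summing 10000 matrices of 100 x 100 each) in plain JavaScript. It assumes each matrix is stored row-major in a flat Float64Array. The helper names (addInPlace, sumMatrices, constantMatrices) are hypothetical and do not come from any library, and the input matrices are synthetic stand-ins for real data.

```javascript
'use strict';

// Hypothetical helper: add `src` into `acc` element-wise, in place.
function addInPlace(acc, src) {
  if (acc.length !== src.length) {
    throw new RangeError('matrices must have the same number of elements');
  }
  for (let i = 0; i < acc.length; i++) {
    acc[i] += src[i];
  }
  return acc;
}

// Hypothetical helper: sum an iterable of equally sized matrices,
// each stored row-major in a flat typed array (an assumption of this sketch).
function sumMatrices(matrices, rows, cols) {
  const acc = new Float64Array(rows * cols); // typed arrays start zero-filled
  for (const m of matrices) {
    addInPlace(acc, m);
  }
  return acc;
}

// Synthetic input: matrix k has every element equal to k.
// A single scratch buffer is reused for every matrix, so memory use stays
// constant no matter how many matrices are generated.
function* constantMatrices(count, rows, cols) {
  const scratch = new Float64Array(rows * cols);
  for (let k = 0; k < count; k++) {
    scratch.fill(k);
    yield scratch;
  }
}

const total = sumMatrices(constantMatrices(10000, 100, 100), 100, 100);
console.log(total.length, total[0]);
```

Whether a loop like this is fast enough depends on the engine and the data. A native add-on or a tuned library may well do better, but nothing in the language prevents the computation.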

Regarding the specific language features mentioned in the OP:

  • ES6/ES2015: none of the recent language additions help or hinder development of numeric computing libraries in JavaScript. Potential additions like list comprehensions will not be game changers either. The one change to the web platform which will make a difference is WebAssembly. With WebAssembly, compiling C/C++/Fortran libraries to run in web browsers will be made easier. At the time of this answer, WebAssembly looks to be the means for bringing SIMD to the web, potentially allowing some speed-ups, although the focus seems to be on short SIMD, rather than long. But even with WebAssembly, porting numeric computing libraries to the web will not be as simple as hitting the compile button. Numeric computing code bases will need to be massaged to become amenable for use on the web, and, even then, higher-level APIs will likely need to be written to mask some of the lower-level features, such as manually managing the heap.
  • Native add-ons: yes, node modules can be written as native add-ons, allowing C/C++/Fortran code to be used within a Node.js application. Individuals have written libraries to this end; for example, see stdlib. If done well, Node.js can perform numeric computations at speeds comparable to directly using native implementations.
  • Typed arrays: as they are now, they are suitable for numeric computation. Similar to C, you can create pooled buffers, which allow for efficient memory reuse and better performance. Furthermore, similar to languages like R, Python, and Julia, you can leverage typed arrays to create ndarray (aka strided array) interfaces. While U/Int64 integer arrays are not available at the time of this answer, (a) their absence is not a show stopper and (b) proposals are advancing at the specification level to add U/Int64 integer arrays to JavaScript. Ditto for complex numbers with structured types.
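
The ndarray idea in the last bullet can be sketched in a few lines. This is a rough illustration only, assuming a 2-D view with element strides over a Float64Array. The ndarray2d helper is hypothetical and is not the API of stdlib or any other package.

```javascript
'use strict';

// Hypothetical helper: a minimal 2-D strided view over a typed array.
// `strides` are given in elements (not bytes); `offset` is the index of
// element (0, 0) within `data`.
function ndarray2d(data, shape, strides, offset = 0) {
  const index = (i, j) => offset + i * strides[0] + j * strides[1];
  return {
    data,
    shape,
    strides,
    offset,
    get(i, j) {
      return data[index(i, j)];
    },
    set(i, j, value) {
      data[index(i, j)] = value;
    },
    // Transposing swaps shape and strides; the underlying buffer is shared,
    // so no data is copied.
    transpose() {
      return ndarray2d(data, [shape[1], shape[0]], [strides[1], strides[0]], offset);
    }
  };
}

// A 2 x 3 row-major matrix: [[1, 2, 3], [4, 5, 6]]
const buffer = new Float64Array([1, 2, 3, 4, 5, 6]);
const a = ndarray2d(buffer, [2, 3], [3, 1]);
const t = a.transpose(); // 3 x 2 view over the same buffer

t.set(1, 1, 50); // writes through to a(1, 1)
console.log(a.get(1, 1), t.get(2, 0));
```

Because the view only stores a shape, strides, and an offset, operations such as transposing or slicing can be expressed without copying the buffer, which is the same design that numpy's ndarray uses.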

My personal belief is that some form of numeric computing is inevitable in JavaScript and Node.js. The advantages (ubiquity, distribution, performance) and potential applications (edge computing, integrating machine learning, data visualization) are evolutionary forces too strong for the platform not to support data science applications, at least at a basic level.

Disclosure: I and others are currently working on a project (https://github.com/stdlib-js/stdlib) which aims to provide numeric computing facilities in JavaScript and Node.js.