Posted by manning_pubs
on February 18, 2013 at 8:15 AM PST
Node in a Nutshell
by Alex Young and Marc Harter, authors of Node.js in Practice
We live in a world of highly connected multicore servers, where web applications are expected to scale from dozens of users to millions. The real-time nature of the modern web places new demands on developers, who are looking for fresh ways to address scalability, whether that means taking advantage of multiple CPU cores, coping with high I/O demands, or adapting programs to run on clusters of servers. This article, based on chapter 1 of Node.js in Practice, shows how Node fills a gap in the market by attacking the scalability problem head on.
By using an event-based model with a non-blocking core, Node is perfectly suited to the unpredictable nature of scaling I/O-bound applications.
Node has rapidly become a major platform for developing web applications and even UNIX and Windows programs. With support from cloud hosting providers and giants like Microsoft, Node’s future looks bright. The recent release of Node 0.8 has cemented several core features and improved support for Windows.
In this article, you’ll learn what Node is and some of its history. You’ll also learn about Node’s release cycle, what sets Node apart, and who’s using it. This should give you enough knowledge to decide whether Node is right for your projects.
Next we’ll take a brief look at Node, its runtime engine, and its main features.
What is Node?
Node’s module system is based on CommonJS Modules. Files can be loaded with require, and specific methods or objects can be exported using the exports object. Modules can be managed and shared by using npm, which is distributed alongside Node. A command-line tool is included which can be used to install, remove, upgrade, and search for modules. There’s also a website for npm that allows modules to be searched.
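As a minimal sketch of how require and exports fit together, the following script writes a small module to a temporary directory and then loads it. The file name greeter.js and its greet function are hypothetical names chosen for illustration; they are not part of Node.

```javascript
// Sketch only: greeter.js and greet() are made-up names for this illustration.
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway directory so the example is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'node-modules-demo-'));
const modulePath = path.join(dir, 'greeter.js');

// The module attaches whatever it wants to share to its exports object.
fs.writeFileSync(
  modulePath,
  "exports.greet = function (name) { return 'Hello, ' + name + '!'; };\n"
);

// require loads the file and returns that exports object.
const greeter = require(modulePath);
const greeting = greeter.greet('Node');
console.log(greeting);
```

In a real project the module would simply live in its own file next to the code that requires it; the temporary directory is only there so the sketch runs as a single script.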
Node is released in a stable/unstable cycle: even-numbered releases are stable, and odd-numbered releases are unstable. The latest stable version of Node is the 0.8 series, and the latest unstable release is 0.9. API changes between major versions of Node are relatively minimal.
Documentation on API changes can be found in the Node changelog, and on the project’s wiki.
In a typical programming language, an I/O operation blocks execution until it completes. Node’s asynchronous file and network APIs mean processing can still occur while these relatively slow I/O operations finish. For example, a network game server that broadcasts the game’s state to various players over TCP/IP sockets can perform background tasks, perhaps maintaining the game world, while it sends data to the players.
Figure 1 A complex loop locks the interpreter. Callbacks that would otherwise run asynchronously are completely blocked by the for-loop, even though they’re ready to run.
In figure 1, several callbacks are waiting to run due to events that have been triggered, but they can’t because a for-loop is madly iterating away. To get around this, programs must be designed to break up computationally intensive operations into smaller units of work that can be scaled.
Compare this to figure 2, which represents a refactored, event-based program.
Figure 2 A refactored, event-based program that has no for-loop blocking execution. Here callbacks are free to run when the resources they’re waiting for are ready.
Node and events are almost synonymous, and there’s a reason for that: events are integral to a well-designed Node program. In figure 2, the for-loop has been replaced with smaller chunks of work that can be scheduled alongside other event-based code. Since I/O should be non-blocking, event handlers can run while other code is waiting for I/O results. In general this means designing classes around small servers and EventEmitter, but the Node community has also been quick to adopt other approaches like the publish-subscribe pattern, which are supported by powerful backends of their own.
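One way to picture the refactoring from figure 1 to figure 2 is the sketch below. It replaces a single long for-loop with slices of work that are scheduled with setImmediate, and it reports progress through an EventEmitter. The function sumInChunks and its event names are hypothetical, and setImmediate is assumed to be available (it was added after the 0.8 series).

```javascript
const EventEmitter = require('events');

// sumInChunks is a hypothetical helper: it adds the integers 1..n in slices
// of chunkSize, handing control back to the event loop between slices.
function sumInChunks(n, chunkSize) {
  const emitter = new EventEmitter();
  let total = 0;
  let next = 1;

  function runChunk() {
    const end = Math.min(next + chunkSize - 1, n);
    for (; next <= end; next++) {
      total += next;
    }
    if (next > n) {
      emitter.emit('done', total);
    } else {
      emitter.emit('progress', next - 1);
      setImmediate(runChunk); // other callbacks can run before the next slice
    }
  }

  // Defer the first slice so callers can attach listeners first.
  setImmediate(runChunk);
  return emitter;
}

const job = sumInChunks(1000000, 100000);
job.on('progress', (count) => console.log('processed', count));
job.on('done', (total) => console.log('total', total));
```

Each slice still blocks while it runs, so the chunk size is a trade-off between throughput and how quickly other callbacks get a turn.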
Node also compensates for running JavaScript on a single thread by providing a cluster module that allows separate Node processes to work together. A program designed this way is well positioned to take advantage of multicore processors.
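A rough sketch of the idea, using Node’s cluster module, might look like the following. The helper startWorkers and the shape of the message are assumptions made for this example. Newer Node releases also expose the isMaster flag as isPrimary.

```javascript
const cluster = require('cluster');

// startWorkers is a hypothetical helper: it forks `count` workers, collects
// one message from each, and calls done once every worker has exited.
function startWorkers(count, done) {
  const replies = [];
  let exited = 0;

  for (let i = 0; i < count; i++) {
    const worker = cluster.fork();
    worker.on('message', (msg) => {
      replies.push(msg);
      worker.disconnect(); // the worker exits once its channel is closed
    });
    worker.on('exit', () => {
      exited += 1;
      if (exited === count) done(replies);
    });
  }
}

if (cluster.isMaster) {
  // The first process to start forks copies of this same script.
  startWorkers(2, (replies) => {
    console.log('replies from', replies.length, 'workers');
  });
} else {
  // Each forked copy runs this branch and reports back over the built-in channel.
  process.send({ pid: process.pid });
}
```

In a real server each worker would typically start listening on the same port instead of exiting, so that incoming connections are shared between the processes.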
What sets Node apart?
Node’s non-blocking core is libuv, a library created specifically for Node as an abstraction over IOCP on Windows and libev on UNIX. IOCP and libev provide low-level asynchronous I/O and are used as the basis for Node’s high-performance event loop. The libuv library has the following main features:
- Non-blocking TCP sockets and named pipes.
- Child process management.
- Asynchronous DNS and file system APIs.
- High-resolution time.
- Thread pool scheduling.
- File system events.
- IPC and socket sharing.
Who’s using it?
Node is used by large and small businesses alike, but its earliest adopters were enthusiastic open source developers.
LinkedIn Mobile has been using Node to power key parts of its mobile stack. Community favorite GitHub uses it to efficiently manage download requests, alongside its existing Python and Ruby-based architecture. Microsoft has embraced Node and supports it as part of the Azure platform. There are also companies that have built their success on Node and heavily contributed back to the community. One such example is LearnBoost.
As another example, consider a web application that sits in front of another API server. A traditional web application would implement this by receiving input from users, then making a synchronous request to the API server, then responding to the user by rendering a web page. Table 1 shows the states that such a web application will go through as a request is made.
Table 1 The typical states that exist in a blocking web application
||Web app state
|Make internal request
|Receive internal request
||Internal API server
|Respond to internal request
||Internal API server
|Receive internal request
|Respond to user
If the internal API uses HTTP, each of these blocking steps could be extremely slow, and the process can do nothing else while it waits. A well-written Node replacement would not block at any step of this process, so a single process can handle multiple requests at once. Node also comes with tools for running several processes, which means the Node solution could easily scale to take advantage of multiple CPU cores.
Node is an excellent choice for developing real-time proxy applications like this. That leads to other classes of applications that fit similar patterns where strong performance is desirable—statistics servers, game backends, on-the-fly data format conversion.
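To make the proxy scenario concrete, here is a small sketch in which a user-facing server forwards each request to an internal API server without blocking. Everything in it is an assumption made for illustration: the function names, the artificial 50 ms delay, and the response format. Both servers run in one process on ephemeral local ports so the example is self-contained.

```javascript
const http = require('http');
const HOST = '127.0.0.1';

// Stand-in for the internal API server; it answers after a short delay.
function createApiServer() {
  return http.createServer((req, res) => {
    setTimeout(() => {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ path: req.url }));
    }, 50);
  });
}

// Hypothetical helper: GET a path and collect the response body as text.
function fetchText(port, path, done) {
  http.get({ host: HOST, port, path, agent: false }, (res) => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => done(null, body));
  }).on('error', done);
}

// The user-facing app. While it waits for the API it is free to accept
// and process other requests.
function createFrontend(apiPort) {
  return http.createServer((req, res) => {
    fetchText(apiPort, req.url, (err, body) => {
      if (err) {
        res.writeHead(502);
        return res.end('Internal API unavailable');
      }
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('API said: ' + body);
    });
  });
}

// Start both servers, send two overlapping requests, then shut down.
function demo(done) {
  const api = createApiServer();
  api.listen(0, HOST, () => {
    const frontend = createFrontend(api.address().port);
    frontend.listen(0, HOST, () => {
      const results = [];
      let failed = false;
      ['/a', '/b'].forEach((path) => {
        fetchText(frontend.address().port, path, (err, body) => {
          if (failed) return;
          if (err) { failed = true; }
          else { results.push(body); }
          if (failed || results.length === 2) {
            frontend.close();
            api.close();
            done(err, results);
          }
        });
      });
    });
  });
}

demo((err, results) => {
  if (err) throw err;
  console.log(results);
});
```

Both requests are in flight at the same time, so the second one does not have to wait for the first to complete its round trip to the API server.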
Comparing Node to related technologies
Table 2 How can I use Node in my job?
| | | Node has built-in HTTP client and server modules that can be used to create simple web applications out of the box. |
| | File System, net, DNS, process, os | Node is ideal for creating programs that integrate with Unix systems, and for writing background daemons and servers for integration projects, network proxies, and so on. |
Node is particularly beneficial to developers who work with network-oriented software. HTTP-based services are only one aspect of Node development. If you work with custom networking protocols, perhaps related to secure messaging, VoIP, or game servers, then Node provides an attractive alternative to low-level languages like C and C++. Table 3 shows some of these benefits alongside the related Node modules.
Table 3 How can Node benefit me?
| | EventEmitter, Buffer, Stream, File System, net, DNS, HTTP | Most of Node’s modules are built around asynchronous, non-blocking I/O. |
| | net, DNS, HTTP | Node is a convenient platform for writing network-oriented software. |
| | File System, Zlib | Both synchronous and asynchronous file system APIs make reading and writing files simple and fast. |
| OS integration and scripting | os, Child Process, TTY, process | Creating child processes, managing results, and calling OS-specific features are all catered for. |
| | | Scaling out to multiple processes and managing errors is supported by Node’s core modules. |
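As a small illustration of the OS integration row in table 3, the sketch below starts a child process and collects its output with the Child Process module. The helper runNodeSnippet is a hypothetical name. It launches a second copy of the Node binary so that the example does not depend on any other program being installed.

```javascript
const { execFile } = require('child_process');

// runNodeSnippet is a hypothetical helper: it runs a line of JavaScript in a
// separate Node process and passes the trimmed standard output to done.
function runNodeSnippet(source, done) {
  // process.execPath is the path of the Node binary that is running now.
  execFile(process.execPath, ['-e', source], (err, stdout) => {
    if (err) return done(err);
    done(null, stdout.trim());
  });
}

runNodeSnippet('console.log(6 * 7)', (err, output) => {
  if (err) throw err;
  console.log('child printed:', output);
});
```

The parent process is not blocked while the child runs; the callback fires once the child has exited and its output has been collected.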