Many people know about the Node.js built-in cluster module and benefit from it: with a few lines of additional code, we can put every CPU core to work for our server applications. However, it has one flaw: it consumes a lot of system memory, because each worker process starts its own Node.js runtime.
Many people also know about the Node.js worker_threads module, but few benefit from it, because they don't know what to use it for. As the name suggests, worker threads are meant to run CPU-intensive tasks or to divide work among multiple threads, but thanks to the asynchronous nature of Node.js and its community packages, there aren't many CPU-intensive tasks left for us.
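For context, here is a minimal sketch of that classic use case: offloading a deliberately CPU-heavy recursive Fibonacci to a worker so the main thread's event loop stays responsive (the file name and the choice of fib(40) are just for illustration).

// fib-worker.mjs
import { isMainThread, Worker, parentPort, workerData } from "node:worker_threads";
import { fileURLToPath } from "node:url";

// A deliberately slow, CPU-bound function.
const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));

if (isMainThread) {
  // Run the computation in a worker; the main thread stays free meanwhile.
  const worker = new Worker(fileURLToPath(import.meta.url), { workerData: 40 });
  worker.on("message", (result) => console.log(`fib(40) = ${result}`));
} else {
  // The worker computes the result and sends it back to the main thread.
  parentPort.postMessage(fib(workerData));
}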
But what if I told you there is another way to use worker_threads, one that achieves the same functionality the cluster module provides? Isn't that interesting? Yes, we can do that.
But first, let me show the example code using the cluster module, and then the worker_threads version, so we can compare them in detail.
The cluster version
// cluster.mjs
import * as http from "node:http";
import { availableParallelism } from "node:os";
import cluster from "node:cluster";

if (cluster.isPrimary) {
  // Fork one worker per available CPU core; each worker re-runs this file.
  const numCPUs = availableParallelism();
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Each worker "listens" on the same port; Node.js makes this work internally.
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Hello World! (processId: ${process.pid})\n`);
  }).listen(8000, "localhost", () => {
    console.log(`Listening on http://localhost:8000/ (processId: ${process.pid})`);
  });
}
Now run node cluster.mjs, and we will see something like this logged in the terminal:
Listening on http://localhost:8000/ (processId: 66660)
Listening on http://localhost:8000/ (processId: 66661)
Listening on http://localhost:8000/ (processId: 66655)
Listening on http://localhost:8000/ (processId: 66656)
Listening on http://localhost:8000/ (processId: 66658)
Listening on http://localhost:8000/ (processId: 66657)
Listening on http://localhost:8000/ (processId: 66659)
Listening on http://localhost:8000/ (processId: 66662)
And if we run curl http://localhost:8000 several times, we will see something like this:
Hello World! (processId: 66895)
Hello World! (processId: 66897)
Hello World! (processId: 66896)
Hello World! (processId: 66893)
This shows that the load is distributed across different processes, as expected. Now let's move on to the real beast.
The worker_threads version
// threads.mjs
import * as http from "node:http";
import { availableParallelism } from "node:os";
import { isMainThread, Worker, workerData, threadId } from "node:worker_threads";
import { fileURLToPath } from "node:url";

/** @type {http.RequestListener} */
const listener = (req, res) => {
  res.writeHead(200);
  res.end(`Hello World! (threadId: ${threadId})\n`);
};

if (isMainThread) {
  // Bind the port once in the main thread, which also serves requests.
  const server = http.createServer(listener);
  server.listen(8000, () => {
    console.log(`Listening on http://localhost:8000/ (threadId: ${threadId})`);
    // Spawn one worker per remaining core, passing the listening socket's
    // file descriptor. Note: server._handle is an internal, undocumented
    // property and may change between Node.js versions.
    const maxWorkers = availableParallelism() - 1;
    for (let i = 0; i < maxWorkers; i++) {
      new Worker(fileURLToPath(import.meta.url), {
        workerData: { handle: { fd: server._handle.fd } },
      });
    }
  });
} else {
  // In each worker, listen on the already-bound descriptor, not the port.
  http.createServer(listener).listen(workerData.handle, () => {
    console.log(`Listening on http://localhost:8000/ (threadId: ${threadId})`);
  });
}
Now run node threads.mjs, and we will see something like this logged in the terminal:
Listening on http://localhost:8000/ (threadId: 0)
Listening on http://localhost:8000/ (threadId: 4)
Listening on http://localhost:8000/ (threadId: 6)
Listening on http://localhost:8000/ (threadId: 2)
Listening on http://localhost:8000/ (threadId: 7)
Listening on http://localhost:8000/ (threadId: 3)
Listening on http://localhost:8000/ (threadId: 5)
Listening on http://localhost:8000/ (threadId: 1)
And if we run curl http://localhost:8000 several times, we will see something like this:
Hello World! (threadId: 7)
Hello World! (threadId: 4)
Hello World! (threadId: 4)
Hello World! (threadId: 6)
This shows that the load is distributed across different threads, as expected.
Comparison
Listening on the port
With cluster, we listen on the same port in each worker, just as if we were writing a single-threaded server. But this is an illusion: we cannot actually listen on the same port in multiple processes. Node.js pulls this off by internally binding the port in the primary process and distributing incoming connections to the workers via a round-robin algorithm.
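Node.js actually exposes this behavior as cluster.schedulingPolicy; a minimal sketch (the setting is effectively frozen once the first worker is forked, so set it early):

// Place this before the first cluster.fork() call.
import cluster from "node:cluster";

// SCHED_RR: the primary accepts each connection and hands it to a worker
// in turn. This is the default on every platform except Windows.
cluster.schedulingPolicy = cluster.SCHED_RR;

// SCHED_NONE would instead let the operating system decide which worker
// accepts each connection:
// cluster.schedulingPolicy = cluster.SCHED_NONE;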
With worker_threads, we have to bind the port explicitly in the main thread. However, instead of binding the same port again in each worker thread, we pass the listening server's file descriptor (server._handle.fd) to the worker thread and listen on that already-existing descriptor instead.
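If you are curious what that descriptor actually is, here is a tiny sketch that just prints it (fd-demo.mjs is a made-up name; remember that _handle is internal and undocumented, so treat this, and the trick above, as something that may break between Node.js versions):

// fd-demo.mjs
import * as net from "node:net";

const server = net.createServer();
server.listen(0, () => {
  // On POSIX systems the listening socket is a small integer file descriptor.
  console.log(`bound port ${server.address().port}, fd = ${server._handle.fd}`);
  server.close();
});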
Accepting requests
With cluster, Node.js uses the round-robin algorithm to balance the load; as we can see from the example above, each request was handled by a different sub-process.
With worker_threads, however, the load is not as evenly balanced: some requests are handled by the same thread as the previous one. But this hardly matters; all threads get used eventually once requests become frequent. Also note that unlike cluster, with worker_threads the main thread handles requests too, so we only need to create cpus - 1 workers.
Memory usage
With cluster, each worker is a separate process, which means it starts its own Node.js runtime and takes additional system memory, and that overhead multiplies as the CPU core count goes up.
With worker_threads, there is only one process; each thread still gets its own V8 isolate and event loop, but that overhead is far smaller than a full Node.js runtime per core.
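You can verify this yourself with a rough sketch: drop the lines below into both versions and compare. process.memoryUsage().rss reports the whole process's resident set size in bytes, so cluster.mjs prints a separate figure per worker process, while threads.mjs has only one process to report.

// Print this process's resident set size in MiB.
const rssMiB = process.memoryUsage().rss / 1024 / 1024;
console.log(`pid ${process.pid}: rss ${rssMiB.toFixed(1)} MiB`);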
Benchmark
In my tests, the two versions perform roughly the same and handle about the same number of requests per second. I used the following command for benchmarking on my laptop:
autocannon -c 1000 -d 10 http://localhost:8000
The results are similar:
With cluster, the average throughput is 69819 req/sec and the average latency is 14 ms.
With worker_threads, the average throughput is 69323 req/sec and the average latency is also 14 ms.
Conclusion
Yes, we can use worker threads to achieve cluster-like behavior in Node.js, and it saves a lot of system resources. That also means we can lower our server budget and still get the same performance as the traditional cluster model.