Preface
First of all, don’t get me wrong: JavaScript is pretty fast on its own; with the right strategy and algorithm, a JS program can even outperform Rust in some cases. How did I get this impression? From the process of writing the program described in this article.
I was trying to implement a search function that uses Levenshtein distance to search posts among a massive number of candidates. The process is simple: given a new post with a title and content, search the existing records (about 12k entries) for similar ones, in order to check whether a similar post already exists. This is a typical CPU-intensive task.
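For context, the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions and substitutions needed to turn one into the other. A minimal TypeScript sketch of the classic dynamic-programming version (my own illustrative helper, not the @std/text implementation):

```typescript
// Classic two-row dynamic-programming Levenshtein distance.
// O(a.length * b.length) time, O(b.length) space.
function levenshtein(a: string, b: string): number {
  const m = a.length
  const n = b.length
  let prev = Array.from({ length: n + 1 }, (_, j) => j)
  for (let i = 1; i <= m; i++) {
    const curr = [i, ...new Array(n).fill(0)]
    for (let j = 1; j <= n; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1
      curr[j] = Math.min(
        prev[j] + 1,       // deletion
        curr[j - 1] + 1,   // insertion
        prev[j - 1] + cost // substitution
      )
    }
    prev = curr
  }
  return prev[n]
}

console.log(levenshtein("kitten", "sitting")) // 3
```

Computing this for every candidate against a long post body is what makes the task CPU-bound.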
I know Rust is fast, and is perfect for this job. But how fast? And how many times faster than the JS version?
You may be surprised, as I was, that most Rust implementations of the Levenshtein distance function are slower than the JS one from @std/text, at least when comparing large strings. For example, the strsim crate, which has 61m recent downloads, is 4 times slower than the JS version: the JS version took about 4.5 seconds to find the match, while the Rust (strsim) version took 17 seconds. That’s absurd.
I used the keyword levenshtein to search packages on crates.io; most packages were even worse than strsim, ranging from 4 to 9 times slower than the JS version. So now you get the idea: JS is pretty fast on its own, given a proper algorithm.
Finally, I was able to find a crate that matches the JS implementation, called rapidfuzz, which has only 41k recent downloads, so sad… However, the Python version of RapidFuzz is very popular (according to Google AI), so the Rust version should be just as good. And it is.
Initialize the project
As of today, integrating Rust code into Node.js/Deno projects is pretty easy with napi-rs, way easier than using N-API with C++. Napi-rs literally packages everything for us, from compiling the Rust code to generating the JavaScript and TypeScript files; all we have to do is write some Rust functions, and the rest is taken care of.
To initialize a napi-rs package, we first need to install its CLI tool @napi-rs/cli, along with yarn, which the CLI tool uses internally to install JS dependencies. We can remove yarn afterwards if we don’t want to use it, which is what I did; I use Deno instead.
npm i -g @napi-rs/cli yarn # install the CLI tools
napi new # initialize a new Node.js addon package
The CLI program will prompt us to name the new package and choose which targets/platforms we need to support (since we are writing a native package). The default targets are x86_64-apple-darwin, x86_64-pc-windows-msvc and x86_64-unknown-linux-gnu. Since I’m using an M1 MacBook, I also selected aarch64-apple-darwin.
The CLI tool then generates all the necessary files and folders for us, along with some JS dependencies such as ava for testing. Again, we can remove them if we don’t want to use them, which I did; I use Deno instead.
Create the Rust search function
The major work lives in the src/lib.rs file. Since this program isn’t complicated, I will place all the code there. The CLI tool already generated an example sum function in this file; as it isn’t used, I deleted it and kept only three lines of the generated code.
// src/lib.rs
#![deny(clippy::all)]
#[macro_use]
extern crate napi_derive;
Next, I defined some data structures for the search function.
#[napi(object)]
#[derive(Debug, Clone)]
pub struct PostData {
pub title: String,
pub content: String,
}
#[napi(object)]
#[derive(Debug, Clone)]
pub struct Match {
pub target: PostData,
pub score: f64,
}
#[napi(object)]
#[derive(Debug, Clone)]
pub struct FindTopNResult {
pub matches: Vec<Match>,
pub process_time: i64, // how much time (in milliseconds) was spent processing
}
Then a function to compute the weight of each field in the PostData:
use napi::{Error, Result};
// ...
fn get_weights(source: &PostData) -> Result<(f64, f64)> {
let title_chars = source.title.chars().count();
let content_chars = source.content.chars().count();
let total_chars = title_chars + content_chars;
if total_chars == 0 {
Err(Error::from_reason("source is invalid"))
} else {
Ok((
title_chars as f64 / total_chars as f64,
content_chars as f64 / total_chars as f64,
))
}
}
Finally, the search function:
use napi::{Error, Result};
use rapidfuzz::distance::levenshtein::normalized_similarity;
use std::time::Instant;
// ...
#[napi]
pub fn find_similar_posts_native(
source: PostData,
candidates: Vec<PostData>,
top_n: u32,
) -> Result<FindTopNResult> {
let start = Instant::now();
let (title_weight, content_weight) = get_weights(&source)?;
let mut matches = vec![];
for candidate in candidates.iter() {
let title_score =
normalized_similarity(source.title.chars(), candidate.title.chars()) * title_weight;
let content_score =
normalized_similarity(source.content.chars(), candidate.content.chars())
* content_weight;
let score = title_score + content_score;
// 0.5 is the threshold to consider a match
if score > 0.5 {
matches.push(Match {
target: candidate.clone(),
score,
});
}
}
matches.sort_by(|a, b| {
let order = b.score.partial_cmp(&a.score);
order.unwrap_or(std::cmp::Ordering::Equal)
});
let top_n_matches = if matches.len() <= top_n as usize {
matches
} else {
matches.into_iter().take(top_n as usize).collect()
};
let duration = start.elapsed();
Ok(FindTopNResult {
matches: top_n_matches,
process_time: duration.as_millis() as i64,
})
}
Then run deno task build to compile the Rust code into a Node.js addon, which generates an index.js and an index.d.ts file, along with a platform-specific .node file. Napi-rs still emits CommonJS files, so I need to modify the package.json file and add `"type": "commonjs"`, so that Deno won’t treat the JS files as ES modules.
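The change (a minimal package.json excerpt; the other fields generated by the CLI are omitted here) looks like:

```json
{
  "type": "commonjs"
}
```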
Create the JS counterpart
The Rust function has a suffix _native
because I’ve also created a TS version of it, in order to compare how much faster the Rust version is compared to the TS version.
// ts/mod.ts
import { levenshteinDistance } from "@std/text"
import { chars } from "@ayonli/jsext/string"
// @deno-types="../index.d.ts"
import { Match, PostData, FindTopNResult } from "../index.js"
// @std/text doesn't provide a normalized function, we implement our own
function normalizedSimilarity(a: string, b: string): number {
if (a === b) {
return 1
}
const aChars = chars(a).length
const bChars = chars(b).length
return 1 - levenshteinDistance(a, b) / Math.max(aChars, bChars)
}
function getWeights(source: PostData): [number, number] {
const titleChars = chars(source.title).length
const contentChars = chars(source.content).length
const totalChars = titleChars + contentChars
if (totalChars === 0) {
throw new Error("Source is invalid")
}
return [
titleChars > 0 ? titleChars / totalChars : 0,
contentChars > 0 ? contentChars / totalChars : 0,
]
}
export function findSimilarPosts(source: PostData, candidates: PostData[], topN: number): FindTopNResult {
const start = Date.now()
const [titleWeight, contentWeight] = getWeights(source)
const matches: Match[] = []
for (const candidate of candidates) {
const titleScore = normalizedSimilarity(source.title, candidate.title) * titleWeight
const contentScore = normalizedSimilarity(source.content, candidate.content) * contentWeight
const score = titleScore + contentScore
// 0.5 is the threshold to consider a match
if (score > 0.5) {
matches.push({
target: candidate,
score,
})
}
}
matches.sort((a, b) => b.score - a.score)
return {
matches: matches.length <= topN ? matches : matches.slice(0, topN),
processTime: Date.now() - start,
}
}
Execute and compare
Now that I have two versions of the search function, it’s time to execute them and compare the performance.
// main.mts (we have to use .mts since this is a CommonJS package)
import { deepStrictEqual, strictEqual } from "node:assert"
import { parse } from "@std/csv"
import { findSimilarPosts } from "./ts/mod.ts"
// @ts-types="./index.d.ts"
import {
findSimilarPostsNative,
Match,
PostData,
} from "./index.js"
const filename = Deno.cwd() + "/assets/blogs.csv" // the archive of blog posts
const contents = await Deno.readTextFile(filename)
const records = parse(contents, { skipFirstRow: true }) as { title: string, text: string }[]
const posts: PostData[] = records.map(r => ({ title: r.title, content: r.text }))
console.log("Existing posts:", posts.length)
const newPost: PostData = {
title: "The economic theories of MLA",
// The archived version contains some non-Unicode characters and this one doesn't, but don't
// worry, the search function will still be able to match it.
content: `
A resolution calling for full-and-part-time faculty members to be eligible for tenure and expressing
the view that all higher education employees should have appropriate forms of job security, due
process, a living wage and access to health care benefits passed in a 81-15 vote, but not without
concerns from delegates that the wording went too far or not far enough.
Ian Barnard, an associate professor of English at California State University-Northridge, said he
wanted to see the resolution extended to include a call for all faculty to be eligible not only for
tenure but also for full-time employment. Simply voicing support for a lecturer to continue to be
guaranteed one course per semester was, he said, really weak a way for us to cop out, for
departments to avoid paying for health benefits and for adjunct faculty to continue bouncing around
among many jobs just to make ends meet.
The full story is here. Why don't journalists demand something similar? You can pinch yourself, but
it really is 2010. By the way, here are some facts:
In 1960, 75 percent of college instructors were full-time tenured or tenure-track professors; today
only 27 percent are.
`
}
interface FunctionResult {
fn_name: string;
call_time_ms: number;
process_time_ms: number;
data_passing_ms: number;
}
const results: FunctionResult[] = []
let match: Match | undefined
{
const start = Date.now()
const { matches, processTime } = findSimilarPosts(newPost, posts, 3)
const callTime = Date.now() - start
results.push({
fn_name: "findSimilarPosts",
call_time_ms: callTime,
process_time_ms: processTime,
data_passing_ms: callTime - processTime,
})
strictEqual(matches.length, 1)
strictEqual(matches[0].target.title, "The economic theories of the MLA")
strictEqual(matches[0].score > 0.95, true)
match = matches[0]
}
{
const start = Date.now()
const { matches, processTime } = findSimilarPostsNative(newPost, posts, 3)
const callTime = Date.now() - start
results.push({
fn_name: "findSimilarPostsNative",
call_time_ms: callTime,
process_time_ms: processTime,
data_passing_ms: callTime - processTime,
})
strictEqual(matches.length, 1)
deepStrictEqual(matches[0], match)
}
console.table(results)
After running the program (with deno run -A main.mts), I got these results:
| fn_name                | call_time_ms | process_time_ms | data_passing_ms |
| ---------------------- | ------------ | --------------- | --------------- |
| findSimilarPosts       | 4584         | 4584            | 0               |
| findSimilarPostsNative | 1196         | 1149            | 47              |
Pretty good: the Rust version is about 4 times faster than the JS version thanks to its native performance, although there is some overhead in passing data from the JS side to Rust.
Parallel processing in Rust
But this is a CPU-intensive task; it would be pointless to use Rust if we didn’t use its parallel processing ability. So I’ve also created another search function in Rust that utilizes parallel processing with multiple threads. Parallel processing in Rust is very easy: all I have to do is use the rayon crate and call the par_iter function.
use rayon::iter::{IntoParallelRefIterator, ParallelIterator};
// ...
#[napi]
pub fn find_similar_posts_native_parallel(
source: PostData,
candidates: Vec<PostData>,
top_n: u32,
) -> Result<FindTopNResult> {
let start = Instant::now();
let (title_weight, content_weight) = get_weights(&source)?;
let matches: Vec<Match> = candidates
.par_iter() // Parallel is as simple as just calling a function
.filter_map(|candidate| {
let title_score =
normalized_similarity(source.title.chars(), candidate.title.chars()) * title_weight;
let content_score =
normalized_similarity(source.content.chars(), candidate.content.chars())
* content_weight;
let score = title_score + content_score;
// 0.5 is the threshold to consider a match
if score > 0.5 {
Some(Match {
target: candidate.clone(),
score,
})
} else {
None
}
})
.collect();
let mut matches = matches;
matches.sort_by(|a, b| {
let order = b.score.partial_cmp(&a.score);
order.unwrap_or(std::cmp::Ordering::Equal)
});
let top_n_matches = if matches.len() <= top_n as usize {
matches
} else {
matches.into_iter().take(top_n as usize).collect()
};
let duration = start.elapsed();
Ok(FindTopNResult {
matches: top_n_matches,
process_time: duration.as_millis() as i64,
})
}
Parallel processing in JS
Some people may be familiar with Web Workers (or worker_threads), which give JS the ability to run another JS program in a separate thread. We can use them to process multiple tasks in parallel as well.
In order to compare with the Rust version, I’ve also created a TS version of the parallel search function:
// ts/parallel.ts
import { chunk } from "@ayonli/jsext/array"
import { avg } from "@ayonli/jsext/math"
import parallel from "@ayonli/jsext/parallel"
// @deno-types="../index.d.ts"
import { FindTopNResult, PostData } from "../index.js"
const { findSimilarPosts } = parallel(() => import("./mod.ts"))
export async function findSimilarPostsParallel(
source: PostData,
candidates: PostData[],
topN: number
): Promise<FindTopNResult> {
const chunks = chunk(candidates, Math.ceil(candidates.length / navigator.hardwareConcurrency))
const processTimes: number[] = []
let results = (await Promise.all(chunks.map(chunk => findSimilarPosts(source, chunk, topN))))
.flatMap(e => {
processTimes.push(e.processTime)
return e.matches
})
.toSorted((a, b) => b.score - a.score)
if (results.length > topN) {
results = results.slice(0, topN)
}
return {
matches: results,
processTime: Math.round(avg(...processTimes)),
}
}
This function calls the original findSimilarPosts under the hood, which may introduce a little duplicated work, but it’s so tiny that we can ignore it.
After rebuilding the Rust code, I added another two tests for the new functions in main.mts:
import { findSimilarPostsParallel } from "./ts/parallel.ts"
// @ts-types="./index.d.ts"
import {
findSimilarPostsNativeParallel,
// ...
} from "./index.js"
// ...
{
const start = Date.now()
const { matches, processTime } = await findSimilarPostsParallel(newPost, posts, 3)
const callTime = Date.now() - start
results.push({
fn_name: "findSimilarPostsParallel",
call_time_ms: callTime,
process_time_ms: processTime,
data_passing_ms: callTime - processTime,
})
strictEqual(matches.length, 1)
deepStrictEqual(matches[0], match)
}
{
const start = Date.now()
const { matches, processTime } = findSimilarPostsNativeParallel(newPost, posts, 3)
const callTime = Date.now() - start
results.push({
fn_name: "findSimilarPostsNativeParallel",
call_time_ms: callTime,
process_time_ms: processTime,
data_passing_ms: callTime - processTime,
})
strictEqual(matches.length, 1)
deepStrictEqual(matches[0], match)
}
console.table(results)
After rerunning the program, the new results are printed:
| fn_name                        | call_time_ms | process_time_ms | data_passing_ms |
| ------------------------------ | ------------ | --------------- | --------------- |
| findSimilarPosts               | 4669         | 4669            | 0               |
| findSimilarPostsNative         | 1188         | 1141            | 47              |
| findSimilarPostsParallel       | 1214         | 1009            | 205             |
| findSimilarPostsNativeParallel | 277          | 229             | 48              |
This is exciting: the search time dropped from about 4.6 seconds to just 277 milliseconds; native parallelism in Rust makes the program about 17 times faster. And with Web Workers, we’re also able to make processing about 4 times faster than the single-threaded JS version, reaching roughly the same speed as the single-threaded Rust version.
However, it’s important to point out that Web Workers (or worker_threads) are not like regular native threads: they are isolated V8 instances, each acting like another Node.js instance and consuming a lot of memory, and data passing between the main thread and worker threads is very slow due to the need to deep-clone and serialize/deserialize the data.
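The copying cost is easy to observe even without spinning up a worker: structuredClone uses the same structured serialization algorithm as postMessage, so everything crossing a thread boundary arrives as a deep copy. A small illustrative sketch (the timing is machine-dependent):

```typescript
// structuredClone performs the same structured serialization that
// postMessage uses when sending data to a worker thread.
const posts = Array.from({ length: 1000 }, (_, i) => ({
  title: `Post ${i}`,
  content: "lorem ipsum ".repeat(100),
}))

const start = Date.now()
const copy = structuredClone(posts)
console.log(`cloned ${posts.length} posts in ${Date.now() - start} ms`)

// The clone holds equal data but entirely different objects:
console.log(copy[0].title === posts[0].title) // true
console.log(copy[0] === posts[0])             // false
```

Every worker call pays this serialization cost twice: once for the request and once for the result.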
AsyncTask
Although the Rust parallel search function is pretty fast, it has a flaw: it blocks the main thread. Even though we’re using multithreading, while waiting for the threads to complete, the main thread is stuck and no other code can run. This is not idiomatic JS; we need to fix it.
Luckily, Napi-rs provides an AsyncTask
API, which allows us to run blocking code in the libuv thread pool, and frees us from blocking the main thread.
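To see the difference, here is a toy sketch (not the napi-rs API; a busy loop stands in for a blocking synchronous native call, and setTimeout for an AsyncTask-backed one) showing that timers starve while the synchronous call runs:

```typescript
// Stand-in for a blocking synchronous native call: spins the main thread.
function blockingWork(ms: number): void {
  const end = Date.now() + ms
  while (Date.now() < end) { /* busy wait */ }
}

// Stand-in for an AsyncTask-backed call: the work happens off the main thread.
function nonBlockingWork(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms))
}

async function main() {
  let ticks = 0
  const timer = setInterval(() => ticks++, 10)

  blockingWork(100)
  const ticksAfterBlocking = ticks // still 0: the interval never got to run

  await nonBlockingWork(100)
  const ticksAfterAsync = ticks // > 0: the event loop stayed responsive

  clearInterval(timer)
  console.log({ ticksAfterBlocking, ticksAfterAsync })
}

main()
```

The AsyncTask version behaves like nonBlockingWork here: the main thread is free while the computation runs on the libuv thread pool.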
use napi::{bindgen_prelude::AsyncTask, Env, Error, Result, Task};
// ...
pub struct AsyncFindSimilarPosts {
source: PostData,
candidates: Vec<PostData>,
top_n: u32,
}
impl Task for AsyncFindSimilarPosts {
type Output = FindTopNResult;
type JsValue = FindTopNResult;
fn compute(&mut self) -> Result<Self::Output> {
find_similar_posts_native_parallel(self.source.clone(), self.candidates.clone(), self.top_n)
}
fn resolve(&mut self, _env: Env, output: Self::Output) -> Result<Self::JsValue> {
Ok(output)
}
}
#[napi(ts_return_type = "Promise<FindTopNResult>")]
pub fn find_similar_posts_native_async(
source: PostData,
candidates: Vec<PostData>,
top_n: u32,
) -> AsyncTask<AsyncFindSimilarPosts> {
AsyncTask::new(AsyncFindSimilarPosts {
source,
candidates,
top_n,
})
}
Then update main.mts, adding one more test for the new function:
// @ts-types="./index.d.ts"
import {
findSimilarPostsNativeAsync,
// ...
} from "./index.js"
// ...
{
const start = Date.now()
const { matches, processTime } = await findSimilarPostsNativeAsync(newPost, posts, 3)
const callTime = Date.now() - start
results.push({
fn_name: "findSimilarPostsNativeAsync",
call_time_ms: callTime,
process_time_ms: processTime,
data_passing_ms: callTime - processTime,
})
strictEqual(matches.length, 1)
deepStrictEqual(matches[0], match)
}
console.table(results)
Rerun the program, and the new results are printed:
| fn_name                        | call_time_ms | process_time_ms | data_passing_ms |
| ------------------------------ | ------------ | --------------- | --------------- |
| findSimilarPosts               | 4597         | 4597            | 0               |
| findSimilarPostsNative         | 1191         | 1144            | 47              |
| findSimilarPostsParallel       | 1171         | 1008            | 163             |
| findSimilarPostsNativeParallel | 271          | 223             | 48              |
| findSimilarPostsNativeAsync    | 276          | 223             | 53              |
This is very good: we now have a search function that is native, parallel, asynchronous, and 17 times faster than the original JS version.
Conclusion
Node.js is still a great choice (and Deno even more so) for IO-intensive tasks thanks to its asynchronous design at heart, and the JS/TS language is very flexible, especially for prototyping and fast-paced development. However, when it comes to CPU-intensive tasks, JavaScript’s options are very limited.
Although we can use Web Workers (or worker_threads) for parallel processing, they are insufficient: workers cannot easily share memory, and passing messages between threads is very inefficient, making it almost impossible to use Web Workers for serious CPU-intensive tasks.
Rust is a modern, safe, and expressive programming language with an intelligent type system, and from a JS/TS perspective we can find many familiar designs in it, such as property shorthands, destructuring and code blocks. JS/TS developers will definitely appreciate the similarities between the two languages.
That said, Rust is still a very hard language to master, which makes it impractical to rewrite everything in Rust or even start a new project fully in Rust. However, with napi-rs, we can integrate Rust into our Node.js and Deno workflows, striking a good balance between the flexibility of JavaScript and the performance of Rust.
The full example of the program can be found here.