Tarball API in pure JS, no Node.js needed

In my previous article, “Manage file in the browser in JS”, I demonstrated that we can access the file system through JS in the browser, and that the browser itself provides an Origin Private File System. Today, I’m going to push JavaScript a bit further and show that it can process tarball files by itself, without a server runtime.

But of course, I won’t start from the very bottom, because processing a tarball file takes a lot of work under the hood. I’ve created an abstraction, the Tarball API, in the JsExt library, which we can use to manipulate tarball files with ease.
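
To give a rough idea of what “under the hood” means, here is a minimal sketch of how the 512-byte header block of a single file entry could be encoded in the classic ustar layout. It is only an illustration and not JsExt’s actual implementation: the helper names (writeString, writeOctal, createHeader) are made up for this example, and long file names, links, and directories are left out.

```typescript
// A minimal sketch of one ustar header block (illustrative, not JsExt's code).
const BLOCK_SIZE = 512;
const encoder = new TextEncoder();

/** Writes a string into `block` at `offset`, truncated to `length` bytes. */
function writeString(block: Uint8Array, offset: number, length: number, value: string): void {
    block.set(encoder.encode(value).subarray(0, length), offset);
}

/** Writes a number as zero-padded octal digits; the last byte of the field stays NUL. */
function writeOctal(block: Uint8Array, offset: number, length: number, value: number): void {
    writeString(block, offset, length, value.toString(8).padStart(length - 1, "0"));
}

/** Builds the header block for a regular file (names over 100 bytes are truncated). */
function createHeader(name: string, size: number, mtime: Date): Uint8Array {
    const block = new Uint8Array(BLOCK_SIZE); // all fields start out as NUL bytes

    writeString(block, 0, 100, name);                               // file name
    writeOctal(block, 100, 8, 0o644);                               // permission mode
    writeOctal(block, 108, 8, 0);                                   // owner user ID
    writeOctal(block, 116, 8, 0);                                   // owner group ID
    writeOctal(block, 124, 12, size);                               // file size in bytes
    writeOctal(block, 136, 12, Math.floor(mtime.getTime() / 1000)); // mtime in seconds
    writeString(block, 148, 8, "        ");                         // checksum placeholder (8 spaces)
    writeString(block, 156, 1, "0");                                // type flag: regular file
    writeString(block, 257, 6, "ustar");                            // magic, followed by NUL
    writeString(block, 263, 2, "00");                               // format version

    // The checksum is the sum of all header bytes, computed while the checksum
    // field itself is filled with spaces, stored as 6 octal digits + NUL + space.
    const checksum = block.reduce((sum, byte) => sum + byte, 0);
    writeString(block, 148, 8, checksum.toString(8).padStart(6, "0") + "\0 ");

    return block;
}

const exampleHeader = createHeader("hello.txt", 13, new Date());
console.log(exampleHeader.length); // 512
```

And this is just the header of one entry. The file data then has to be padded to a multiple of 512 bytes, and the archive is closed with two empty blocks, which is why an abstraction is handy.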

The Tarball API is designed to be generic. It’s not merely a tool to archive files into a .tar file or to extract files from one. Instead, a Tarball instance represents the tarball archive itself: we can add new files to it, remove files from it, and preview its entries, much like the Archive Manager application in many Linux distros.

The Tarball API doesn’t rely on any server-runtime-specific APIs. It only relies on modern Web APIs, mostly ReadableStream, which are available in all modern browsers, in server runtimes such as Node.js, Deno, and Bun, and in edge runtimes such as Cloudflare Workers. So it can be used in any modern JavaScript environment.

Import the Tarball API

To import the Tarball API, we can use one of the following methods, depending on our application setup.

// In Node.js, Deno (JSR), Bun, Cloudflare Workers, Browsers (with bundler or import map)
import { Tarball } from "@ayonli/jsext/archive";

// Or in Deno with URL import
import { Tarball } from "https://ayonli.github.io/jsext/archive.ts";

// Or in the browser with URL import
import { Tarball } from "https://ayonli.github.io/jsext/esm/archive.js";

Create a tarball instance

There are two ways to create a Tarball instance: one is to use the new keyword to initiate an empty tarball, and the other is to load a .tar file with the static Tarball.load method.

Initiate an empty tarball

const tarball = new Tarball();

Load a tar file

The Tarball.load method accepts a readable stream instead of a file path or URL, meaning that it can load the tar file from anywhere. For example, from the file system:

// Load from the file system (even the browser's OPFS)
import { createReadableStream } from "@ayonli/jsext/fs";

const input = createReadableStream("/file/to/archive.tar");
const tarball = await Tarball.load(input);

Or from an HTTP response:

// From the response of fetch
const res = await fetch("https://example.com/file/to/archive.tar");

const tarball = await Tarball.load(res.body!);

We can also load a compressed .tar.gz file:

const res = await fetch("https://example.com/file/to/archive.tar.gz");

const tarball = await Tarball.load(res.body!, { gzip: true });

Add new files to the tarball

Now that we have a Tarball instance, we can add new files to it, even when the instance is loaded from an existing tar file with files in it already.

const file = new File(["Hello, World!"], "hello.txt", { type: "text/plain" });

tarball.append(file);

We can also add directories, or files with a directory path, to the tarball. If we add a file whose directory doesn’t exist in the tarball yet, the directory will be created for us automatically. For example, the following code puts the file foo/bar.txt into the tarball and creates the foo directory automatically beforehand.

const file = new Blob(["This is some content"], { type: "text/plain" });

tarball.append(file, { relativePath: "foo/bar.txt" });
// Now the tarball will have both the `foo` directory and the `bar.txt` file in the directory.
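
The automatic directory creation described above can be pictured as walking along the relative path and collecting every parent that is not in the archive yet. Here is a minimal sketch of that idea. parentDirectories and missingDirectories are hypothetical helpers for illustration, and JsExt may well do this differently internally.

```typescript
/**
 * Returns every parent directory of `relativePath`,
 * ordered from the outermost to the innermost.
 */
function parentDirectories(relativePath: string): string[] {
    const segments = relativePath.split("/").filter(segment => segment.length > 0);
    const dirs: string[] = [];

    // Every prefix of the path except the last segment (the file itself) is a directory.
    for (let i = 1; i < segments.length; i++) {
        dirs.push(segments.slice(0, i).join("/"));
    }

    return dirs;
}

/** Returns only the parent directories that are not yet known to the archive. */
function missingDirectories(relativePath: string, existing: Set<string>): string[] {
    return parentDirectories(relativePath).filter(dir => !existing.has(dir));
}

// `foo` already exists, so only `foo/bar` would have to be created.
const existingDirs = new Set<string>(["foo"]);
console.log(missingDirectories("foo/bar/baz.txt", existingDirs));
```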

Retrieve files from the tarball

We can also retrieve the files that have been saved to the tarball previously, for example:

import { readAsText } from "@ayonli/jsext/reader";

const entry = tarball.retrieve("hello.txt")!;
const content = await readAsText(entry.stream);

console.log(content); // Hello, World!

List all entries in the tarball

There are two ways to show the entries in the tarball: one is to iterate over the tarball itself, and the other is to use the treeView method to create a tree view, in which we can see the entries and their sub-entries in hierarchical order.

Iterate the tarball

for (const entry of tarball) {
    if (entry.kind === "directory") {
        console.log(`Directory: ${entry.name}; Path: ${entry.relativePath}`);
    } else {
        console.log(`File: ${entry.name}; Path: ${entry.relativePath}; Size: ${entry.size}; Last-Modified: ${entry.mtime}`);
    }
}

Show the tarball contents with a tree view

const tree = tarball.treeView();
console.log(tree);
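
The return value of treeView isn’t shown here, so as an illustration of the idea, this is a rough sketch of how a flat list of entries could be turned into a hierarchy. The TreeNode shape and the buildTree and renderTree helpers are assumptions made for this example, not the library’s API. Only the relativePath and kind fields are borrowed from the entries iterated above.

```typescript
type EntryKind = "file" | "directory";

// Hypothetical node shape, for illustration only.
interface TreeNode {
    name: string;
    kind: EntryKind;
    children: TreeNode[];
}

/** Builds a tree from flat entries, creating intermediate directories as needed. */
function buildTree(entries: { relativePath: string; kind: EntryKind }[]): TreeNode {
    const root: TreeNode = { name: "(root)", kind: "directory", children: [] };

    for (const entry of entries) {
        const segments = entry.relativePath.split("/").filter(segment => segment.length > 0);
        let current = root;

        for (let i = 0; i < segments.length; i++) {
            const name = segments[i] ?? "";
            const isLast = i === segments.length - 1;
            let child = current.children.find(node => node.name === name);

            if (!child) {
                // Only the last segment carries the entry's own kind;
                // everything before it must be a directory.
                child = { name, kind: isLast ? entry.kind : "directory", children: [] };
                current.children.push(child);
            }

            current = child;
        }
    }

    return root;
}

/** Renders the tree as indented lines, marking directories with a trailing slash. */
function renderTree(node: TreeNode, indent = ""): string[] {
    const lines = [indent + node.name + (node.kind === "directory" ? "/" : "")];

    for (const child of node.children) {
        lines.push(...renderTree(child, indent + "  "));
    }

    return lines;
}

const demoTree = buildTree([
    { relativePath: "foo", kind: "directory" },
    { relativePath: "foo/bar.txt", kind: "file" },
    { relativePath: "hello.txt", kind: "file" },
]);
console.log(renderTree(demoTree).join("\n"));
```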

Save the tarball to a file

Similar to the initiation process, the Tarball API doesn’t provide a saving method that is bound to the file system. Instead, it provides a stream method that returns a ReadableStream, which we can pipe to a file in the file system:

// Save the tarball to the file system (even the browser's OPFS)
import { createWritableStream } from "@ayonli/jsext/fs";

const output = createWritableStream("/file/to/archive.tar");

await tarball.stream().pipeTo(output);
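
When there is no file system to pipe to, for example on a regular web page that doesn’t use OPFS, another option is to collect the stream into a Blob and offer it as a download. The sketch below uses only standard Web APIs. streamToBlob is a hypothetical helper, not part of JsExt, and a small in-memory stream stands in for tarball.stream() so that the snippet runs on its own.

```typescript
/** Collects a byte stream into a Blob with the given MIME type. */
async function streamToBlob(stream: ReadableStream<Uint8Array>, type: string): Promise<Blob> {
    // Response can consume any readable byte stream and buffer it as a Blob.
    const blob = await new Response(stream).blob();

    // Re-wrap the data to attach the desired MIME type.
    return new Blob([blob], { type });
}

// Stand-in for `tarball.stream()`.
const demoStream = new Blob(["not a real tarball"]).stream();

streamToBlob(demoStream, "application/x-tar").then(blob => {
    console.log(blob.type, blob.size);
    // In a browser, the Blob could now be turned into a download link:
    // const url = URL.createObjectURL(blob);
});
```

Keep in mind that this buffers the whole archive in memory, so piping to a writable stream is still the better choice for large tarballs.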

Or we can upload the tarball to a URL:

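// Note: runtimes that follow the Fetch standard strictly (e.g. Chrome and Node.js)
// also require the `duplex: "half"` option when the request body is a stream.
const res = await fetch("https://example.com/path/to/archive.tar", {
    method: "PUT",
    body: tarball.stream(),
    headers: {
        "Content-Type": "application/x-tar",
    },
});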

Also, we can compress the tarball before saving it:

const res = await fetch("https://example.com/path/to/archive.tar.gz", {
    method: "PUT",
    body: tarball.stream({ gzip: true }),
    headers: {
        "Content-Type": "application/gzip",
    },
});
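
To see why gzip fits so naturally into a stream-based design, here is a minimal sketch of compressing and decompressing a byte stream with the standard CompressionStream and DecompressionStream APIs. Whether JsExt uses these exact APIs internally is an assumption on my part, and the gzip, gunzip, and roundTrip helpers are made up for this example.

```typescript
/** Wraps a byte stream so that its output is gzip-compressed. */
function gzip(stream: ReadableStream<Uint8Array>): ReadableStream<Uint8Array> {
    return stream.pipeThrough(new CompressionStream("gzip"));
}

/** Wraps a gzip-compressed byte stream so that its output is the original data. */
function gunzip(stream: ReadableStream<Uint8Array>): ReadableStream<Uint8Array> {
    return stream.pipeThrough(new DecompressionStream("gzip"));
}

/** Compresses and immediately decompresses some text to show the round trip. */
async function roundTrip(text: string): Promise<string> {
    const source = new Blob([text]).stream(); // stand-in for `tarball.stream()`
    return await new Response(gunzip(gzip(source))).text();
}

roundTrip("Hello, World!").then(text => console.log(text));
```

Since both directions are just transforms on a ReadableStream, the data never has to be fully loaded into memory, no matter where it comes from or where it goes.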

In the end

The Tarball API also has other methods, such as remove, which removes entries from the tarball, and replace, which replaces entries in the tarball.

Besides the Tarball class, there are also a tar function and an untar function, which simplify the processing of .tar files in the file system. They mirror the behavior of the tar -c and tar -x commands in Unix/Linux systems and can be even more useful in specific scenarios.

If you’re interested, visit the GitHub page of the JsExt library. This library aims to be an extension to the JavaScript language; it’s written with modern Web standards and works in almost all runtimes. Who knows, maybe some of its modules are just what you need.

Some of its outstanding features are:

  • Various useful functions for built-in data types that the language itself doesn’t provide.
  • Various utility functions to extend the ability of flow control.
  • Multi-threaded JavaScript with parallel threads.
  • File system APIs for both server and browser environments.
  • Open dialogs in both CLI and web applications.
  • Manipulate file system paths and URLs in the same way.
  • Handle byte arrays and readable streams effortlessly.
  • Create, extract and preview archives in all runtimes.
  • And many more…
