In my previous article, <<Manage file in the browser in JS>>, I demonstrated that we can access the file system through JS in the browser, and that the browser itself provides an Origin Private File System. Today, I’m going to explore the possibilities of JavaScript further and show that JavaScript can process tarball files by itself, without a server runtime.
But of course, I won’t start from the very bottom, because processing a tarball file takes a lot of work under the hood. I’ve created an abstraction, the Tarball API, in the JsExt library, which we can use to manipulate tarball files with ease.
The Tarball API is designed to be generic. It’s not a tool that archives files to a .tar file or extracts files from a .tar file. Instead, a Tarball instance represents the tarball archive itself: we can add new files to it, remove files from it, and preview its entries, much like the Archive Manager application in many Linux distros.
The Tarball API doesn’t rely on any server-runtime-specific APIs. It only relies on modern Web APIs, mostly ReadableStream, which are available in all modern browsers, in server runtimes such as Node.js, Deno, and Bun, and in edge runtimes such as Cloudflare Workers. So it can be used in practically any modern JavaScript environment.
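To give a sense of the work that happens under the hood, here is a minimal sketch of how a single 512-byte header block of the common ustar format could be built with nothing but typed arrays. This is not JsExt's implementation. The helper names (`writeString`, `writeOctal`, `createHeader`) are hypothetical, and the sketch only handles regular files with names of up to 100 bytes.

```typescript
const BLOCK_SIZE = 512;
const encoder = new TextEncoder();

// Write an ASCII string into the block at the given offset.
function writeString(block: Uint8Array, offset: number, value: string): void {
    block.set(encoder.encode(value), offset);
}

// Write a number as zero-padded octal text; the final byte of the field
// stays NUL because the block starts out zero-filled.
function writeOctal(block: Uint8Array, offset: number, length: number, value: number): void {
    writeString(block, offset, value.toString(8).padStart(length - 1, "0"));
}

// Hypothetical helper: build the header block for one regular file.
function createHeader(name: string, size: number, mtime: Date): Uint8Array {
    if (encoder.encode(name).length > 100) {
        throw new RangeError("name is too long for this sketch");
    }

    const block = new Uint8Array(BLOCK_SIZE);
    writeString(block, 0, name);                                    // file name
    writeOctal(block, 100, 8, 0o644);                               // mode
    writeOctal(block, 108, 8, 0);                                   // uid
    writeOctal(block, 116, 8, 0);                                   // gid
    writeOctal(block, 124, 12, size);                               // size in bytes
    writeOctal(block, 136, 12, Math.floor(mtime.getTime() / 1000)); // mtime in seconds
    writeString(block, 148, "        ");                            // checksum placeholder (8 spaces)
    writeString(block, 156, "0");                                   // type flag: regular file
    writeString(block, 257, "ustar");                               // magic, followed by NUL
    writeString(block, 263, "00");                                  // version

    // The checksum is the sum of all header bytes, computed while the
    // checksum field itself is filled with spaces.
    const checksum = block.reduce((sum, byte) => sum + byte, 0);
    writeString(block, 148, checksum.toString(8).padStart(6, "0") + "\0 ");
    return block;
}
```

A real archive repeats this for every entry, pads each file's content to a multiple of 512 bytes, and handles long names, directories, and other entry types, which is the kind of detail a library takes care of for us.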
Import the Tarball API
To import the Tarball API, we can use one of the following methods, depending on our application setup.
// In Node.js, Deno (JSR), Bun, Cloudflare Workers, Browsers (with bundler or import map)
import { Tarball } from "@ayonli/jsext/archive";
// Or in Deno with URL import
import { Tarball } from "https://ayonli.github.io/jsext/archive.ts";
// Or in the browser with URL import
import { Tarball } from "https://ayonli.github.io/jsext/esm/archive.js";
Create a tarball instance
There are two ways to create a Tarball instance: one is to use the new keyword to initiate an empty tarball, and the other is to load a .tar file with the static Tarball.load method.
Initiate an empty tarball
const tarball = new Tarball();
Load a tar file
The Tarball.load method accepts a readable stream instead of a file path or URL, meaning that it can load the tar file from anywhere. For example, from the file system:
// Load from the file system (even the browser's OPFS)
import { createReadableStream } from "@ayonli/jsext/fs";
const input = createReadableStream("/file/to/archive.tar");
// Pass the stream created above to Tarball.load
const tarball = await Tarball.load(input);
Or from an HTTP response:
// From the response of fetch
const res = await fetch("https://example.com/file/to/archive.tar");
const tarball = await Tarball.load(res.body!);
We can also load a compressed .tar.gz file:
const res = await fetch("https://example.com/file/to/archive.tar.gz");
const tarball = await Tarball.load(res.body!, { gzip: true });
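When it isn't known in advance whether an archive is compressed, one option is to inspect its first two bytes, since every gzip stream begins with the magic number 0x1f 0x8b. The sketch below assumes those leading bytes are already in memory. `looksLikeGzip` is a hypothetical helper, not part of JsExt.

```typescript
// Return true when the bytes begin with the gzip magic number (0x1f 0x8b).
function looksLikeGzip(head: Uint8Array): boolean {
    return head.length >= 2 && head[0] === 0x1f && head[1] === 0x8b;
}
```

The result could then be passed as the gzip option shown above. How the leading bytes are obtained (for example from a buffered copy of the data) is left out of this sketch.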
Add new files to the tarball
Now that we have a Tarball instance, we can add new files to it, even when the instance was loaded from an existing tar file that already contains files.
const file = new File(["Hello, World!"], "hello.txt", { type: "text/plain" });
tarball.append(file);
We can also add directories, or files with a directory path, to the tarball. If we add a file whose directory doesn’t exist in the tarball yet, the directory is created for us automatically. For example, the following code puts the file foo/bar.txt into the tarball and automatically creates the foo directory first.
const file = new Blob(["This is some content"], { type: "text/plain" });
tarball.append(file, { relativePath: "foo/bar.txt" });
// Now the tarball will have both the `foo` directory and the `bar.txt` file in the directory.
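The automatic directory creation can be pictured with a small, library-independent sketch: split the relative path into segments and collect every parent directory that isn't known yet. Both function names are hypothetical, and how JsExt tracks existing entries internally is not covered by the article, so a plain Set of paths stands in for that here.

```typescript
// List the directory paths that must exist before `relativePath` can be stored.
function parentDirectories(relativePath: string): string[] {
    const segments = relativePath.split("/").filter(Boolean);
    const dirs: string[] = [];

    // Every proper prefix of the path is a parent directory.
    for (let i = 1; i < segments.length; i++) {
        dirs.push(segments.slice(0, i).join("/"));
    }

    return dirs;
}

// Return the parent directories that are missing from `existing`.
function missingDirectories(existing: Set<string>, relativePath: string): string[] {
    return parentDirectories(relativePath).filter(dir => !existing.has(dir));
}
```

For foo/bar.txt and an empty set, the only missing directory is foo, which matches the behavior described above.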
Retrieve files from the tarball
We can also retrieve the files that have been saved to the tarball previously, for example:
import { readAsText } from "@ayonli/jsext/reader";
const entry = tarball.retrieve("hello.txt")!;
const content = await readAsText(entry.stream);
console.log(content); // Hello, World!
List all entries in the tarball
There are two ways to show the entries in the tarball: one is to iterate over the tarball itself, and the other is to use the treeView method to create a tree view in which we can see the entries and their sub-entries in hierarchical order.
Iterate the tarball
for (const entry of tarball) {
    if (entry.kind === "directory") {
        console.log(`Directory: ${entry.name}; Path: ${entry.relativePath}`);
    } else {
        console.log(`File: ${entry.name}; Path: ${entry.relativePath}; Size: ${entry.size}; Last-Modified: ${entry.mtime}`);
    }
}
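The same iteration can feed a quick summary of an archive. The sketch below is independent of the library: `EntryLike` is a hypothetical shape that only declares the kind and size fields used in the loop above, and `summarize` is a hypothetical helper.

```typescript
// Hypothetical minimal shape, limited to the fields this sketch needs.
interface EntryLike {
    kind: string;
    size: number;
}

// Count the files and directories and add up the sizes of the files.
function summarize(entries: EntryLike[]): { files: number; directories: number; bytes: number } {
    let files = 0;
    let directories = 0;
    let bytes = 0;

    for (const entry of entries) {
        if (entry.kind === "directory") {
            directories++;
        } else {
            files++;
            bytes += entry.size;
        }
    }

    return { files, directories, bytes };
}
```

Since a tarball is iterable, something like `summarize([...tarball])` should work, assuming its entries carry these two fields as the loop above suggests.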
Show the tarball contents with a tree view
const tree = tarball.treeView();
console.log(tree);
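The article doesn't show the exact shape of the value returned by treeView, so the following is only a sketch of the general idea: turning a flat list of relative paths into a hierarchy and printing it with indentation. `TreeNode`, `buildTree`, and `renderTree` are hypothetical names and do not mirror JsExt's own types.

```typescript
interface TreeNode {
    name: string;
    children: Map<string, TreeNode>;
}

// Build a hierarchy from flat, slash-separated relative paths.
function buildTree(paths: string[]): TreeNode {
    const root: TreeNode = { name: "(root)", children: new Map() };

    for (const path of paths) {
        let node = root;

        for (const segment of path.split("/").filter(Boolean)) {
            let child = node.children.get(segment);

            if (!child) {
                child = { name: segment, children: new Map() };
                node.children.set(segment, child);
            }

            node = child;
        }
    }

    return root;
}

// Render the children of a node as lines, indenting two spaces per level.
function renderTree(node: TreeNode, depth = 0): string[] {
    const lines: string[] = [];

    node.children.forEach(child => {
        lines.push("  ".repeat(depth) + child.name);
        lines.push(...renderTree(child, depth + 1));
    });

    return lines;
}
```

Calling `renderTree(buildTree(["hello.txt", "foo", "foo/bar.txt"]))` yields one line per entry, with bar.txt indented under foo.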
Save the tarball to a file
Similar to the initiation process, the Tarball API doesn’t provide a save method that is bound to the file system. Instead, it provides a stream method that returns a ReadableStream, which we can pipe to a file in the file system:
// Save the tarball to the file system (even the browser's OPFS)
import { createWritableStream } from "@ayonli/jsext/fs";
const output = createWritableStream("/file/to/archive.tar");
await tarball.stream().pipeTo(output);
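Or we can upload the tarball to a URL (note that some runtimes, such as Chromium-based browsers and Node.js, additionally require the duplex: "half" fetch option when the body is a stream):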
const res = await fetch("https://example.com/path/to/archive.tar", {
    method: "PUT",
    body: tarball.stream(),
    headers: {
        "Content-Type": "application/x-tar",
    },
});
Also, we can compress the tarball before saving it:
const res = await fetch("https://example.com/path/to/archive.tar.gz", {
    method: "PUT",
    body: tarball.stream({ gzip: true }),
    headers: {
        "Content-Type": "application/gzip",
    },
});
At the end
The Tarball API also has other methods, such as remove to remove entries from the tarball, and replace to replace entries in the tarball.
Besides the Tarball class, there are also a tar function and an untar function, which simplify working with .tar files in the file system. They mirror the behavior of the tar -c and tar -x commands in Unix/Linux systems and could be even more useful in specific scenarios.
If you’re interested, visit the GitHub page of the JsExt library. This library aims to be an extension to the JavaScript language; it’s written with modern Web standards and works in almost all runtimes. Who knows, maybe some of its modules are just what you need.
Some of its outstanding features are:
- Various useful functions for built-in data types that the language itself doesn’t provide.
- Various utility functions to extend the ability of flow control.
- Multi-threaded JavaScript with parallel threads.
- File system APIs for both server and browser environments.
- Open dialogs in both CLI and web applications.
- Manipulate file system paths and URLs in the same way.
- Handle byte arrays and readable streams effortlessly.
- Create, extract and preview archives in all runtimes.
- And many more…