Sqlite3 WebAssembly

Something that would be really fun would be to run SQLite in-memory in a browser but use the same tricks as Litestream and Cloudflare Durable Objects (https://simonwillison.net/2024/Oct/13/zero-latency-sqlite-st...) to stream a copy of the WAL log to a server (maybe over a WebSocket, though intermittent fetch() POST would work too).

Then on subsequent visits use that server-side data to rehydrate the client-side database.
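
A minimal sketch of the save-and-rehydrate half of that idea, using Python's standard `sqlite3` module (the same module Pyodide exposes in a browser). Note that this ships a full SQL dump via `iterdump()`, not a stream of WAL frames, so it only illustrates the round trip. The `snapshot` and `rehydrate` helpers and the `notes` table are hypothetical names, and the "server" is just a local variable standing in for an upload.

```python
import sqlite3


def snapshot(conn: sqlite3.Connection) -> str:
    """Serialize the whole database as a SQL script (a full copy, not a WAL delta)."""
    return "\n".join(conn.iterdump())


def rehydrate(sql_script: str) -> sqlite3.Connection:
    """Rebuild a fresh in-memory database from a previously saved script."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(sql_script)
    return conn


# First visit: create some data client-side.
client = sqlite3.connect(":memory:")
client.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
client.execute("INSERT INTO notes (body) VALUES ('hello')")
client.commit()

# Stand-in for the upload: a real app would send this string to a server.
saved_on_server = snapshot(client)

# Later visit: fetch the script back and rebuild the database from it.
restored = rehydrate(saved_on_server)
rows = restored.execute("SELECT body FROM notes").fetchall()
print(rows)
```

A full dump grows with the database, which is exactly the cost that incremental, WAL-style replication is meant to avoid.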

From https://sqlite.org/forum/info/50a4bfdb294333eec1ba4749661934... it looks like WAL mode is excluded from the default SQLite WASM build so you would have to go custom with that.

There are many layers of "that's not how it works" at play here.

In-memory SQLite databases don't use WAL. Wasm (and browser Wasm, in particular) doesn't support anything like the shared memory APIs SQLite wants for its WAL mode.
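
The in-memory point can be checked with a short sketch using Python's standard `sqlite3` module on a native build (not Wasm). SQLite reports the journal mode actually in effect after a `PRAGMA journal_mode=WAL` request, so an in-memory database answers `memory` while a file-backed one normally answers `wal`. The `demo.db` file name is an arbitrary choice for the example.

```python
import os
import sqlite3
import tempfile

# In-memory database: the request for WAL is ignored.
mem = sqlite3.connect(":memory:")
mem_mode = mem.execute("PRAGMA journal_mode=WAL").fetchone()[0]
mem.close()

# File-backed database on a native build: WAL is normally accepted.
tmpdir = tempfile.mkdtemp()
disk = sqlite3.connect(os.path.join(tmpdir, "demo.db"))
file_mode = disk.execute("PRAGMA journal_mode=WAL").fetchone()[0]
disk.close()

print(mem_mode, file_mode)
```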

Litestream requires a very precise WAL setup to work (which just so happens to work with the default native SQLite setup, but is hard to replicate with Wasm).

Cloudflare Durable Objects may have been inspired by Litestream but works very differently (as do LiteFS, Turso, etc…)

The general idea of streaming changes from SQLite would work, but it's a lot of work, and the concurrency model of in-browser Wasm will make it challenging to implement.

(I wrote that forum post some time ago, and have WAL working in a server side Wasm build of SQLite, but none of the options to make it work would make much sense, or be possible, in browser)

As someone who uses sqlite fairly regularly, but doesn't understand what most of those paragraphs mean, do you have any recommendations for learning resources?

I'm gathering that I need to learn about:
- WAL
- Shared Memory APIs
- Concurrency models
- Durable Objects?

WAL: Write ahead log, common strategy for DBs (sqlite, postgres, etc.) to improve commit performance. Instead of fsync()ing every change, you just fsync() a log file that contains all the changes and then you can fsync() the actual changes at your leisure
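
A small sketch of that behaviour with Python's standard `sqlite3` module on a native build: committed changes land in a `-wal` file next to the database first, and a checkpoint later moves them into the main file. The `demo.db` name and the table `t` are arbitrary choices for the example.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
wal_path = db_path + "-wal"  # SQLite keeps the log next to the database file

# isolation_level=None puts the connection in autocommit mode,
# so each statement below is its own committed transaction.
conn = sqlite3.connect(db_path, isolation_level=None)
conn.execute("PRAGMA journal_mode=WAL").fetchone()
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")

# The committed rows are sitting in the log at this point.
wal_size_after_commit = os.path.getsize(wal_path)

# A checkpoint copies the logged pages into the main database file;
# the TRUNCATE variant then resets the log to zero bytes.
conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
wal_size_after_checkpoint = os.path.getsize(wal_path)

rows = conn.execute("SELECT x FROM t").fetchall()
conn.close()
print(wal_size_after_commit, wal_size_after_checkpoint, rows)
```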

Shared memory API: If you want to share (mutable) data between multiple processes, you need some kind of procedure in place to manage that. How do you get a reference to the data to multiple processes, how do you make sure they don't trample each other's writes, etc.
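
As a rough illustration of the "get a reference to the data" part, here is a sketch using Python's `multiprocessing.shared_memory`. It is not the mechanism SQLite itself uses for WAL, just an example of a shared memory API. Both handles live in one process to keep it short; a second process could attach by the same name. It deliberately skips the harder part, coordinating writers, which would need a lock or similar.

```python
from multiprocessing import shared_memory

# "Writer" creates a named block; the name is how other processes find it.
writer = shared_memory.SharedMemory(create=True, size=16)
writer.buf[:5] = b"hello"

# "Reader" attaches to the same block by name and sees the same bytes.
reader = shared_memory.SharedMemory(name=writer.name)
seen = bytes(reader.buf[:5])

# Each handle is closed, then the block itself is freed exactly once.
reader.close()
writer.close()
writer.unlink()

print(seen)
```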

Concurrency model: There are many different ways you can formalize concurrent processes and the way they interact (message passing, locking, memory ordering semantics, etc.). Different platforms will expose different concurrency primitives that may not work the same way as other platforms and may require different reasoning or code structure

Durable objects - I think this is some Cloudflare service where they host data that can be read or modified by your users

This is all from memory, but IME, GPT is pretty good for asking about concepts at this level of abstraction

Thank you!

And side note on your last point - I've been burned too many times by confident hallucinations to trust my foundational learning to GPT. I hope someday that will improve, but for now ChatGPT is as trustworthy as an evening chat with someone at the bar.

... Someone who has been drinking since happy hour.

Slight point of confusion: that page says:

> These components were initially released for public beta with version 3.40 and will tentatively be made API-stable with the 3.41 release, pending community feedback.

But the most recent release of SQLite is 3.46.1 (from 2024-08-13)

Presumably they are now "API-stable" but the page hasn't been updated yet.

It would be great if the SQLite team published an official npm package bundling the WASM version, could be a neat distribution mechanism for them.

My favourite version of SQLite-in-WASM remains the Pyodide variant, which has been around since long before the official SQLite implementation. If you use Pyodide you get a WASM SQLite for free as part of the Python standard library - I use that for https://lite.datasette.io/ and you can also try it out on https://pyodide.org/en/stable/console.html

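    import sqlite3
    # Ask the SQLite library bundled with Pyodide for its version
    conn = sqlite3.connect(':memory:')
    conn.execute('select sqlite_version()').fetchone()[0]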

That returns 3.39.0 from 2022-06-25 so Pyodide could do with a version bump. Looks like it inherits that version from emscripten: https://github.com/emscripten-core/emscripten/blob/main/tool...

sqlite-wasm loads much faster than Pyodide, so if you don't need Python, then the former is a better choice.


    npm install @sqlite.org/sqlite-wasm

I built my own little demo page here: https://tools.simonwillison.net/sqlite-wasm

With the help of Claude, though it incorrectly hallucinated some of the details despite me pasting in documentation: https://gist.github.com/simonw/677c3794051c4dfeac94e514a8e5b...

So after downloading from the official downloads page and stripping away all the mjs files and "bundler-friendly" files, a minimal sqlite wasm dependency will be about 1.3MB.

For an in-browser app, that seems a bit much but of course wasm runs in other places these days where it might make more sense.

It's a good consideration, together with the fact that browsers already have IndexedDB built in. There is still a use case for in-browser apps like Figma, Photoshop-like editors, or ML apps, where the application code and data are very big anyway, so 1.3MB may not add that much.

Also worth considering: parsing of wasm is significantly faster than JS (unfortunately I couldn't find the source for this claim, but there is at least one great article on the topic).


The thing to keep in mind is that the WebAssembly sandbox model means that in theory the program (SQLite in this case) can run wherever it makes sense to run it. That might mean running it locally or it might mean running on a central server or it might mean running nearby on the “edge”.

I’ve been looking for a TanStack Query style library that is backed by SQLite (backed by OPFS or some other browser storage) and syncs with an API in the background. Does anything like that exist? I’ve seen ElectricSQL and other sync engines but they are a bit opinionated. I’m pretty new to local-first but I feel like the developer ergonomics are not quite there yet.

Meanwhile for “local-only” it would be great to use SQLite in the browser + native file system API so that the db could be stored on the user’s file system and we wouldn’t have to worry about browser storage eviction. I think that could really open up a whole world of privacy-preserving offline software delivered through the browser.

ElectricSQL and friends seem to be the best option so far, but they all come with a lot of caveats. It feels like local-first is near, and it's so tantalizing, but I haven't seen anything that feels like it's done enough to build on just yet.

After years of being able to run SQLite on my mobile phone, my TV, my router and gaming consoles, I can finally run it in my browser. Which also happens to be running on the most powerful machine I own.

How long until we see WebAssembly/WebGPU become a platform independent choice for deploying server side code as well?

As soon as WASI is settled.

There are a number of WASM platforms/tools: Wasmer, wasmCloud, a few others that escape my memory.

I like it
And we loved you in Lethal Weapon.
We like that you like it