Having a separate isolate in each thread spawned through worker threads, with a minimal footprint of 10MB, does not seem like a high price to pay. It's not like you're going to spawn hundreds of them anyway, is it? You will very likely spawn at most as many threads as your CPU cores can run concurrently. You typically don't run a hundred OS threads; you use a thread pool and cap the concurrency by setting a maximum number of threads to spawn.
This is also how goroutines work under the hood: they are "green threads", an abstraction that operates on top of a much smaller OS thread pool.
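For what it's worth, a capped pool like that is only a few lines. This is a rough sketch, not a production pool: `runPool` and the squaring worker body are made up for illustration, and the default size assumes one worker per core is what you want.

```javascript
const os = require('node:os');
const { Worker } = require('node:worker_threads');

// Hypothetical worker body: squares whatever number it is sent.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (n) => parentPort.postMessage(n * n));
`;

// Run all tasks on at most `size` workers, reusing each worker for many tasks.
// The default assumes "one worker per core" is the desired cap.
function runPool(tasks, size = os.availableParallelism?.() ?? os.cpus().length) {
  return new Promise((resolve, reject) => {
    const results = new Array(tasks.length);
    if (tasks.length === 0) return resolve(results);

    const workers = [];
    let next = 0; // index of the next task to hand out
    let done = 0; // number of tasks finished so far

    // Stop every worker, then settle the promise.
    const finish = (settle, value) => {
      workers.forEach((w) => w.terminate());
      settle(value);
    };

    for (let i = 0; i < Math.min(size, tasks.length); i++) {
      const worker = new Worker(workerSource, { eval: true });
      workers.push(worker);
      let current = -1; // task index this worker is busy with

      const dispatch = () => {
        if (next < tasks.length) {
          current = next++;
          worker.postMessage(tasks[current]);
        }
      };

      worker.on('message', (value) => {
        results[current] = value;
        if (++done === tasks.length) finish(resolve, results);
        else dispatch();
      });
      worker.on('error', (err) => finish(reject, err));
      dispatch();
    }
  });
}
```

Calling `runPool([1, 2, 3, 4, 5], 2)` would run five tasks on two workers, so the isolate cost is paid twice rather than five times.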
Worker threads have constraints but most of them are intentional, and in many cases desirable.
I’d also add that SharedArrayBuffer doesn’t limit you to “shared counters or coordination primitives”. It’s just raw memory, so you could store structured data in it using your own memory layout. There are libraries out there that already implement higher-level data structures this way.
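To make the "own memory layout" point concrete, here is a minimal sketch of fixed-size records stored in a SharedArrayBuffer. The 16-byte layout and the `createStore` / `writeRecord` / `readRecord` helpers are invented for this example, not taken from any library.

```javascript
// Assumed layout, 16 bytes per record (little-endian):
//   offset 0: id    (Uint32)
//   offset 4: flags (Uint32)
//   offset 8: score (Float64)
const RECORD_SIZE = 16;

// Allocate shared memory for `capacity` records.
function createStore(capacity) {
  const buffer = new SharedArrayBuffer(capacity * RECORD_SIZE);
  return { buffer, view: new DataView(buffer) };
}

// Encode one record into its slot.
function writeRecord(store, index, { id, flags, score }) {
  const base = index * RECORD_SIZE;
  store.view.setUint32(base, id, true);
  store.view.setUint32(base + 4, flags, true);
  store.view.setFloat64(base + 8, score, true);
}

// Decode one record back into a plain object.
function readRecord(store, index) {
  const base = index * RECORD_SIZE;
  return {
    id: store.view.getUint32(base, true),
    flags: store.view.getUint32(base + 4, true),
    score: store.view.getFloat64(base + 8, true),
  };
}

const store = createStore(4);
writeRecord(store, 2, { id: 7, flags: 1, score: 0.5 });
console.log(readRecord(store, 2)); // { id: 7, flags: 1, score: 0.5 }
```

`store.buffer` can be handed to a worker, which wraps it in its own DataView and sees the same bytes. DataView reads and writes are not atomic, so concurrent writers to the same record would still need coordination through `Atomics` on a typed array.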
They’re heavy, they don’t share the entire process memory space (ie can’t reference functions), and I believe their imports are separate from each other (ie reparsed for each worker into its own memory space).
In many ways they’re closer to subprocesses in other languages, with limited shared memory.
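A quick way to see the "can't reference functions" part is the global `structuredClone` (Node 17+), which, as far as I know, follows the same structured clone rules that `postMessage` uses. The `cloneOutcome` helper below is just for illustration.

```javascript
// Plain data is copied: the clone is a separate object graph.
const original = { name: 'job-1', payload: [1, 2, 3] };
const copy = structuredClone(original);
copy.payload.push(4);
console.log(original.payload.length, copy.payload.length); // 3 4

// Report whether a value survives cloning, or the name of the error it raises.
function cloneOutcome(value) {
  try {
    structuredClone(value);
    return 'cloned';
  } catch (err) {
    return err.name;
  }
}

console.log(cloneOutcome({ n: 1 }));     // cloned
console.log(cloneOutcome({ run() {} })); // DataCloneError
```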
It’s not “clean” to spin up thousands of threads, but it does work and sometimes it’s easier to write and reason about than a whole pipeline of distributing work to worker threads. I probably wouldn’t do it in a server, but in a CLI I would totally do something like spawn a thread for each file a user wants to analyze and let the OS do task scheduling for me. If they give me a thousand files, they get a thousand threads. That overhead is pretty minimal with OS threads (on Linux, Windows is a different beast).
However, apart from atomic "plain" memory, no objects are directly shared (for Node/V8 they live in so-called Isolates, iirc), so from a logical standpoint they're kinda like a process.
The underlying reason is that in JavaScript objects are by default open to modification, ie:
const t = {x:1,y:2};
t.z = 3;
console.log(t); // => { x: 1, y: 2, z: 3 }
To get sane performance out of JS there are a ton of tricks the runtime does under the hood. The bad news is that in a multithreaded scenario those would all be either slow (think Python GIL) or heavily exploitable. If you've done multithreaded C/C++ work and touched upon Erlang, the JS Worker design is the logical conclusion: message passing works for small packets (work orders, structured cloning), whilst shipping large data can be problematic with cloning.
This is why SharedArrayBuffers allow no-copy sharing: the plain memory arrays they expose don't offer any security surprises in terms of code execution (Spectre-style attacks are another story), and they also allow for work subdivision if needed.
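As a rough sketch of the no-copy part: the worker below writes into memory the parent allocated, and the parent reads the result without any message carrying the data back. `addInWorker` and the inline worker source are hypothetical names for this example.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical worker body: adds `amount` to slot 0 of the shared array.
const workerSource = `
  const { workerData } = require('node:worker_threads');
  const cells = new Int32Array(workerData.shared);
  Atomics.add(cells, 0, workerData.amount); // lands in the parent's memory
`;

// Start at 1, let a worker add `amount`, then read the result from shared memory.
function addInWorker(amount) {
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const cells = new Int32Array(shared);
  cells[0] = 1;

  return new Promise((resolve, reject) => {
    // The SharedArrayBuffer inside workerData is shared, not cloned.
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { shared, amount },
    });
    worker.on('error', reject);
    worker.on('exit', () => resolve(Atomics.load(cells, 0)));
  });
}

addInWorker(41).then((value) => console.log(value)); // 42
```

For work subdivision the same idea extends naturally: give each worker the same buffer plus a disjoint index range to fill.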
In terms of tradeoffs, if you’re coming from the single event loop model, they’re pretty consistent with the rest of JS. Isolation-first, explicit sharing, fewer footguns. So I think the tradeoffs are the right tradeoffs.
FWIW, traditional threads have their own tradeoffs (especially around IO). In JS that’s mostly a non-issue, so the "I need 1000s of threads" case just doesn’t come up very often.
A worker thread or Web Worker runs in its own isolate, so it needs to initialise it by parsing and executing its entry point. I'm not quite sure whether that's something that already happens but you could imagine optimising this by caching or snapshotting the initial state of an isolate when multiple workers use the same entry point, so new workers can start faster.
That cannot be done with the original main thread isolate because usually the worker environment has both different capabilities than the main isolate and a different entry point.
If I have to handle 1000 files in a small CLI I would probably just use Node.js asynchronous IO in a single thread and let it handle platform specifics for me! You’ll get very good throughput without having to handle threads yourself.
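Something like this is what I have in mind, as a sketch only: `countLines` stands in for whatever the real analysis is, and the limit of 64 in-flight reads is an arbitrary assumption to avoid opening every file at once.

```javascript
const fs = require('node:fs/promises');

// Hypothetical analysis step: count the lines in one file.
async function countLines(path) {
  const text = await fs.readFile(path, 'utf8');
  return text.split('\n').length;
}

// Process every path on a single thread, with at most `limit` reads in flight.
async function analyzeAll(paths, limit = 64) {
  const results = new Array(paths.length);
  let next = 0; // index of the next file to pick up

  // Each lane pulls the next unprocessed file until none remain.
  async function lane() {
    while (next < paths.length) {
      const i = next++;
      results[i] = await countLines(paths[i]);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, paths.length) }, lane);
  await Promise.all(lanes);
  return results;
}
```

Usage would be along the lines of `analyzeAll(process.argv.slice(2)).then(console.log)`. The results come back in the same order as the input paths.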