Run NanoClaw in Docker Sandboxes
https://nanoclaw.dev/blog/nanoclaw-docker-sandboxes/The biggest one (as Karpathy notes) is having skills for how to write a (slack, discord, etc) integration, instead of shipping an implementation for each.
Call it “Claude native development” if you will, but “fork and customize” instead of batteries-included platforms/frameworks is going to be a big shift when it percolates through the ecosystem.
A bunch of things you need to figure out, eg how do you ship a spec for how to test and validate the thing, make it secure, etc.
How long before OSs start evolving in this way? You can imagine Auto research-like sharing and promotion upstream of good fixes/approaches, but a more heterogenous ecosystem could be more resistant to attacks if each instance had a strong immune system.
This threat model is concerned with running arbitrary code generated by or fetched by an AI agent on host machines which contain secrets, sensitive files, and/or exfoliate data, apps, and systems which should not be lost.
What about the threat model where an agent deletes your entire inbox? Or sends your calendar events to a server after prompt injection? Bank transfers of the wrong amount to the wrong address etc. all these are allowed under the sandboxing model.
We need fine grained permissions per-task or per-tool in addition to sandboxing. For example: "this request should only ever read my gmail and never write, delete, or move emails".
Sandboxes do not solve permission escalation or exfiltration threats.
It's also the first project I've used where Claude Code is the setup and configuration interface. It works really well, and it's fun to add new features on a whim.
Some update broke the OpenRouter integration and I haven't been able to fix the issue. I took a quick look at the code, hoping to narrow it down and it's pretty much exactly what you would expect, there's hidden configuration files everywhere and in general it's just a lot of code for what's effectively a for loop with Whatsapp integration (in my case :)).
Not to mention that their security model doesn't match my deployment (rootless and locked down Kubernetes container) so every Openclaw update seemed to introduce some "fix" for a security issue that broke something else to solve a problem I do not have in the first place :)
I've switched to https://github.com/nullclaw/nullclaw instead. Mostly because Zig seems very interesting so if I have to debug any issues with Nullclaw at least I'll be learning something new :)
On Linux, however, I absolutely don't want a hypervisor on my quite underpowered single-board server. Linux namespaces are enough for what I want from them (i.e. preventing one of these agent harnesses to hijack my memory, disk, or CPU). I wonder why neither OpenClaw nor NanoClaw seem to offer a sanely configured, prebuilt, and frequently updated Docker image?
It does not really matter.
IMHO, until you figure out useful ways to spend tokens to do useful tasks the runtime should be a second thought.
As far as security goes, running LLM in a container in just simply not enough. What matters is not what files it can edit on your machine but what information it can access. And the access in this case as far as these agents are concerned is basically everything. If this does not scare you you should not be thinking about containers.
I've been thinking about how docker support would work, so I'll check this out!
In other words, Claude is the compiler.
It's better if your app's description just tells me what it does in a direct way using plain language. It's fine to tell me it's an alternative to something, but that should be in addition to rather than instead of your own description.
This has been my setup since early this year, not even that much code: https://github.com/hofstadter-io/hof/tree/_next/lib/agent/se...
The bigger effort is making it play nice with vscode so you can browse and edit the files and diffs.