I keep telling folks that they need to think of an LLM (even a "local" one) as if they're farming the work out to JS code running in some dude's browser somewhere: it can't keep a secret, and a determined person can make it emit anything they like.

We need to be asking what the most devious and malicious output could be, and whether what we do with that output (e.g. arguments to command-line tools) would still be safe.
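To make the "arguments to command-line tools" case concrete, here is a minimal sketch of treating model output as hostile input before it reaches a process. Everything named here (`ALLOWED_FLAGS`, `validate_command`, `RejectedCommand`, the choice of `ls` and `wc`) is hypothetical and only for illustration; a real policy would be stricter, and this one does not account for symlinks.

```python
import re
import shlex

# Hypothetical policy: which programs may run, and which flags each accepts.
ALLOWED_FLAGS = {
    "ls": {"-l", "-a"},
    "wc": {"-l", "-w"},
}

# Positional arguments must look like plain relative paths.
SAFE_PATH = re.compile(r"[A-Za-z0-9_.][A-Za-z0-9_./-]*")


class RejectedCommand(ValueError):
    """Raised when model output does not fit the policy."""


def validate_command(untrusted: str) -> list[str]:
    """Turn model output into an argv list, or raise RejectedCommand."""
    try:
        argv = shlex.split(untrusted)
    except ValueError as exc:  # e.g. unbalanced quotes
        raise RejectedCommand(str(exc)) from exc
    if not argv or argv[0] not in ALLOWED_FLAGS:
        raise RejectedCommand(f"program not allowed: {argv[:1]}")
    for arg in argv[1:]:
        if arg.startswith("-"):
            if arg not in ALLOWED_FLAGS[argv[0]]:
                raise RejectedCommand(f"flag not allowed: {arg}")
        elif not SAFE_PATH.fullmatch(arg) or ".." in arg.split("/"):
            # Rejects absolute paths, parent traversal, and shell metacharacters.
            raise RejectedCommand(f"argument not allowed: {arg}")
    return argv
```

The returned list is meant to be passed to something like `subprocess.run(argv, shell=False)`, so no shell ever interprets the string. The point is to start from the most malicious output you can imagine (`ls; rm -rf /`, `ls ../secrets`) and check that it gets rejected, rather than trusting the model to behave.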

From my perspective, everyone is doing it. It’s security through obscurity - obviously if you’re harboring users’ credit card numbers or personal details, maybe take heed. But if you’re a regular, run-of-the-mill CRUD application, every other company is ALSO throwing caution to the wind. When hundreds of thousands of credentials are leaked into the funnel, does it really matter?

I’m at a small company, and I try to push for security as much as I can, but the stakeholders truly do not care. They want to move fast. It’s just part of the new world, I guess. If we get hit by attackers? I don’t know what happens. Sorry, we told you not to - you wanted to move quick and break stuff, and this is how that culminates.

I’m sure I’m not the only one.

The answer to that question seems obvious: No, it is not safe.

Yet with tens of millions of developers using these tools, there have not been widespread incidents of this sort as far as I know.

So it leaves me with a few choices:

- manually review and approve each command: obviously not realistic, you would end up just clicking Approve

- use a sandbox, and hope the exploit is not devious enough to get out anyway once you run or open the project outside of it

- use AI without web access and limit other external dependencies

- don't use agentic AI

- use the Claude or Codex auto-approval classifier and hope for the best

Personally, I'm going with the last option for now.

We do have ways to avoid giving an LLM any secrets, but keeping secrets away from it needs to be the simple, default solution.
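One such way, as a rough sketch: run the agent's tools with an allowlisted environment instead of letting them inherit yours. `SAFE_VARS`, `minimal_env`, and `run_tool` are made-up names for illustration, and this only covers environment variables; credentials sitting in files such as dotfiles would need separate handling.

```python
import os
import subprocess

# Hypothetical allowlist of variables the agent's tools actually need.
SAFE_VARS = ("PATH", "LANG", "TERM")


def minimal_env(parent: dict[str, str]) -> dict[str, str]:
    """Copy only allowlisted variables; everything else is dropped."""
    return {name: parent[name] for name in SAFE_VARS if name in parent}


def run_tool(argv: list[str]) -> str:
    """Run a tool on the agent's behalf without inheriting the full environment."""
    result = subprocess.run(
        argv,
        env=minimal_env(dict(os.environ)),  # API keys and tokens are not passed on
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

An allowlist rather than a blocklist is the design choice that makes this a default: a new secret added to your shell tomorrow stays out of the tool's environment without anyone remembering to exclude it.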