We need to be asking what the most devious and malicious output could be, and whether what we do with that output (e.g. arguments to command-line tools) would still be safe.
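To make the command-line case concrete, here is a minimal sketch of treating model output as hostile before it reaches a tool. It assumes a hypothetical setup where the model supplies a search pattern and a path for `grep`; the names `validate_path_arg`, `build_grep_command`, and `run_grep`, and the allowlist regex, are illustrative and not taken from any particular agent framework.

```python
import re
import subprocess
from typing import List

# Hypothetical allowlist: relative paths built from a conservative character
# set. The first character may not be "-", "/" or ".", so the value cannot be
# parsed as an option, cannot be absolute, and cannot start with "..".
SAFE_PATH = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_./-]*")


def validate_path_arg(value: str) -> str:
    """Return value unchanged if it looks like a safe relative path, else raise."""
    if not SAFE_PATH.fullmatch(value):
        raise ValueError(f"rejected untrusted argument: {value!r}")
    if ".." in value.split("/"):
        raise ValueError(f"rejected path traversal: {value!r}")
    return value


def build_grep_command(pattern: str, path: str) -> List[str]:
    """Build an argv list; the untrusted strings never pass through a shell."""
    # "-e" marks the next argument as the pattern even if it starts with "-",
    # and "--" ends option parsing before the path.
    return ["grep", "-rn", "-e", pattern, "--", validate_path_arg(path)]


def run_grep(pattern: str, path: str) -> subprocess.CompletedProcess:
    # No shell=True: metacharacters such as ";" or "$( )" stay literal.
    return subprocess.run(
        build_grep_command(pattern, path),
        capture_output=True,
        text=True,
        timeout=10,
        check=False,
    )
```

This only covers argument injection for one tool. It says nothing about what the tool does with a valid-looking argument, so it complements a sandbox rather than replacing one.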
I’m at a small company, and I push for security as much as I can, but the stakeholders truly do not care. They want to move fast. It’s just part of the new world, I guess. If we get hit by attackers, I don’t know what happens. Sorry, we told you not to; you wanted to move fast and break things, and this is how that culminates.
I’m sure I’m not the only one.
Yet with tens of millions of developers using these tools, there have not been widespread incidents of this sort as far as I know.
So it leaves me with a few choices:
- manually review and approve each command: obviously not realistic, since you would end up just clicking Approve
- use a sandbox, and hope the exploit is not devious enough to escape it or to lie dormant until you run or open the project outside the sandbox
- use AI without web access and limit other external dependencies
- don't use agentic AI
- use the Claude or Codex auto-approval classifier and hope for the best
Personally, I'm going with the last option for now.