Let's talk about AI and end-to-end encryption
https://blog.cryptographyengineering.com/2025/01/17/lets-talk-about-ai-and-end-to-end-encryption/

No need for presumption here: OpenAI is quite transparent about the fact that they retain data for 30 days and have employees and third-party contractors look at it.
https://platform.openai.com/docs/models/how-we-use-your-data
> To help identify abuse, API data may be retained for up to 30 days, after which it will be deleted (unless otherwise required by law).
https://openai.com/enterprise-privacy/
> Our access to API business data stored on our systems is limited to (1) authorized employees that require access for engineering support, investigating potential platform abuse, and legal compliance and (2) specialized third-party contractors who are bound by confidentiality and security obligations, solely to review for abuse and misuse.
I think Apple recently changed their stance on this. Now, they say that "source code for certain security-critical PCC components are available under a limited-use license." Of course, I would have loved it if the whole thing were open source. ;)
https://github.com/apple/security-pcc/
> The goal of this system is to make it hard for both attackers and Apple employees to exfiltrate data from these devices.
I think Apple is claiming more than that. They are saying 1/ they don't keep any user data (data only gets processed during inference), 2/ no privileged runtime access, so their support engineers can't see user data, and 3/ they make binaries and parts of the source code available to security researchers to validate 1/ and 2/.
You can find Apple PCC's five requirements here: https://security.apple.com/documentation/private-cloud-compu...
Disclosure: I'm not affiliated with Apple. I read through their PCC security guide to see what an equivalent solution would look like in open source. If anyone is interested in this topic, please hit me up at ozgun @ ubicloud . com.
Yes. I made that point a few weeks ago. The legal concept of principal and agent applies.
Running all content through an AI in the cloud to check for crimethink[1] is becoming a reality. Currently proposed:
- "Child Sexual Abuse Material", which is a growing category that now includes AI-generated images in the US and may soon extend to Japanese animation.
- Threats against important individuals. This may be extended to include what used to be considered political speech in the US.
- Threats against the government. Already illegal in many countries. Bear in mind that Trump likes to accuse people of "treason" for things other than making war against the United States.
- "Grooming" of minors, which is vague enough to cover most interactions.
- Discussing drugs, sex, guns, gay activity, etc. Variously prohibited in some countries.
- Organizing protests or labor unions. Prohibited in China and already searched for.
Note that talking around the issue or using jargon won't evade censorship; LLMs can deal with that. Run some Ebonics or leetspeak through an LLM and ask it to translate it into standard English, and the translation will succeed. The LLM has probably seen more of that dialect than most people have.
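To make that concrete, here's a minimal sketch using the OpenAI Python SDK purely as an illustration; the model name and the obfuscated string are my own examples, not anything from the thread.

```python
# Minimal sketch: ask a chat model to normalize obfuscated text.
# Assumptions: OpenAI Python SDK (v1), OPENAI_API_KEY in the environment,
# and an arbitrary model name; the obfuscated string is just an example.
from openai import OpenAI

client = OpenAI()

obfuscated = "m33t m3 l8r @ th3 u5u4l pl4c3, u kn0 th3 0n3"

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Translate the user's message into plain standard English."},
        {"role": "user", "content": obfuscated},
    ],
)

print(resp.choices[0].message.content)
# Something like: "Meet me later at the usual place, you know the one."
```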
"If you want a vision of the future, imagine a boot stepping on a face, forever" - Orwell
> This future worries me because it doesn’t really matter what technical choices we make around privacy. It does not matter if your model is running locally, or if it uses trusted cloud hardware — once a sufficiently-powerful general-purpose agent has been deployed on your phone, the only question that remains is who is given access to talk to it. Will it be only you? Or will we prioritize the government’s interest in monitoring its citizens over various fuddy-duddy notions of individual privacy.
I do think there are interesting policy questions there. I mean, it could hypothetically be mandated that the government must be given access to the agent (in the sense that we and these companies exist in jurisdictions that can pass arbitrary laws; let's skip the boring, locale-specific discussion of whether you think your local government would pass such a law).
But on a technical level, it seems like it ought to be possible to run an agent locally, on a system with full disk encryption, and not allow anyone who doesn't have access to the system to talk with it, right? So on a technical level I don't see how this is any different from where we were previously. I mean, you could also run a bunch of regexes from the '80s to find out whether somebody has, whatever, communist pamphlets on their computer.
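For contrast, the pre-LLM version of that scan really is just a grep: walk the disk and match keywords. The paths and patterns below are made up for illustration; it only catches literal strings, which is exactly why it was easy to evade and why a semantic, agent-based scan is a step change.

```python
# The 1980s-style scan: walk a directory tree and grep for keywords.
# Paths and patterns are made up for illustration.
import os
import re

PATTERNS = [re.compile(p, re.IGNORECASE) for p in (r"\bmanifesto\b", r"\bpamphlet\b")]

def scan(root: str):
    """Yield (path, pattern) for every file containing a flagged keyword."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as f:
                    text = f.read()
            except OSError:
                continue
            for pat in PATTERNS:
                if pat.search(text):
                    yield path, pat.pattern

for path, pattern in scan(os.path.expanduser("~/Documents")):
    print(f"{path}: matched {pattern}")
```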
There’s always been a question of whether the government should be able to demand access to your computer. I guess it is good to keep in mind that if they are demanding access to an AI agent that ran on your computer, they are basically asking for a lossy record of your entire hard drive.
But while it wouldn't reveal user input, it would still reveal the model's outputs to the company. And yeah, as the article mentions, this kind of thing (MPC or fully homomorphic encryption) unfortunately probably won't be feasible for the most powerful ML models.
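For anyone who hasn't seen MPC before, here is a toy, pure-Python illustration of the idea using additive secret sharing: the client splits its input into random-looking shares, two non-colluding parties each compute a linear layer (a dot product with public weights) on their share, and only the recombined sum equals the real result. This is a concept sketch only; scaling anything like it to a frontier-sized model is the hard part the article is referring to.

```python
# Toy additive secret sharing over a prime field. The client splits its
# private input into two random-looking shares; party A and party B each
# compute a dot product with the public weights on their share alone, and
# only the recombined sum equals the true result. Neither party ever sees
# the input. A concept sketch only, not a practical protocol.
import secrets

P = 2**61 - 1  # prime modulus for the toy field

def share(x: int) -> tuple[int, int]:
    r = secrets.randbelow(P)
    return r, (x - r) % P

def dot(values, weights) -> int:
    return sum(v * w for v, w in zip(values, weights)) % P

user_input = [3, 1, 4, 1, 5]   # private data (e.g., token features)
weights    = [2, 7, 1, 8, 2]   # public model weights

shares   = [share(x) for x in user_input]
server_a = [a for a, _ in shares]   # looks like uniform noise to A
server_b = [b for _, b in shares]   # looks like uniform noise to B

partial_a = dot(server_a, weights)  # computed independently by party A
partial_b = dot(server_b, weights)  # computed independently by party B

result = (partial_a + partial_b) % P
print(result)                        # 35
print(dot(user_input, weights))      # 35, same as computing in the clear
```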
An app I'm building uses LLMs to process messages. I don't want the unencrypted message to hit my server, and ideally I wouldn't have the ability to decrypt it. But I can't communicate directly from the client to the LLM service without leaking the API key.
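The usual answer is a thin relay that injects the provider key server-side, which is exactly the dilemma: the plaintext has to pass through (and is visible to) that process. A minimal sketch, assuming a Flask relay in front of the OpenAI REST API (the /relay route and model name are placeholders):

```python
# Minimal Flask relay that injects the provider API key server-side.
# The /relay route and model name are placeholders. Note the problem:
# the plaintext message is visible to this process.
import os

import httpx
from flask import Flask, jsonify, request

app = Flask(__name__)
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]  # never shipped to the client

@app.post("/relay")
def relay():
    message = request.get_json()["message"]  # <-- unencrypted message hits the server here
    resp = httpx.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
        json={
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": message}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return jsonify(resp.json()["choices"][0]["message"])

if __name__ == "__main__":
    app.run(port=8080)
```

The ways out are roughly: per-user keys or short-lived tokens issued to the client so it can call the provider directly, or running the relay inside attested confidential-compute hardware the client can verify, which is what the PCC-style designs discussed elsewhere in the thread aim at.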
I wrote about Apple's Private Cloud Compute last year; for the foreseeable future, I still think server-side Confidential Computing is the most practical way to do processing without huge privacy risks: https://www.anjuna.io/blog/apple-is-using-secure-enclaves-to...
I see "AI" tools being used even more in the future to permanently tie people to monthly recurring billing services like iCloud, Microsoft's personal tier of Office 365, Google Workspace, etc. You'll pay $15 a month forever, and the amount of your data held by the cloud provider, and your dependency on it, will mean you have no viable path to ever stop paying without significant disruption to your life.
IMHO, Apple's PCC is a step in the right direction given where general AI privacy nightmares stand today. It's not a perfect system, since it's not fully transparent and auditable, and I do not like their new opt-out photo scanning feature running on PCC, but there really is a lot to be inspired by here.
We're going down this path at my startup, building on top of AWS Nitro and NVIDIA Confidential Computing to provide end-to-end encryption from the AI user to the model running on the enclave side of an H100. It's not widely known that you can do this with H100s, but I'd really like to see more of it over the next few years.
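To give a feel for the client side of that, here's a minimal sketch of the envelope idea using the `cryptography` package: encrypt the prompt to a public key that is supposed to exist only inside the attested enclave. Everything here is an illustration, not our actual protocol; in particular, the enclave keypair is generated locally just so the snippet runs, whereas in the real flow the public key comes out of a Nitro/GPU attestation document the client verifies first.

```python
# HPKE-style envelope around a prompt, using the `cryptography` package.
# The enclave keypair is generated locally ONLY so this snippet runs;
# in the real flow the client gets the enclave's public key from a verified
# Nitro/GPU attestation document, and the private key never leaves the enclave.
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def derive_key(shared_secret: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"prompt-envelope").derive(shared_secret)

# Stand-in for key material that, in reality, exists only inside the enclave.
enclave_private = X25519PrivateKey.generate()
enclave_public = enclave_private.public_key()

# Client side: ephemeral ECDH against the (attested) enclave key, then AES-GCM.
client_ephemeral = X25519PrivateKey.generate()
key = derive_key(client_ephemeral.exchange(enclave_public))
nonce = os.urandom(12)
ciphertext = AESGCM(key).encrypt(nonce, b"my private prompt", None)

# Enclave side: same derivation using the client's ephemeral public key.
key_enclave = derive_key(enclave_private.exchange(client_ephemeral.public_key()))
print(AESGCM(key_enclave).decrypt(nonce, ciphertext, None))  # b'my private prompt'
```

The attestation verification is the part doing the heavy lifting; the envelope itself is standard ephemeral ECDH plus AES-GCM.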