It is actually worse than that. It is at least 30 days. There is an "almost" that is doing a ton of heavy lifting here "deletion after 30 days in almost all cases". My read of that is they can hang onto data for as long as they want, even if they usually won't. And "all traffic" with an agentic harness is basically your entire codebase you work on.

> We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.

They seem to have changed the wording since you posted the comment, now specifying exactly 30 days with seemingly no exceptions.

These terms seem to be updated at will, though, so I'll take that with a grain of salt.

I'm not sure they can actually respect that 30 days absolute commitment. Let's say some internal tool flags a suspect conversation, it bubbles up and a human operator reads it and it looks like evidence of a crime. Then, that employee is legally bound in many jurisdictions to prevent the destruction of that piece of evidence.

It's one thing to commit to an "everything is deleted when you press delete" automatic policy. It's quite another to say "we'll keep some stuff for up to 30 days, look inside it for any malfeasance, then pinky promise we'll delete it".

It generally goes without saying that legal obligations must be met. Before this 30 day policy they already had to comply with subpoenas and government retention requests.

Same with CSAM policies for any cloud provider. Doesn’t matter what the retention policy says, if the law says otherwise, the law wins. And there is no obligation to spell out every law in every country that might change how data is handled.

They write "We will require 30-day retention for all traffic on Mythos-class models". For potentially criminal content, maybe it's not "we", but "the authorities" that require the retention?

... and now I wonder if "we require retention" leaves the door open to retention that is not required, but let's say convenient.

Yep. They changed the terms, which needs legal review in my org, but the Fable model was available immediately, so of COURSE people have to go and flock to it to see how much better it is. Amazing how easy it is to spend five figures on demand and have very little to show for it; meanwhile when I want to buy a piece of enterprise software for 40-50k/year I have to spend weeks or months building the case, providing justification for ROI etc.
From https://support.claude.com/en/articles/15425695-covered-mode..., emphasis mine:

> Prompts and model completions are retained for at least 30 days and then automatically deleted, unless they are subject to a safety investigation or we are legally required to maintain them.

They keep it as long as they want.

That's strange. Even in my hobby-toy app, I have a TOS that I bump whenever the terms meaningfully change, and in my app, it forces a re-acceptance of the new terms before using the app again.
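A minimal sketch of that kind of version-gated re-acceptance, in case it helps picture it. Everything here is hypothetical (the `User` type, `CURRENT_TOS_VERSION`, the function names); it only illustrates the idea of bumping a version number and blocking use until the new terms are accepted, not how any particular app does it.

```python
from dataclasses import dataclass

# Hypothetical: bumped by hand whenever the terms meaningfully change.
CURRENT_TOS_VERSION = 3


@dataclass
class User:
    name: str
    accepted_tos_version: int = 0  # 0 means the user never accepted any version


def needs_reacceptance(user: User) -> bool:
    """True if the user last accepted an older version of the terms."""
    return user.accepted_tos_version < CURRENT_TOS_VERSION


def accept_tos(user: User) -> None:
    """Record that the user accepted the current version of the terms."""
    user.accepted_tos_version = CURRENT_TOS_VERSION


def open_app(user: User) -> str:
    """Decide which screen to show: the terms prompt or the app itself."""
    if needs_reacceptance(user):
        return "show_tos_prompt"
    return "show_app"
```

A user who accepted version 2 gets the prompt again once the version is bumped to 3, and reaches the app only after `accept_tos` is called.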
You mean your terms don't just say "these terms may change at any time and your continued use of this site implies acceptance??"

/s

> continued use of this … implies acceptance

One of the biggest crimes in the tech world

That's only in the summary; farther down it says:

> After 30 days, the data is deleted automatically, except in the rare cases where it's part of a safety investigation or we're legally required to keep it.

Where are you seeing that updated version?
How were they not already auditing access to customer data?
They were not keeping it beyond the timeframe necessary for the model to process it, so there wasn't access there to audit.
The “all human access” is doing work also. Most access will likely be from AI agents.
"Even if they usually won't" is generous. I think they usually will, that's the point.
Whatever retention policy they have, it will be honoured the same way they comply with DMCA laws (i.e. if we’ve got it, it’s ours to train/use).
I cannot help wondering if the 'we won't train on your data' applies across the fence over there in Pentagon land, where the classified contracts be. Yeah, of course they are not connected. Or...

Present user-llm activity is a goldmine of intel the agencies literally spent lives and billions on getting hardly close to, yet they elect to just let this one slip by..

Maybe. Really, I don't dispute it.

But why? It's what, or precisely what, they always dreamed of.

I don't know why you'd read literally the last 25 years of leaks from mass surveillance programs and think for one moment that they've just, gosh, overlooked the opportunities.
> We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases

This reads to me as they can use any model that is not a "Claude model", and as for human access to that other model there can be different less restrictive privacy protections. In other words, that anything goes.

Yes. Words don't mean much these days. Taking corporate doublespeak at face value seems very courageous to me.
We've already gone through ECHELON, USA PATRIOT, TIA, PRISM, etc. Either learn from the pattern and plan accordingly, or be one of the credulous rubes caught off guard in the next wave of leaks.
Half of my customers will drop them right away, and the other half, after I explain to them what this means.
It's only for this model, not the one you're already using. And they're not training on the data. It's supposedly to detect abuse etc (such as someone retrying repeatedly with different variations to get around their protections)
> they're not training on the data

How would you know that? You can only know what they say they will do with the data.

Sure, some trust is required that they aren't breaking their own terms of service (which legally enforces that they won't train on your data), but the same is true of every company/service you deal with (AWS, Google, your CRM etc). Their entire business model depends on enterprises trusting them.
>some trust is required that they aren't breaking their own terms of service

Which companies do all the time...

But if you're going to take your distrust that far then the issue is that they have your data at all, not that they are telling you that they will retain it for 30 days.
Civilization is built on trust, otherwise you’ll need to rebuild all of it yourself. This isn’t very different.
Civilization is also built on cheating and taking advantage of naive trust. This isn’t very different.
If that were dominantly true nothing would function at all. You trust and rely on thousands of people and services every day.

As others have said, if you're this skeptical I don't see why you would have been using them before this retention increase.

If that is the question, those customers won't be using any LLM or cloud services in the first place anyway. If you are a journalist investigating nations, stay away from everything.
If you don't trust them, then no policy is enough. Technically everything you send to the model could be stored by them. Personally I do worry about that especially as an average consumer not an enterprise, no one is looking out for us and we don't get any guarantees. But enterprises will get the right treatment because they would find out and sue Anthropic if they lied.
>If you don't trust them, then no policy is enough.

No policy is enough, period. There should be technical and legal solutions to it.

There should be legal ramifications if they don't do what they say, but the practical solution is "don't use it".
I mean, if we're assuming they're just willing to lie and violate their own TOS then how could you ever be comfortable with them regardless of this 30 day period (or really any online service)? This seems like a bit of a silly take.
Why wouldn't they train on the data, I guess, if the goal is to prepare a better supervisor mechanism?
Yet
If the data were valuable, it seems like they would offer a lower-cost tier where customers would allow training on their data.
Maybe, but to do so they'd need to offer new terms of service and we'd have to accept. I believe they'd lose a lot of their core business market if they did so.
That's ... Tuesday in techbro land
You think companies would be ok with terms of service that allow potentially distributing their data and internal knowledge? It's an interesting question, though they tend to be more conservative than consumers
And 99% of their other customers won't care either way.
You must have very unrepresentative customers. What will they use?
No AI at all, like 5/6 of my customers
After the AI companies just blatantly lied that they weren't hoovering up people's IP and art for training, I assume they collect any and all data they can get their hands on for training. When it comes to the big AI players feeding their future models, I 100% just assume that they suck up any data we send them. Am I cynical?
It’s even worse than that. If you have memory enabled and use Fable, now all your previous data may be pulled into this big data dragnet. How can Anthropic possibly think this is okay?
Because they think people are okay with it, or at the very least, don't care, or don't care to know.

Which, judging by how much people are using Fable, appears to be true.

An interesting way to rate limit access while also getting some data to analyze. They will lift this restriction later when they have more capacity
Well, it's okay for them.
Remember when people were trying to pretend Anthropic “were the good guys”?
They were never the good guys; they explicitly stated that they were fine with Claude being used to murder and spy on everyone in the world except the USA.
So much for that Effective Altruism Amodei and SBF are part of
>How can Anthropic possibly think this is okay?

If it made a profit and people didn't give them trouble for it, Anthropic would sell a placebo as a cancer cure. What they think "is okay" is what they can get away with.

On a personal level, everything Anthropic has done has resulted in a dump truck of money being emptied onto the driveways of its employees. Pavlovian conditioning is incredibly strong when reinforced with generational wealth.
However, don't all these AI companies retain your non-training data indefinitely? Did I miss something where they suddenly gave you the option to opt out of retaining your non-training data? I thought that was a big money grab of theirs.
Does anyone know about the jailbreaks and attacks they are referring to? These are done through model queries?
One of the major attack vectors is distillation, where millions of questions are auto-generated and coordinated to produce training data for new LLMs. Anthropic alleges MiniMax, DeepSeek and Kimi were trained this way. DeepSeek 4 compares favorably to Opus, so they're probably trying to prevent DeepSeek 5 from being a bootleg Mythos. https://www.anthropic.com/news/detecting-and-preventing-dist...
It takes a lot of audacity to train on all the data you can without any license, attribution, etc and then act like you can own the outputs of the model so that someone else doesn't make a model from your data without a license. I've lost a lot of respect for Anthropic in the last 24 hours.
Everyone knows it's bullshit but because these companies are being valued at a trillion dollars a piece, it's hard to say that if you were in their shoes you'd do any differently.
This may surprise the cohort on Hacker News, but there are large numbers of people on this planet who value things beyond money, like ethics or having principles. Excusing absolutely repugnant behavior because of money to be made is so deeply antihuman, but then again most people working at LLM companies are deeply antihuman to start with.
> but then again most people working at LLM companies are deeply antihuman to start with.

I agreed with you up til this point, but this isn’t true and isn’t called for, and doesn’t strengthen your otherwise good point, in fact it weakens your point to make statements like that. Most people who work at LLM companies, like most people who work at most companies, are making a living and have the same ethics and principles as anyone else. I don’t know where you work or live, but don’t forget the exact same logic and exact same hyperbole is being used to make the same claim about people in tech, and the same claim about Americans and Europeans.

Really? They can't get any other tech jobs? They have to work for AI companies? Give me a break
No, it's totally called for. This is technology that is literally ruining, destroying, and killing lives. Especially in regards to how US companies are operating with this tech. It's a valid claim; "just following orders" has never been a valid excuse.

These people just care about chasing the bag rather than doing right by their fellow humans. In their mind clearly some humans are more equal than others.

edit: to reiterate, the people choosing to work at these companies care more about becoming millionaires and chasing generational wealth than about questioning whether the machine they are building may be producing terrible outcomes. They can work at any company on this planet easily. Stop running cover for FAANG workers who have always shown disdain for their fellow humans; they choose to work at the misery death machines because they simply do not care about the destruction they have wrought upon the world.

You can say that but Anthropic are literally the "good guys" that were disgusted by Altman and co, yet even they seem to have sold off their morality. Absolute money corrupts absolutely.
They are not the good guys and never were. They were fine with Claude being used to plan the murder of people and to spy on people as long as they were outside the USA. That is not something "good guys" do, that's what sellouts do. That goes for everyone working at these companies, who were paid small fortunes to ignore any feelings they might have. Hopefully we get a modern version of the Nuremberg trials when this madness in the USA is over, and we the people will then judge everyone involved.
I absolutely would do differently. Their behavior in public is gross.
Sure, everyone can be on their high horse from the comfort of their arm chair.
Distillation is not an "attack", despite Anthropic themselves coining the self-serving phrase "distillation attack". And as others have noted, it is precisely identical to the sort of "attack" on published works which Anthropic themselves used to train their models.
Agreed. Distillation is as much of an attack as scraping is an attack ;)
> Anthropic alleges Minimax... were trained this way

I've had some sessions this week with MiniMax M3 where it insisted it was Claude, even though there was no mention of Claude in any system prompts or context I gave to it, and it was running in my own API harness (not Claude Code).

Though I also wouldn't be surprised if "I am claude" is just the new "I am Mozilla/5.0 AppleWebKit KHTML Like-Gecko Chrome Safari".

It's a fairly common name to begin with.
Why would you trust anything they say at face value?

When they literally just showed you they are being deceptive by sneaking in the weasel word “almost”?

Firstly, none of this post is the contract people are signing. So it's merely a summary.

Secondly, like all contracts I'm sure there will be exceptions for holding data longer than 30 days with reasonable cause, eg a legal hold.

This reply does not make sense.

I did not claim it was the literal contract people would sign?

I'm asking for information to understand. What about that says I trust what they say at face value?
After 30 days and before the heat-death of the universe?
I mean deleting the Universe also deletes the Data so that counts.
That's a fair point.
Even worse, when you git push something, Microsoft gets all your code!
Yes, that is your intended purpose of “git push”, it’s to save. And only if you use GitHub.

A better analogy here is probably “every time you use VS Code, the files you edit get sent to Microsoft”.

Some legitimate concerns:

• You have trade secrets. Previously, you could use services like Bedrock, etc., with signed contracts and significant reputations. Your contract is between AWS and you, and stays within your AWS security boundary.

• Security breaches. Remember when Anthropic accidentally published the source tree of Claude Code? Or Meta’s recent AI recovery bot that didn’t check if the supplied recovery email was actually the email of the Instagram account? The best way to reduce your exposure is to minimise storage.

• Weaponised T&S. For example, what if Anthropic decided to build a classifier for “usage in unsupported regions” that’s super overbearing (as we see with Fable) and vacuums up all context/input/output if there’s Mandarin? Contractually they could now retain it forever, not just 30 days, for ‘trust and safety purposes’, and perhaps have AI scan for any new or interesting ML techniques at scale, for Anthropic’s own use? They say they just can’t train Claude models on the data.

All analogies are bad.
The only one making a very bad analogy in this thread was you. You got a response with a counter-analogy, just to play on your own field, and then a deep answer with real scenarios. You should respond to those if you want to continue the discussion.
Using language to represent reality is lossy
All models are wrong, but some are useful
Only if you push it to GitHub.
That is why, for the last five years, I have been checking in code of the most atrocious quality with them. So far... it's working...
Thank you for your service.
Uhm, no?

I don't have a SINGLE project on GitHub.

One of my clients has their project on GitHub.

Every other client I have ever worked with or for ran, and still runs, their own git forge.