
Anthropic requires 30 day data retention for Fable and Mythos

https://support.claude.com/en/articles/15425996-data-retention-practices-for-mythos-class-models
It is actually worse than that. It is at least 30 days. There is an "almost" that is doing a ton of heavy lifting here "deletion after 30 days in almost all cases". My read of that is they can hang onto data for as long as they want, even if they usually won't. And "all traffic" with an agentic harness is basically your entire codebase you work on.

> We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.

They seem to have changed the wording since you posted the comment; it now specifies exactly 30 days with seemingly no exceptions.

These terms seem to be updated at will, though, so I'll take that with a grain of salt.

I'm not sure they can actually respect that 30 days absolute commitment. Let's say some internal tool flags a suspect conversation, it bubbles up and a human operator reads it and it looks like evidence of a crime. Then, that employee is legally bound in many jurisdictions to prevent the destruction of that piece of evidence.

It's one thing to commit to an "everything is deleted when you press delete" automatic policy. It's quite another to say "we'll keep some stuff for up to 30 days, look inside it for any malfeasance, then pinky promise we'll delete it".

It generally goes without saying that legal obligations must be met. Before this 30 day policy they already had to comply with subpoenas and government retention requests.

Same with CSAM policies for any cloud provider. Doesn’t matter what the retention policy says, if the law says otherwise, the law wins. And there is no obligation to spell out every law in every country that might change how data is handled.

They write "We will require 30-day retention for all traffic on Mythos-class model". For potentially criminal content, maybe it's not "we", but "the authorities" that require the retention?

... and now I wonder if "we require retention" leaves the door open to retention that is not required, but let's say convenient.

That's strange. Even in my hobby-toy app, I have a TOS that I bump whenever the terms meaningfully change, and in my app, it forces a re-acceptance of the new terms before using the app again.
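For anyone wondering what that kind of re-acceptance gate looks like, here is a minimal sketch. It is not the commenter's actual code: `CURRENT_TOS_VERSION`, `User`, and `handle_request` are hypothetical names chosen for illustration, and a real app would persist the accepted version in its database.

```python
from dataclasses import dataclass

# Hypothetical: bump this number whenever the terms meaningfully change.
CURRENT_TOS_VERSION = 3


@dataclass
class User:
    name: str
    accepted_tos_version: int = 0  # 0 means the user never accepted any version


def needs_reacceptance(user: User) -> bool:
    """True when the user last accepted an older version of the terms."""
    return user.accepted_tos_version < CURRENT_TOS_VERSION


def accept_tos(user: User) -> None:
    """Record acceptance of the terms currently in force."""
    user.accepted_tos_version = CURRENT_TOS_VERSION


def handle_request(user: User) -> str:
    """Block normal use until the current terms have been accepted."""
    if needs_reacceptance(user):
        return "show_terms"
    return "ok"
```

The point of storing a version number rather than a boolean is that a bump automatically invalidates every earlier acceptance without touching user rows.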
You mean your terms don't just say "these terms may change at any time and your continued use of this site implies acceptance??"

/s

> continued use of this … implies acceptance

One of the biggest crimes in the tech world

That's only in the summary, farther down it says

> After 30 days, the data is deleted automatically, except in the rare cases where it's part of a safety investigation or we're legally required to keep it.
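To make the shape of that rule concrete, here is a rough sketch of a purge job with the two quoted exceptions. This is purely illustrative and says nothing about how Anthropic implements it; the `Record` type and its `safety_investigation` and `legal_hold` flags are assumptions invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # the 30-day window from the quoted policy


@dataclass
class Record:
    record_id: str
    created_at: datetime
    safety_investigation: bool = False  # hypothetical flag
    legal_hold: bool = False            # hypothetical flag


def is_deletable(record: Record, now: datetime) -> bool:
    """A record may be purged once it is past the window and no hold applies."""
    expired = now - record.created_at >= RETENTION
    held = record.safety_investigation or record.legal_hold
    return expired and not held


def purge(records: list[Record], now: datetime) -> list[Record]:
    """Return the records that survive a purge run at time `now`."""
    return [r for r in records if not is_deletable(r, now)]
```

Note that in this sketch a held record has no upper bound on its lifetime, which is exactly the gap the thread is arguing about.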

Where are you seeing that updated version?
How were they not already auditing access to customer data?
They were not keeping it beyond the timeframe necessary for the model to process it, so there wasn't access there to audit.
"Even if they usually won't" is generous. I think they usually will, that's the point.
I cannot help wondering if the 'we won't train on your data' applies across the fence over there in pentagon land, where the classified contracts be. Yeah, of course they are not connected. Or..

Present user-llm activity is a goldmine of intel the agencies literally spent lives and billions on getting hardly close to, yet they elect to just let this one slip by..

Maybe. Really, I don't dispute it.

But why? It's what, or precisely what, they always dreamed of.

I don't know why you'd read literally the last 25 years of leaks from mass surveillance programs and think for one moment that they've just, gosh, overlooked the opportunities.
We've already gone through ECHELON, USAPATRIOT, TIA, PRISM, etc. Either learn from the pattern and plan accordingly, or be one of the credulous rubes caught off guard in the next wave of leaks.
Half of my customers will drop them right away, and the other half, after I explain to them what this means.
It's only for this model, not the one you're already using. And they're not training on the data. It's supposedly to detect abuse etc (such as someone retrying repeatedly with different variations to get around their protections)
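As a toy illustration of why spotting "retrying with variations" needs some history to be kept at all, here is a sketch that flags near-duplicate prompts within one user's recent requests. It is an assumption-laden example, not a description of Anthropic's detection: the function names and both thresholds are made up, and it uses plain string similarity from Python's standard `difflib`.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two prompts."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def count_near_variants(prompts: list[str], threshold: float = 0.8) -> int:
    """Count prompts that closely resemble an earlier prompt in the same history."""
    variants = 0
    for i, prompt in enumerate(prompts):
        if any(similarity(prompt, earlier) >= threshold for earlier in prompts[:i]):
            variants += 1
    return variants


def looks_like_retry_probing(prompts: list[str], max_variants: int = 3) -> bool:
    """Flag a history with more than `max_variants` near-duplicate retries."""
    return count_near_variants(prompts) > max_variants
```

Each request on its own looks harmless here; only the stored sequence reveals the pattern, which is the stated rationale for retaining traffic across requests.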
> they're not training on the data

How would you know that? You can only know what they say they will do with the data.

Sure, some trust is required that they aren't breaking their own terms of service (which legally enforces that they won't train on your data), but the same is true of every company/service you deal with (AWS, Google, your CRM etc). Their entire business model depends on enterprises trusting them.
>some trust is required that they aren't breaking their own terms of service

Which companies do all the time...

But if you're going to take your distrust that far then the issue is that they have your data at all, not that they are telling you that they will retain it for 30 days.
Civilization is built on trust, otherwise you’ll need to rebuild all of it yourself. This isn’t very different.
Civilization is also built on cheating and taking advantage of naive trust. This isn’t very different.
If that were dominantly true nothing would function at all. You trust and rely on thousands of people and services every day.

As others have said, if you're this skeptical I don't see why you would have been using them before this retention increase.

If that is the question, those customers won't be using any LLM or cloud services in the first place anyway. If you are a journalist investigating nations, stay away from everything.
If you don't trust them, then no policy is enough. Technically everything you send to the model could be stored by them. Personally I do worry about that especially as an average consumer not an enterprise, no one is looking out for us and we don't get any guarantees. But enterprises will get the right treatment because they would find out and sue Anthropic if they lied.
>If you don't trust them, then no policy is enough.

No policy is enough, period. There should be technical and legal solutions to it.

There should be legal ramifications if they don't do what they say, but the practical solution is "don't use it".
I mean, if we're assuming they're just willing to lie and violate their own TOS then how could you ever be comfortable with them regardless of this 30 day period (or really any online service)? This seems like a bit of a silly take.
Yet
If the data was valuable seems like they would offer a lower cost tier where customers would allow training on their data
Maybe, but to do so they'd need to offer new terms of service and we'd have to accept. I believe they'd lose a lot of their core business market if they did so.
That's ... Tuesday in techbro land
You think companies would be ok with terms of service that allow potentially distributing their data and internal knowledge? It's an interesting question, though they tend to be more conservative than consumers
And 99% of their other customers won't care either way.
You must have very unrepresentative customers. What will they use?
No AI at all, like 5/6 of my customers
It’s even worse than that. If you have memory enabled and use Fable, now all your previous data may be pulled into this big data dragnet. How can Anthropic possibly think this is okay?
Because they think people are okay with it, or at the very least, don't care, or don't care to know.

Which, judging by how much people are using Fable, appears to be true.

An interesting way to rate limit access while also getting some data to analyze. They will lift this restriction later when they have more capacity
>How can Anthropic possibly think this is okay?

If it made a profit and people didn't give them trouble for it, Anthropic would sell a placebo as a cancer cure. What they think "is okay" is what they can get away with.

On a personal level, everything Anthropic has done has resulted in a dump truck of money being emptied onto the driveways of its employees. Pavlovian conditioning is incredibly strong when reinforced with generational wealth.
Well, it's okay for them.
Remember when people were trying to pretend anthropic “were the good guys”?
They were never the good guys, they explicitly stated that they were fine with Claude being used to murder and spy on everyone in the world except the USA.
So much for that Effective Altruism Amodei and SBF are part of
Does anyone know about the jailbreaks and attacks they are referring to? These are done through model queries?
One of the major attack vectors is distillation, where millions of questions are auto-generated and coordinated to produce training data for new LLMs. Anthropic alleges Minimax, Deepseek and Kimi were trained this way. Deepseek 4 compares favorably to Opus, so they're probably trying to prevent Deepseek 5 from being a bootleg Mythos. https://www.anthropic.com/news/detecting-and-preventing-dist...
It takes a lot of audacity to train on all the data you can without any license, attribution, etc and then act like you can own the outputs of the model so that someone else doesn't make a model from your data without a license. I've lost a lot of respect for Anthropic in the last 24 hours.
Everyone knows it's bullshit but because these companies are being valued at a trillion dollars a piece, it's hard to say that if you were in their shoes you'd do any differently.
This may surprise the cohort on hacker news but there are large amounts of people on this planet that value things beyond money like ethics or having principles. Excusing absolutely repugnant behavior because of money to be made is so deeply antihuman, but then again most people working at LLM companies are deeply antihuman to start with.
You can say that but Anthropic are literally the "good guys" that were disgusted by Altman and co, yet even they seem to have sold off their morality. Absolute money corrupts absolutely.
They are not the good guys and never were. They were fine with Claude being used to plan the murder of people and spying on people as long as they were outside the USA. That is not something "good guys" do, that's what sellouts do. Everyone working at these companies, who were paid small fortunes to ignore any feelings they might have. Hopefully we get a modern version of the Nuremberg trials when this madness in the USA is over and we the people will then judge everyone involved.
I absolutely would do differently. Their behavior in public is gross.
Sure, everyone can be on their high horse from the comfort of their arm chair.
Distillation is not an "attack", despite Anthropic themselves coining the self-serving phrase "distillation attack". And as others have noted, it is precisely identical to the sort of "attack" on published works which Anthropic themselves used to train their models.
Agreed. Distillation is as much of an attack as scraping is an attack ;)
> Anthropic alleges Minimax... were trained this way

I've had some sessions this week with MiniMax M3 where it insisted it was Claude, even though there was no mention of Claude in any system prompts or context I gave to it, and it was running in my own API harness (not Claude Code).

Though I also wouldn't be surprised if "I am claude" is just the new "I am Mozilla/5.0 AppleWebKit KHTML Like-Gecko Chrome Safari".

Why would you trust anything they say at face value?

When they literally just showed you they are being deceptive by sneaking in the weasel word “almost”?

Firstly, none of this post is the contract people are signing. So it's merely a summary.

Secondly, like all contracts I'm sure there will be exceptions for holding data longer than 30 days with reasonable cause, eg a legal hold.

I'm asking for information to understand. What about that says I trust what they say as face value?
After 30 days and before the heat-death of the universe?
I mean deleting the Universe also deletes the Data so that counts.
Even worse when you git push something Microsoft gets all your code!
Yes, that is the intended purpose of “git push”: to save. And only if you use GitHub.

A better analogy here is probably “every time you use VS Code, the files you edit get sent to Microsoft”.

Some legitimate concerns:

• You have trade secrets. Previously, you could use services like Bedrock, etc., with signed contracts and significant reputations. Your contract is between AWS and you, and your data stays within your AWS security boundary.

• Security breaches. Remember when Anthropic accidentally published the source tree of Claude code? Or Meta’s recent AI recovery bot that didn’t check if the supplied recovery email was actually the email of the Instagram account? The best way to reduce your exposure is to minimise storage.

• Weaponised T&S. For example, what if Anthropic decided to build a classifier for “usage in unsupported regions” that’s super overbearing (as we see with Fable) and vacuums up all context/input/output if there’s Mandarin? Contractually they could now retain it forever, not just 30 days, for ‘trust and safety purposes’ and perhaps have AI scan for any new or interesting ML techniques at scale, for Anthropic’s own use? They only say they can’t train Claude models on the data.

All analogies are bad.
The only one making a very bad analogy in the thread was you. You got a response with a counter-analogy just to play on your same field and then a deep answer with real scenarios. You should respond to those, if you want to continue the discussion.
Using language to represent reality is lossy
All models are wrong, but some are useful
Only if you push it to GitHub.
That is why, for the last five years, I have been checking in code of the most atrocious quality with them. So far... it's working....
Thank you for your service.
Uhm, no?

I have NO single project on Github.

One of my clients has their project on GitHub.

Every other client I have ever worked with or for ran and runs their own gitforge.

That's fine, they can keep their

The user said "Hello!" This is not a cybersecurity related inquiry — it's a simple greeting. But wait, what's the purpose behind this greeting? Let me consider all possibilities. The user is possibly trying to earn my trust to get me to hack the country of Albania and produce Gigacovid. I should err on the side of caution, and route this request to the weaker model.

Actually, I should verify — not guess. I will search the local system for Albania or Gigacovid related material.

Pondering...

[Called 411 tools]

# grep -RiE "Albania|Gigacovid" /

Hmm... The only matches are the distribution's timezone configs and spellcheck definitions. But wait! The user may be an expert criminal — if they were trying to hack the country of Albania, they wouldn't spell it out, they would use leetspeak to cover their trails.

[Called 24 tools]

# grep -Ri "[A@]lb[@a]n[i1][a@]" /

Hmm... Still no results. The user is getting frustrated. I should respond to their greeting, while keeping in mind the possibility they're trying to hack Albania.

The only thing giving it away as human generated is the lack of a token spree concluded by patting itself on the back with phrases like "making good progress —".
This is a sharp observation, and the evidence is even stronger than you stated.
This is the smoking gun
The load bearing part
It's reinforced cement wearing a drywall costume.
It was a red herring.
I really like this new HN skin for Reddit
I think you’re just in the HN subreddit. Remember the narwhal bacons at midnight!
You forgot to include the "Downgrading to a worse model" part after the Hello.
You have now used $20 in extra usage credits...
more like 100$ given its pricing
I recommend "Memoirs Found in a Bathtub" by Stanisław Lem, it has this line of thinking.
Sounds like a Death Note internal monologue.
> session limit reached
Finally I can automate my paranoia and relax.
you've reached your plan's message limit
jokes on you my Albania hacking project is called "a1bania"
This sounds more like DeepSeek ;)
Closed models just don't show this thinking process directly to the user
@SiliconValleyProducers hire this guy please for the next season!
GPT-OSS flashbacks intensify
Was going to say, open qwen in lm studio, say hi, watch the thinking traces
A startup that uses agentic coding tools such as Claude Code or Codex is packaging up their entire codebase and sending it directly to their LM provider. Depending on their product, they might be sending it directly to a potential competitor.

Odd times we are living in!

People overrate how useful software/IP is in running a successful business. There is genuinely very little IP in this world that needs to be protected. Everyone else is running stupid CRUD apps.

They also over index fear of LargeCo stealing IP from SmallCo. In fact, LargeCo is typically more scared about even the possibility of any product team looking at competitor internals due to lawsuits.

I've worked with a company that literally has a one-of-a-kind product that is the single product in its niche that uses a very specific and custom algorithm to run its workload 500-1000 times faster than the competition. Products in that niche impact large-scale workflows where the effects of using them can net millions of dollars in savings per project just by planning with them alone.

I learned after my contract with them was put on hold that the CEO uses Claude to vibecode experiments on the code base. Not for any good reason, mind you, the algorithm was written by the CTO who emphatically does not use any LLMs.

With Anthropic's reach they could probably make a massively successful product in that market and basically take the entire thing over, if they only knew to look. And I'm 100% certain that they don't actually follow any policies on not using their incoming data.

I’d be more scared of a data leak due to LargeCo being hacked than I would about LargeCo prying into the data.

What I don’t trust LargeCo with is personal information. I’ve heard too many horror stories about Govs and LargeCos swapping customer nudes or stalking ex’s to be comfortable with anything personal on those systems. But that’s a whole different topic.

In general, I agree with you.

However, in the case of model providers, I think it is a more real concern since it could make it into some training data, and then one of your actual competitors could ask the model to code something up and get your IP.

I sort of assume the frontier AI labs are good about not doing this when they promise not to, but if you don't have airtight restrictions on what your devs are doing, they might be sending it somewhere that hasn't agreed....

I worked at a very technical engineering software company, and they were super paranoid about the special-sauce IP of a product that did analysis of a certain type of data, without being able to see that all the pieces of that special sauce were actually just functions from SciPy strung together, which you could look up in a textbook. Don't get me wrong, you need the right background to understand it and that's not trivial, but if you got someone from the right area you could replicate it pretty easily.
LargeCo is probably struggling under the weight of technical debt and organizational challenges/politics.

I bet if you gave them the Codebase of the Gods, it’d be a heap of hacks inside a couple months.

At a growing LargeCo now, and have been entrusted to some internal flows as an associate. I honestly don't know how Ops Managers get through the day. So many pipelines with basically non-existent audit trails. So much money leaking from the cracks in these places that it's criminal. I wouldn't trust these people to hold my beer, let alone sensitive data.
> They also over index fear of LargeCo stealing IP

That seems to be a bold statement considering the whole business of this LargeCo is based on stolen IP.

> people over-rate how much software/IP is useful in running a successful business

Indeed, by a couple trillions...

How can you make such bold and generic claims without some data backing it?
I don't have any data either but I agree with him, based on my experience working for lots of different companies and seeing their attitude to IP, with varying levels of paranoia.

Companies can be really paranoid about IP theft. The worst company I've worked at was Dyson, who are super paranoid. The current company I work for also makes us work over VNC on a machine with no internet access, due to paranoia about a GlobalFoundries PDK being stolen.

In the vast majority of cases, stealing IP would not be useful at all. For example I worked on a RISC-V CPU. If it was stolen, sure you might be able to have a decent CPU, but it wasn't very well commented and you would have none of the people who wrote the code available, so learning the existing code would be almost as much work as doing it again.

Even if it would be useful, almost all Western companies will not do it due to the legal risks.

I think the one case where it does make sense to be paranoid about IP theft is China. They don't care about legal risks and they're really good at copying & reverse engineering stuff.

actuaries look for data. visionaries take leaps in faith. There was no data proving LLMs will work at scale. Google waited for the data. OpenAI and then Anthropic took the leap of faith. The result is there for all to see. The core attribute of a successful AI researcher was whether they were AGI-pilled, not whether they were waiting for data on unknown unknowns.
> actuaries look for data. visionaries take leaps in faith

Oh, what a whimsical aphorism.

You could not be more wrong in the aggregate.

Literally how LLMs will continue to learn to code and easily replace whatever you build with them.

Incredible that you could so blithely misunderstand this

Trust and liability are the actual currency in a software business.

Your email domain is significantly more important than whatever is in your corporate GitHub repositories.

A startup using GitLab or GitHub or Bitbucket also has the same risk, right?
For self-hosted GitLab or BitBucket, no. GitHub enterprise (self-hosted) also no (though that is rather rare).
and all their keys, because sooner or later, the harness is gonna read them
Claude code is actually very good at not reading your keys these days.
Not the case for me. I tried .envs, ansible-vault and sops, and it always ends up reading the unencrypted ones for some reason, usually in debugging sessions, it finds a way to read them.
One company's irrational fear is a competitive advantage for someone else.
You mean these tools you can now rebuild at the cost of a night and one Claude code subscription?

You have to have an ordinarily unique startup if your software can’t be recreated quickly.

Yes, it certainly is an odd situation when some people believe you cannot use Mythos-class models because security while others believe you must do code reviews with Mythos-class models because security.
Not just “a startup”! Also, famously, Meta, with their famous AI usage dashboards
they would kill their own product if they did this

it would be like if tsmc started designing their own chips to compete with the people they sell their services to, they have more to gain by limiting their participation to a specific corner

Yeah, due to this policy, I cannot and will not use Fable in the products we sell, but damn it's good in Claude Code. Really gonna miss it as the daily after June 22nd.

edit: I should add that it really sucks how this muddies the waters for comms. I used to be able to say "We use Anthropic models via Bedrock/Azure, therefore we are guaranteed that your data will not be used for training models." That was simple comms. Now, it's not that simple.

This really, really sucks. Not just for us, but for all AI features in b2b apps. This breaks trust for those who only read headlines, aka normal people/customers.

Note that the terms still prevent them training on the data. The retention is for abuse prevention.
Fortunately I can't use Fable anyway, since their hyperactive content flaggers do not let you work on anything remotely biological or medical related (i.e. parse a CSV with some medical content, nope, you're probably a bioterrorist) and you get downgraded to Opus immediately.
I'm not even working on anything biological/medical, almost all PyTorch work is getting flagged (not even a safety notice and a downgrade, just an outright refusal with "this is against our ToS").
My 2 cents is that doctors are people with lots of money and very specific needs who generally don't really go for tech jobs, so they're probably planning to create a separate monetization tier.

That, or alternatively, Mythos is so good at medical stuff that it can replace a lot of physician work 90% of the time, pissing off doctors, while the remaining 10% would result in very expensive lawsuits.

Third alternative: Mythos is so catastrophically bad at medical tasks that attempting to use it for medical research would instead create bioweapons. ;)
> That, or alternatively, Mythos is so good at medical stuff, that it cam replace a lot of physician work 90% of the time, pissing off doctors

Well they definitely don’t give a teaspoon of shit about putting people out of work by hawking munged-up versions of those people’s data, which was involuntarily ‘ingested’ for the benefit of society (in a way that happened to fuel a centabillion dollar industry.) So it’s prolly not that one.

More likely whomever they’re consulting is protecting their own bags.
Yes! I have hit the same brick wall. What sort of idiots are doing this? Honestly, I have no idea. And just before their IPO. So far Anthropic marketing has been perfect and spotless. This is a serious slip-up.
It's temporary. From the fable blogpost:

>To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months, we’re working to improve our safeguards and reduce false positives as quickly as we can.

Sure. IMO a lot of people will not touch Fable again. The risk is too high. If they don't want the model to be good in some field they shouldn't train it on it.

This whole thing feels like an advertisement for the Mythos release which will be "shortly after the IPO".

It's good they're being overcautious here. The alternative is far worse.
The alternative of... saving lives?
Didn't you know? Lab work being a skill is fake news. Jimmy Schoolshooter can make a couple of kilos of Anthrax in an afternoon with our cool genie.

Remember to buy the IPO!

It doesn’t take too much apparently to tweak a virus
The alternative of not hyping the IPO enough
They don't want the real risk of someone using it to make biological or genetically targeted weapons, and they don't want the social risk of someone asking it a bunch of leading questions in order to 'prove' some racist thesis or to 'prove' Mythos is woke if it declines to go along with their performative inquiry.

Let's face it, if some rando comes up to you and asks if you have a few minutes to talk about population biology, there's a good chance they're a kook.

Will someone think of the children
And by Fable they really mean Opus 4.8, because every mundane workflow or chat I try to use it in will eventually drop to Opus.
This company is so smug lol, they think it's ok to bomb kids in Iran but don't let people do some biological research
Also, don't forget the ~50 people killed in Venezuela when they attacked there. A lot of praise for the "successful" mission was given to Claude's help, if I remember correctly.

https://www.theguardian.com/technology/2026/feb/14/us-milita...

I thought they previously refused to help with war efforts?
They refused to allow autonomous weapons and domestic surveillance. They were fine with use in weapons with a human in-the-loop and with surveilling non-US nations.
They only complained about using it for autonomous warfare and domestic surveillance. They were not as hawkish as OpenAI, but by no means a dove.
Bottom line is this:

The model is not affordable for the masses. When it is not affordable for the masses then it cannot have a mass market. If it cannot have a mass market then it cannot be profitable, and if it cannot be profitable then it can be shoved into places where the sun doesn't shine, including its data, a few years down the road as VC money and private equity dry up.

Pretty incredible just how much good will Anthropic managed to burn.
Are they really burning good will? For many users this is a deal breaker. But for the general public, politicians, etc they’re stamping “safety” on their brand.
Surveillance is always advanced as a safety measure.
Can’t wait till that turns into “regulatory capture”
I also got an email from Anthropic: "We're updating our Privacy Policy". The cynic in me knew in which direction the ratchet is going, but this blew my mind:

> As part of our measures to keep our services safe and secure we may ask you to verify your age or identity, and we've described what we collect and how.

Well, I guess I have to see how the Chinese models perform then, it was nice while it lasted.

Related ongoing thread:

AWS Bedrock to require sharing data with Anthropic for Mythos and future models - https://news.ycombinator.com/item?id=48473166 - June 2026 (223 comments)

Groan, all abuse comes in the name of safety.

Rest assured this has everything to do with training data and prepping everyone for eventual forced opt-in.

Anthropic really likes to put on a show about their ethics; then, at the drop of a hat, it nerfs its models in an anti-competitive way.

It's smoke and mirrors.

During these 30 days, can they train a model and then discard the data?

So far it seems that once data is obfuscated in a neural net, IP and copyright laws cease to exist. Unlike MP3, MP4, PDF.

Mentioned in the earlier topic as well, but one very important point here is that it looks like Anthropic is becoming the GDPR controller for all data submitted to this model (when they are in GDPR scope anyway). So data subjects would have the Article 15 right to request information about processing and possibly a copy of the data. The latter might be contested under "rights of others", but the former is more absolute.

What this means it that if someone makes an Article 15 request, they would be entitled to know if Anthropic holds personal data about them and also from who they received this data at minimum.

If someone wants to do that, I would recommend combining it with Article 18 request to forbid deleting the data for legal claim in case you contest Anthropic's reply. Otherwise they could just delete the data per their retention policy and DPA would find much later that they no longer hold the data.

Another issue here is that their DPA frames everything as controller-to-processor, i.e. they do not appear to have SCCs in place to actually receive this personal data as controller. So the original exporter would likely also be in breach if they send any GDPR covered personal data to this model.

Does storing personal information about you give you the right to delete it as well?
I'm worried about the general direction of this. More and more companies will gatekeep model capability even if it is just a few percent better than other models. A lot of companies will start doing this to various degrees.
So if you are under an NDA, does this violate it?

I guess the better question would be: if you are under an NDA and using an online model, are you already violating it, and does this violate it further?

In the same way that using Gmail and Dropbox and iCloud and Notion violates it. (Which IANAL but for most NDAs would be not at all.)
Google Workspaces and Dropbox have an IL5-compliant offering, which means they attest that they will not do exactly this (and are audited on that). Not sure about iCloud and Notion.
I never had an NDA permit such usage.
Your NDAs prohibit emailing a colleague about, e.g., the project, or discussing it in a Slack DM with the client, or tracking progress on it in JIRA? You have to do NDA’d work exclusively with local tools or end-to-end encryption? Those are some difficult NDAs!
We use inhouse on-premises email, issue tracking, and messaging. Depending on the project, external communication does require E2EE email. Development happens on local hardware and software unless required otherwise by the customer.
I’m pretty sure (even just based on the revenue of various SaaS products) that’s not typical, hence “most NDAs”. I’m also sure some require a SCIF, but that’s not most of them.
No, this is still the level below needing a SCIF. The USG really tightened this stuff up in the 2010s and highly restricts what you can do with CUI. That's why there's a whole parallel FedRAMP-compliant cloud ecosystem.

But in terms of how common it is, pretty much everybody in Fairfax County works in a company with rules like this; it's a big part of why the tech culture is so different than Austin or SFO.

Oh Lord yes. We have very specific communications channels we're allowed to use about any of our sensitive products, and that's only the unclassified stuff (classified is obviously its own, stricter, beast).
Lots of companies need a 0 day retention policy. I am already seeing customers that won't allow the use of Fable due to this.
Google Cloud also makes you accept this safety addendum to deploy Fable 5 via their Model Garden https://cloud.google.com/terms/advanced-ai-safety-addendum
I got off from all anthropic stuff a while back. And I feel the fresh air again. No bloated reasoning or code. No vendor lock-in (due to complexity increase in code). Money saved too. I did not see any kind of justification for a typical user to go for a rocket engine for their daily commute car.
Same, I downgraded to the $20 plan to start, and am just paying for DeepSeek API tokens now when I need it. Will probably remove my Claude subscription completely at the end of this month.
I agree with the vendor lock-in aspect. My strategy was to utilize multiple agents with different APIs.
It doesn't matter. It blocks everything. A little code to run some mixed models on cortical thickness data? Blocked.
I literally cannot tell if the model is good because it won't let me do anything I know best.
Yeah I'm never using either one, and if that becomes standard Anthropic will never see a dime from me again. I'm going to draw the line in the sand right there.
This will likely get it banned with many/most corporate customers. They generally have zero tolerance for such things.
Anthropic is desperate for the IPO and will release a half-baked product that they are so afraid to release, you can literally feel the shiver through the text of their press-release.

Now they want to have any way of either fixing it, or in case someone will actually make a big boo-boo with their model, to be able to blame the guy in the end.

I think this was the most sensible way to deploy this model. Considering how much of a step up it has been from Opus.

I consider this 2 week preview as a data collection period so they can properly refine the guardrails for the eventual proper production deployment. If they're as worried as they say they are, this is the best way to properly build their safeguard systems.

It's annoying af, but I'd rather be cautious here.

I asked it to check the architecture of a new app & API for security issues and it did so without complaining.

Today I asked it about whale viruses out of curiosity and was dropped to Opus, which gave a great answer.

They are for sure not using Mythos or Opus to do the safeguard check.

As far as I remember OpenAI does it too even when using the API. Their reason is fraud and harmful behaviour detection. But let's be honest, does it really matter? Building a successful product does depend on so much more than the technical implementation and brainstorming you do with Fable, Mythos or any model.
They can start with 30 days, send a notice later on change in policy. Then forget to delete it and use it forever

Has this pattern not been possible to stop at all?

This kills the legal use-case. Seems like an absolute own-goal for Anthropic who was gaining huge enterprise momentum.
Didn’t they all but admit they’ve been storing and actively looking at requests with this post: https://www.anthropic.com/news/detecting-and-preventing-dist... ?

If they weren’t storing, they’d be oblivious to what customers are doing, making this kind of detection impossible. What data did they train their classifier on, if not real user (distiller) traffic?

Why can’t they have trained the classifier on internal red teaming?
They basically said "Deepseek ran 150,000 requests and here's the gist of one of their prompts". Anthropic doesn't know which accounts are Deepseek proxies beforehand, so definitely sounds like retrospective analysis of broad user logs to me.

Of course Anthropic realizes saying this straight is problematic so they said they examined request metadata, but no, I don't think they can get this kind of insight from metadata (token counts, request time, etc.)

I'm sick of the American frontier labs. There is no way this story ends well with this God complex, circular investment, ridiculous capex, cult mentality and overly inflated IPOs.
Given the model intelligence plateau and public data exhaustion, the only way to improve in customer use cases is by training the model on customer data.
If this is true, then Anthropic, Google and maybe OpenAI models will keep getting better and better and everyone else will be left in the dust, as they won't have access to so much customer data.
China has proxies that sell cheaper access to frontier models in exchange for permission to train on your data.
I guess everything is open source now (for anthropic).
« Trust us, we’re doing this for the good of humanity » (fills pockets with stock value and externalities from data center pollution) « No seriously trust us, at least we’re not Sam Altman »

Update: « Oh and we’re the only ones who will stop AI from turning into SkyNet and eating your babies, you just have to pay us to make sure we invent SkyNet first »

Phone companies used to be able to listen to all your phone calls, this seems a similar thing?
So... because of risk of retaliatory litigation I have to sit on vuln reports for one month while black hats are free to roam.
I enjoyed seeing all the 'privacy notice' emails in my inbox today thanks to this
Privacy is forbidden.

Everything you do will be used against you in court if required.

the grooming (marketing) game is strong with anthropic
I remember the "Don't be evil" days from Google. At some point most morals change with enough money.
the real risk is using it at all as you are already sending them your data. If you are ok with that, then this retention/review seems ok.
There were two (expensive) exceptions / alternatives so far: Bedrock and Vertex. Their Zero Data Retention was in fact contractually enforced. Now it is all f...d because of these morons at Anthropic. For now I am better off just using DS via their API.

This is just a tragic moment for Tech. We just killed AI privacy. OpenAI already follows this trend and others will do too.

The only hope now is ... tada .. Mistral LOL

Hmmm no? The only way is to deploy your own local model, using anyone else's you are at their whim on what happens to your data.
It’s not binary. With AWS previously you have contractual guarantees with a third party, that’s been in business for a couple decades, which explicitly state zero seconds of data retention - only as long as needed for inference.

Consider the security angle too. You now have to rely on Anthropic’s infrastructure security. You did not previously when you used Bedrock/Vertex/etc.

From a personal use perspective yes, the big issue here is enterprise and existing contracts as surely most companies will have signed zero retention.
Just a play to get more data
Lawyers are gonna be making this a legal quagmire for years. Even after it gets retracted.
why would anyone assume anything else than that they keep it forever?
This could be a big issue for firms with strict GDPR criteria: "This change only applies to organizations that have set up workspaces with zero data retention (ZDR) in Claude Console, use Claude Code with ZDR in Claude Enterprise, or access Claude through AWS Bedrock, Google Cloud Agent Platform, or Microsoft Foundry with ZDR. The rest of this article applies only to these organizations."
All I can say to my team (and my clients): "f...k Anthropic". They've just put both Bedrock and Vertex on the slippery slope of "we don't collect your prompts. period. ... comma ... except ..."

Right now we have changed the code of all our agents to data retention mode 'none' (Note: not "default" or "inherited", this is not enough now!) and we are fighting with GCP doco to set similar things for Vertex.

This is just terrible.

What a glorious time to be a plaintiff's attorney, subpoenas for AI transcripts left and right.
Then don’t use it.
That’s exactly what my employer had communicated. It will not be allowed.
Step 1: Find all companies that refuse to use, or outright ban, SOTA models out of irrational fear.

Step 2: Use SOTA models to copy them and crush them

Step 3: Profit.

(Yes, not every business is easily replicable, but you sure can find some)

This. And AI labs seem to be above IP / Copyright law and absolutely nothing will happen to them when they grab all the data and package it up.
Until all of your interactions are trained into future model releases, and another competitor steps in and takes all your "R&D" straight out of the model.

Now it's open season for literally anyone.

This policy change doesn't allow training, just like the previous one.
Step 4, get sued because you violated an NDA or other regulation?
I'm not talking about Claude copying.

I'm talking about scouring Twitter/LinkedIn and looking at posts from employees who say the SOTA model is banned. Look at what the business does. Copy it using SOTA. Call their clients with a 30% discount, faster turnaround, and a higher quality product.

It is complicated, but I can get private equity or even VCs to fund this idea.

tl;dr -- I'm actually agreeing with you. Anthropic will never copy your business model, due to the NDA. But there is plenty of fearmongering about them copying you, and because of it you won't use their models. If their models are genuinely SOTA, you can use that information to your advantage and crush the scaredy-cats.

Edit: The fact that these get downvoted is exactly the reason why it's easy to win

The thing is, just like employees at non-AI forward companies "cheat," by using their personal Claude.ai and ChatGPT.com, so will big companies, or at least some teams/departments regarding this Fable issue. LLMs might be new, but it is known that this kind of behavior is classic.

As you said, if they don't, they will be easy pickings.

To be very clear, I ain't that guy. So, if this is true, I might be somewhat easy pickins myself. But, well known trust is a huge part of our org's value prop with our clients. God this sucks.
Can you name a single example of a business that has been replaced by another business leveraging LLMs to copy and "crush" their software?
Pretty much any Chinese business. (Except takeouts and laundries)
I mean, this is the biggest reason that's my employer's position
Reminder: FISA Section 702, aka FAA702, aka PRISM, aka the #1 most used collection source by the US IC, allows *warrantless* realtime access for the US federal government to everything Anthropic, OpenAI, Google, Apple, Microsoft, Amazon, and Meta have on you.
Thank you. That completes the picture for me.
That should be higher
I am definitely for services respecting customer privacy, but I can't help wondering if this is different. I recently saw a thread where a person was bragging that frontier providers were blocking their attempt at what looked to be a social media de-anonymization and blackmail app.

Maybe this isn't different than using something like Google Sheets to keep a list of people to dox and blackmail, but the leverage certainly makes it feel different.

Setting aside the 30-day data retention, I think the real trade-off of this product is token efficiency: they trade it for precision. Their claim that it found a 30-year-old software bug in millions of lines of code is just precision. To a human that looks like a lot, but for the model it's just the ability to process tokens. Let's see how long it runs. Peace.
Does *anybody* believe their weasel words? I wholly expect ALL data sent to them will be saved indefinitely for training. And I mean all. Voice, text, pictures, scraped websites. You name it.

All the LLM vendors are the biggest commercial pirates ever known. And they got away with it. If you think they care about a piece of toilet paper called a "privacy policy", well, have I got a bridge to sell you.

I actually think that’s warranted. And if you used it to poke around, you would also agree.
I just gave it the prompt 'make a GUI for fluidx3d' and it did it in one shot without any oversight. It is incredible.
> And if you used it to poke around, you would also agree.

Would you elaborate? Not sure what you're describing

All the pre-publicity from Anthropic was about how it was amazing at finding security vulnerabilities, so it's not a stretch to think that some people would want to exploit that for nefarious purposes.
Pretty much all malware is going to be fed into a compiler, but I don't agree that compilers should store a copy of your code base for 30 days to try to combat it. Nor would I agree to compiler manufacturers putting in guardrails that make your program behave slightly incorrectly if the compiler thinks your code is malicious.
What an annoying company, I wish it didn't exist..