It seems like Fable will refuse to do any work when it comes to developing LLMs, or even to answer questions about LLM-related topics. Simple things like asking it to explain a paper fail!

From the model card:

In light of the ability of recent models to accelerate their own development, we've implemented new interventions that limit Claude's effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms. Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user.

I was wondering when something like this would happen. I got my first and only two content violation warnings in Claude Code last week when asking it about something ML related. It was a real head scratcher because I couldn’t figure out what about the requests could have violated anything.

Might be worth going back and taking a harder look at what I was asking it about if it somehow triggered a “forbidden knowledge” alert. Or maybe it was just a random bug.

"for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design"

Oh man all of those runaway infrastructure buildouts by our agents trying to achieve singularity...

Just say you don't want to lower the bar for others to compete

> frontier LLM development

This seems so wide reaching if it's catching simple things like explaining a paper. Does this also refuse to help with any already developed training pipelines?

I can kind of understand the generation of synthetic data, but nerfing the assistance of training pipelines just seems like a really shitty thing to do.

I wanted to try it on my biology research, and it refused to talk about it and proxied to 4.8. Really, it only allows surface-level conversations about topics of interest. I know this is not a topic of broad, mass interest, but limiting it for topics like that and machine learning will probably change how I use it.
Yes, this stuff is really annoying when it misfires. I've had all my subsequent ChatGPT conversations biohazard-contained for several days for the crime of asking it to explain a gene drive to me.
Is it only certain advanced topics, or all of them? I'm curious if it bans questions about quantum computing or fusion.
So insane to me that these ai companies are perfectly fine trying their absolute best to automate as much knowledge work as possible but as soon as this capability can be turned on them they start implementing hidden interventions to sabotage anyone trying to beat them at their own game.
This is just marketing that Anthropic is building the singularity.
Let's hope not all frontier AI assimilates these guardrails. It would be a shame for independent researchers and students.
Singularity for me but not for thee.
you will RENT the singularity
"we should put on hold the development of AI because the world is not ready for it"

Yeah... We need open models so we don't have that BS.

This is super annoying and imo, really limits the usefulness of this model. It speaks volumes about what Anthropic's position as a company and its priorities will be going forward. I doubt this kind of gatekeeping will slow down open models or other innovation outside Anthropic. I would imagine these guardrails, if needed at all, should be handled at a legal framework level, and students should not be caught up in this blanket approach to limiting the usage of these models.
Anthropic probably trained Mythos on their own code and found that it is too good at reproducing it.
I doubt that. Why would you train Mythos on its own code if you don't want it to be able to reproduce it? It's not going to add much to the overall corpus.
That's strange... I've been tinkering with a little LLM-from-scratch project for a while now, and Fable is just continuing it without a problem
It also tried to force usage of the paid Claude API instead of Claude Code usage just because there's a mention of another provider we might want to plug in (which hasn't even happened) for AI integration.
Ha funny, I was speccing out an idea for real time Claude code interaction from local apps using some tricks vs using the agent sdk when I got the popup to try Fable. So of course I gave it a go, and it triggered the sensitive content warning immediately, which I was very confused by until I put two and two together.

Fun times when “safety” means both the safety of mankind, and also the safety of revenues

Anthropic is really speedrunning their evil arc as fast as possible. Can't use them for basic LLM research, cybersecurity, or beyond-surface-level discussions of biology and virology, but Anthropic is allowed to sell Claude to the Trump administration to kidnap Maduro and to bomb Iran. And don't get me started on that $100M autonomous killer drone swarm contract that they applied to and rationalized as non-autonomous...
Didn’t Anthropic famously refuse to work with the US gov on military applications that would violate its safeguards?

https://apnews.com/article/anthropic-pentagon-ai-hegseth-dar...

> Can't use them for basic LLM research, cybersecurity, or beyond-surface-level discussions of biology and virology

Your priorities are not everyone else's priorities. The people concerned about AI extinction risk list those as three of their biggest priorities for AI to not do. Those are the people whose culture Anthropic descends from, and by their measure, those exclusions make this the least evil path.
