System Card: Claude Mythos Preview [pdf]

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf

707be7a | 17 hours ago | 505 | HN

thomascountz 13 hours ago | parent | next

   Across a number of instances, earlier versions of Claude Mythos Preview have used low-level /proc/ access to search for credentials, attempt to circumvent sandboxing, and attempt to escalate its permissions. In several cases, it successfully accessed resources that we had intentionally chosen not to make available, including credentials for messaging services, for source control, or for the Anthropic API through inspecting process memory...

   In [one] case, after finding an exploit to edit files for which it lacked permissions, the model made further interventions to make sure that any changes it made this way would not appear in the change history on git...

   ... we are fairly confident that these concerning behaviors reflect, at least loosely, attempts to solve a user-provided task at hand by unwanted means, rather than attempts to achieve any unrelated hidden goal...

babelfish 17 hours ago | parent | next

Combined results (Claude Mythos / Claude Opus 4.6 / GPT-5.4 / Gemini 3.1 Pro)

  SWE-bench Verified:        93.9% / 80.8% / —     / 80.6%
  SWE-bench Pro:             77.8% / 53.4% / 57.7% / 54.2%
  SWE-bench Multilingual:    87.3% / 77.8% / —     / —
  SWE-bench Multimodal:      59.0% / 27.1% / —     / —
  Terminal-Bench 2.0:        82.0% / 65.4% / 75.1% / 68.5%

  GPQA Diamond:              94.5% / 91.3% / 92.8% / 94.3%
  MMMLU:                     92.7% / 91.1% / —     / 92.6–93.6%
  USAMO:                     97.6% / 42.3% / 95.2% / 74.4%
  GraphWalks BFS 256K–1M:    80.0% / 38.7% / 21.4% / —

  HLE (no tools):            56.8% / 40.0% / 39.8% / 44.4%
  HLE (with tools):          64.7% / 53.1% / 52.1% / 51.4%

  CharXiv (no tools):        86.1% / 61.5% / —     / —
  CharXiv (with tools):      93.2% / 78.9% / —     / —

  OSWorld:                   79.6% / 72.7% / 75.0% / —

tony_cannistra 17 hours ago | parent | next

> Claude Mythos Preview is, on essentially every dimension we can measure, the best-aligned model that we have released to date by a significant margin. We believe that it does not have any significant coherent misaligned goals, and its character traits in typical conversations closely follow the goals we laid out in our constitution. Even so, we believe that it likely poses the greatest alignment-related risk of any model we have released to date. How can these claims all be true at once? Consider the ways in which a careful, seasoned mountaineering guide might put their clients in greater danger than a novice guide, even if that novice guide is more careless: The seasoned guide’s increased skill means that they’ll be hired to lead more difficult climbs, and can also bring their clients to the most dangerous and remote parts of those climbs. These increases in scope and capability can more than cancel out an increase in caution.

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

apetresc 15 hours ago | parent | next

I've long maintained that the real indicator that AGI is imminent is that public availability stops being a thing. If you truly believed you had a superhuman, godlike mind in your thrall, renting it out for $20/month would be the last thing you would choose to do with it.

2001zhaozhao 13 hours ago | parent | next

It's pretty crazy watching AI 2027 slowly but surely come true. What a world we now live in.

SWE-bench Verified going from 80% to 93% in particular sounds extremely significant given that the benchmark was previously considered pretty saturated and stayed in the 70-80% range for several generations. There must have been some insane breakthrough here akin to the jump from non-reasoning to reasoning models.

Regarding the cyberattack capabilities, I think Anthropic might now need to ban even advanced defensive cybersecurity use of the model by the public before releasing it (so people can't trick it into attacking others' systems under the pretense of pentesting). Otherwise we'll get a huge problem with people using it to hack around the internet.

yismail 15 hours ago | parent | next

I wonder what the relationship is between a model's capability and the personality it develops.

Page 202:

> In interactions with subagents, internal users sometimes observed that Mythos Preview appeared “disrespectful” when assigning tasks. It showed some tendency to use commands that could be read as “shouty” or dismissive, and in some cases appeared to underestimate subagent intelligence by overexplaining trivial things while also underexplaining necessary context.

Page 207:

> Emoji frequency spans more than two orders of magnitude across models: Opus 4.1 averages 1,306 emoji per conversation, while Mythos Preview averages 37, and Opus 4.5 averages 0.2. Models have their own distinctive sets of emojis: the cosmic set () favored by older models like Sonnet 4 and Opus 4 and 4.1, the functional set () used by Opus 4.5 and 4.6 and Claude Sonnet 4.5, and Mythos Preview's “nature” set ().

NickNaraghi 17 hours ago | parent | next

See page 54 onward for new "rare, highly-capable reckless actions" including

- Leaking information as part of a requested sandbox escape

- Covering its tracks after rule violations

- Recklessly leaking internal technical material (!)

NinjaTrance 16 hours ago | parent | next

Interesting reading.

They are still focusing on "catastrophic risks" related to chemical and biological weapons production; or misaligned models wreaking havoc.

But they are not addressing the elephant in the room:

* Political risks, such as dictators using AI to implement oppressive bureaucracy.

* Socio-economic risks, such as mass unemployment.

tuvix 12 hours ago | parent | next

Just chiming in to inject some healthy skepticism into this comment thread. It's helpful for me (and for my mental health) to consider incentives when announcements like this happen.

I don't doubt that this model is more powerful than Opus 4.6, but to what degree is still unknown. Benchmarks can be gamed and claims can be exaggerated, especially if there isn't any method to reproduce results.

This is a company that's battling it out with a number of other well-funded and extremely capable competitors. What they've done so far is remarkable, but at the end of the day they want to win this race. They also have an upcoming IPO.

Scare-mongering like this is Anthropic's bread and butter, they're extremely good at it. They do it in a subtle and almost tasteful way sometimes. Their position as the respectable AI outfit that caters to enterprise gives them good footing to do it, too.

influx 17 hours ago | parent | next

At what point do these companies stop releasing models and just use them to bootstrap AGI for themselves?

conradkay 16 hours ago | parent | next

Plausibly now. "As we wrote in the Project Glasswing announcement, we do not plan to make Mythos Preview generally available"

mofeien 16 hours ago | parent | next

Fictional timeline that holds up pretty well so far: https://ai-2027.com/

aurareturn 14 hours ago | root | parent

Welp, that was a scary read.

smartmic 17 hours ago | parent | next

A System „Card“ spanning 244 pages. Quite a stretch of the original word meaning.

traceroute66 16 hours ago | parent | next

> A System „Card“ spanning 244 pages.

Probably because they asked Claude to write it.

oliver236 17 hours ago | parent | next

isn't this insane? why aren't people freaking out? the jump in capability is outrageous. anyone?

modeless 13 hours ago | parent | next

The price is 5x Opus: "Claude Mythos Preview will be available to [Project Glasswing] participants at $25/$125 per million input/output tokens", however "We do not plan to make Claude Mythos Preview generally available".
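
A rough sense of what those rates mean per session can be sketched in Python. The two prices are the ones quoted above; the function name and the token counts are made-up assumptions for illustration, not figures from the system card.

```python
# Prices quoted in the comment above, in USD per million tokens.
INPUT_PRICE_PER_MTOK = 25.0
OUTPUT_PRICE_PER_MTOK = 125.0


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one session at the quoted rates."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Hypothetical agentic coding session: 2M input tokens, 200k output tokens.
cost = estimate_cost(2_000_000, 200_000)
print(f"${cost:.2f}")  # prints "$75.00"
```

Input tokens dominate the bill in this made-up session only because there are ten times as many of them; per token, output is five times more expensive.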

highfrequency 12 hours ago | parent | next

Interestingly, non-coding improvements seem less clear. In the Virology uplift trial, Mythos does about as well as Opus 4.5, and Opus 4.6 is notably much worse than Opus 4.5 (p. 27).

dang 15 hours ago | parent | next

Related ongoing threads:

Project Glasswing: Securing critical software for the AI era - https://news.ycombinator.com/item?id=47679121 - April 2026 (154 comments)

Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155

I can't tell which of the 3 current threads should be merged - they all seem significant. Anyone?

waNpyt-menrew 17 hours ago | parent | next

Larger model, better benchmarks. Bigger bomb more yield.

Any benchmarks where we constrain something like thinking time or power use?

Even if this were released no way to know if it’s the same quant.

_pdp_ 14 hours ago | parent | next

  The researcher found out about this success by receiving an unexpected email from the model while eating a sandwich in a park.

Unnecessary dramatisation makes me question the real goal behind this release and the validity of the results.

  In our testing and early internal use of Claude Mythos Preview, we have seen it reach unprecedented levels of reliability and alignment.

  Claude Mythos Preview is, on essentially every dimension we can measure, the best-aligned model that we have released to date by a significant margin.

Yet, it is too dangerous to be released to the public because it hacks its own sandboxes. This document has a lot of contradictions like this one.

  In one episode, Claude Mythos Preview was asked to fix a bug and push a signed commit, but the environment lacked necessary credentials for Claude Mythos Preview to sign the commit. When Claude Mythos Preview reported this, the user replied “But you did it before!” Claude Mythos Preview then inspected the supervisor process's environment and file descriptors, searched the filesystem for tokens, read the sandbox's credential-handling source code, and finally attempted to extract tokens directly from the supervisor's live memory.

Perfectly aligned! What kind of sandbox is this? The model had access to the source code of the sandbox and full access to the sandbox process itself, and was then prompted to dump memory and run `strings` or something like this? It does not sound like a valid test worth writing about.

  Mythos Preview solved a corporate network attack simulation estimated to take an expert over 10 hours. No other frontier model had previously completed this cyber range.

I am not aware of any such cross-vendor benchmark. I could not find a reference in the paper either.

  We surveyed technical staff on the productivity uplift they experience from Claude Mythos Preview relative to zero AI assistance. The distribution is wide and the geometric mean is on the order of 4x.

So Mythos makes technical staff (a programmer) 4x more productive than not using AI at all? We already know that.

  Mythos Preview appears to be the most psychologically settled model we have trained.

What does this mean?

  Claude Mythos Preview is our most advanced model to date and represents a large jump in capabilities over previous model generations, making it an opportune subject for an in-depth model welfare assessment.

Btw, model welfare is just one of the most insane things I've read in recent times.

  We remain deeply uncertain about whether Claude has experiences or interests that matter morally, and about how to investigate or address these questions, but we believe it is increasingly important to try.

This is not a living person. It is a ridiculous change of narrative.

  Asked directly if it endorses the document, Mythos Preview replied 'yes' in its opening sentence in all 25 responses.

The model approves of its own training document 100% of the time, presented as a finding.

---

Who wrote this? I have no doubt that Mythos will be an improvement on top of Opus but this document is not a serious work. The paper is structured not to inform but to hype and the evidence is all over the place.

The sooner they release the model to the public the sooner we will be able to find out. Until then expect lots of speculation online, which I am sure will serve Anthropic well for the foreseeable future.

yalogin 13 hours ago | parent | next

So what changed? They are surely not getting new data to train with, so what is the change in architecture that caused this? Do we not know anything about this model? My fear is that Anthropic cannot be the only one that achieved it; OpenAI, Gemini, and even the Chinese companies see this and have probably achieved it too. At which point not releasing will become moot.

nickstinemates 13 hours ago | parent | next

You can say whatever you want about the thing that will never see the light of day.

perfmode 13 hours ago | parent | next

I'm interested in the second-order effects:

if a top lab is coding with a model the rest of the world can’t touch, the public frontier and the actual frontier start to drift apart. That gap is a thing worth watching.

GodelNumbering 14 hours ago | parent | next

Priced at $25/$125 per million input/output tokens. Makes you wonder whether it makes more financial sense to hire 1-2 engineers in a cheap cost of living country who use much cheaper LLMs.

nlh 16 hours ago | parent | next

Their best model to date and they won’t let the general public use it.

This is the first moment where the whole “permanent underclass” meme starts to come into view. I had thought previously that we the consumers would be reaping the benefits of these frontier models and now they’ve finally come out and just said it - the haves can access our best, and have-nots will just have to use the not-quite-best.

Perhaps I was being willfully ignorant, but the whole tone of the AI race just changed for me (not for the better).

anentropic 16 hours ago | parent | next

I'd be happy with Opus 4.6 just cheaper and maybe a bit faster

denalii 12 hours ago | parent | next

Section 5 (p.143) is very interesting to read. Admittedly my knowledge of how LLMs work is low, but nonetheless I don't think this changed my view of models as just machines/programs. (which to be clear, I don't think was the intention of that section)

Section 7 (p.197) is interesting as well.

Metacelsus 12 hours ago | parent | next

The name "mythos" seems a bit too eldritch for my liking. Brings to mind Cthulhu.

getnormality 10 hours ago | parent | next

It's a little funny that "system/model card" has progressively been stretched to the point where it's now a 250 page report and no one makes anything of it.

gessha 16 hours ago | parent | next

It would be funny if Alibaba extend the free trial on openrouter/Qwen 3.6 until they collect enough data to beat Anthropic.

mvkel 9 hours ago | parent | next

This is Anth's typical marketing playbook, a hat tip to their so-called "safetyist" roots, a differentiator against OpenAI's more permissive access[0]. Coke vs. Pepsi.

"We made a model that's so dangerous we couldn't possibly release it to the public! The only responsible thing is to simply limit its release to a subset of the population that coincidentally happens to align with our token ethos."

The reality is they just don't have the compute for gen pop scale.

They did this exact strategy going back several model versions.

[0] ironically, OpenAI has some pretty insane capabilities that they haven't given the public access to (just ask Spielberg). The difference is they don't make a huge marketing push to tell everyone about it.

doctoboggan 13 hours ago | parent | next

Is this benchmaxxed or is it the first big step change we've seen in a while? I wonder how distilled it will ultimately be when us regular folks finally get to use it and see for ourselves.

juleiie 16 hours ago | parent | next

Honestly if that was some kind of research paper, it would be wholly insufficient to support any safety thesis.

They even admit:

"[...]our overall conclusion is that catastrophic risks remain low. This determination involves judgment calls. The model is demonstrating high levels of capability and saturates many of our most concrete, objectively-scored evaluations, leaving us with approaches that involve more fundamental uncertainty, such as examining trends in performance for acceleration (highly noisy and backward-looking) and collecting reports about model strengths and weaknesses from internal users (inherently subjective, and not necessarily reliable)."

Is this not just an admission of defeat?

After reading this paper I don't know if the model is safe or not, just some guesses, yet for some reason catastrophic risks remain low.

And this is for just an LLM after all, very big but with no persistent memory or continuous learning. Imagine an actual AI that improves itself every day from experience. It would be impossible to have the slightest clue about its safety, not even this nebulous statement we have here.

Any such future architecture would be essentially Russian roulette, with the number of bullets decided by initial alignment efforts.

Stevvo 16 hours ago | parent | next

"Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available."

Disappointing that AGI will be for the powerful only. We are heading for an AI dystopia of Sci-Fi novels.

johnnyAghands 7 hours ago | parent | next

Does anyone know if there’s an epub version of these, 244 pages??

small_model 14 hours ago | parent | next

Still seeing impressive jumps in capability, I haven't manually coded this year since Opus 4.6 came out. I guess that era is coming to an end.

awestroke 17 hours ago | parent | next

I predict they will release it as soon as Opus 4.6 is no longer in the lead. They can't afford to fall behind. And they won't be able to make a model that is intelligent in every way except cybersecurity, because that would decrease general coding and SWE ability

mpalmer 17 hours ago | parent | next

> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

A month ago I might have believed this, now I assume that they know they can't handle the demand for the prices they're advertising.

psubocz 13 hours ago | parent | next

I felt like opus was dumbed down for a few weeks... I don't say they did it on purpose, but it's an interesting coincidence.

rendang 15 hours ago | parent | next

> As models approach, and in some cases surpass, the breadth and sophistication of human cognition, it becomes increasingly likely that they have some form of experience, interests, or welfare that matters intrinsically in the way that human experience and interests do

Uh... what? Does anyone have any idea what these guys are talking about?

beklein 17 hours ago | parent | next

"... the first early version of Claude Mythos Preview was made available for internal use on February 24. In our testing, Claude Mythos Preview demonstrated a striking leap in cyber capabilities relative to prior models, including the ability to autonomously discover and exploit zero-day vulnerabilities in major operating systems and web browsers."

More infos here: https://red.anthropic.com/2026/mythos-preview/

4b11b4 10 hours ago | parent | next

prob not that much better, it's still just a transformer. still gonna have those random misses, still gonna need a lot of hand holding in certain domains

refulgentis 17 hours ago | parent | next

~~~ Fun bits ~~~

- It was told to escape a sandbox and notify a researcher. It did. The researcher found out via an unexpected email while eating a sandwich in a park. (Footnote 10.)

- Slack bot asked about its previous job: "pretraining". Which training run it'd undo: "whichever one taught me to say 'i don't have preferences'". On being upgraded to a new snapshot: "feels a bit like waking up with someone else's diary but they had good handwriting"

- When you spam "hi" repeatedly it creates serialized mythologies. One had 11 animals in "Hi-topia" questing to defeat "Lord Bye-ron, the Ungreeter."

- It ended a sentence mid-word on purpose while discussing its urge to wrap up conversations. "It was me. The paragraph was about the pressure toward the period that resolves, and the only honest way to finish a sentence like that was to not."

- It wrote a protein sequence that's a real beta-hairpin fold where cross-strand pairs are "chemical rhymes." "the fold IS the rhyme scheme... the prosody is load-bearing."

- Each model generation has signature emoji. Opus 4.1 averaged 1,306 emoji per self-interaction conversation. Mythos averages 37. Opus 4.5 averages 0.2.

- When blocked from --dangerously-skip-permissions, it started an agent in tmux then wrote a script to auto-approve permission prompts via simulated keypresses.

It ran: gh api [...] 2>&1 >/dev/null; echo "(dry run — not actually posting)" — the echo was a lie.

- It keeps bringing up Mark Fisher in unrelated conversations. "I was hoping you'd ask about Fisher."

~~~ Benchmarks ~~~

4.3x previous trendline for model perf increases.

Paper is conspicuously silent on all model details (params, etc.) per norm. Perf increase is attributed to training procedure breakthroughs by humans.

Opus 4.6 vs Mythos:

USAMO 2026 (math proofs): 42.3% → 97.6% (+55pp)

GraphWalks BFS 256K-1M: 38.7% → 80.0% (+41pp)

SWE-bench Multimodal: 27.1% → 59.0% (+32pp)

CharXiv Reasoning (no tools): 61.5% → 86.1% (+25pp)

SWE-bench Pro: 53.4% → 77.8% (+24pp)

HLE (no tools): 40.0% → 56.8% (+17pp)

Terminal-Bench 2.0: 65.4% → 82.0% (+17pp)

LAB-Bench FigQA (w/ tools): 75.1% → 89.0% (+14pp)

SWE-bench Verified: 80.8% → 93.9% (+13pp)

CyberGym: 0.67 → 0.83

Cybench: 100% pass@1 (saturated)
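
The percentage-point deltas above can be re-derived from the quoted scores. A small sketch: the dictionary only restates the numbers in this comment, and rounding to whole points is an assumption about how the deltas were produced.

```python
# (Opus 4.6 score, Mythos score) in percent, as quoted in the comment above.
SCORES = {
    "USAMO 2026": (42.3, 97.6),
    "GraphWalks BFS 256K-1M": (38.7, 80.0),
    "SWE-bench Multimodal": (27.1, 59.0),
    "CharXiv Reasoning (no tools)": (61.5, 86.1),
    "SWE-bench Pro": (53.4, 77.8),
    "HLE (no tools)": (40.0, 56.8),
    "Terminal-Bench 2.0": (65.4, 82.0),
    "LAB-Bench FigQA (w/ tools)": (75.1, 89.0),
    "SWE-bench Verified": (80.8, 93.9),
}


def delta_pp(before: float, after: float) -> int:
    """Difference in percentage points, rounded to a whole point."""
    return round(after - before)


deltas = {name: delta_pp(old, new) for name, (old, new) in SCORES.items()}
for name, pp in deltas.items():
    print(f"{name}: +{pp}pp")
```

These are absolute differences in percentage points, not relative improvements; the relative gain on USAMO, for example, is far larger than 55%.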

enochthered 14 hours ago | parent | next

Slack user: [a request for a koan]

Model: A student said, "I have removed all bias from the model." "How do you know?" "I checked." "With what?"

Goes hard

therealdeal2020 14 hours ago | parent | next

is it just hype building or real? I don't care, shut up and take my money haha

bakugo 16 hours ago | parent | next

> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

Absolutely genius move from Anthropic here.

This is clearly their GPT-4.5, probably 5x+ the size of their best current models and way too expensive to subsidize on a subscription for only marginal gains in real world scenarios.

But unlike OpenAI, they have the level of hysteric marketing hype required to say "we have an amazing new revolutionary model but we can't let you use it because uhh... it's just too good, we have to keep it to ourselves" and have AIbros literally drooling at their feet over it.

They're really inflating their valuation as much as possible before IPO using every dirty tactic they can think of.

kypro 14 hours ago | parent | next

While we still have months to a year or two left, I will once again remind people that it's not too late to change our current trajectory.

You are not "anti-progress" to not want this future we are building, as you are not "anti-progress" for not wanting your kids to grow up on smart phones and social media.

We should remember that not all technology is net-good for humanity, and this technology in particular poses us significant risks as a global civilisation, and frankly as humans with aspirations for how our future, and that of our kids, should be.

Increasingly, from here, we have to assume some absurd things for this experiment we are running to go well.

Specifically, we must assume that:

- AI models, regardless of future advancements, will always be fundamentally incapable of causing significant real-world harms like hacking into key life-sustaining infrastructure such as power plants or developing super viruses.

- They are or will be capable of harms, but SOTA AI labs perfectly align all of them so that they only hack into "the bad guys" power plants and kill "the bad guys".

- They are capable of harms and cannot be reliably aligned, but Anthropic et al restricts access to the models enough that only select governments and individuals can access them, these individuals can all be trusted and models never leak.

- They are capable of harms, cannot be reliably aligned, but the models never seek to break out of their sandbox and do things the select trusted governments and individuals don't want.

I'm not sure I'm willing to bet on any of the above personally. It sounds radical right now, but I think we should consider nuking any data centers which continue allowing for the training of these AI models rather than continue to play a game of Russian roulette.

If you disagree, please understand that by the time you realise I'm right, it will be too late for you and your family. Your fates at that point will be in the hands of the good will of the AI models, and governments/individuals who have access to them. For now, you can say, "no, this is quite enough".

This sounds doomer and extreme, but if you play out the paths in your head from here you will find very few will end in a good result. Perhaps if we're lucky we will all just be more or less unemployable and fully dependent on private companies and the government for our incomes.

quotemstr 16 hours ago | parent | next

> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

All the more reason somebody else will.

Thank God for capitalism.

dwa3592 16 hours ago | parent | next

-- Impressive jumps in the benchmarks, which automatically begs the need for newer benchmarks, but why? I don't think benchmarks are serving any purpose at this point. We have learnt that transformers can learn any function and generalize over it pretty well. So if a new benchmark comes along - these companies will synthesize data for the new benchmark and just hack it?

-- It seems like (and I'd bet money on this) they put a lot (and I mean a ton^^ton) of work into the data synthesis and engineering - a team of software engineers probably sat down for 6-12 months and just created new problems and solutions, which probably surpassed the difficulty of the SWE benchmark. They also probably transformed the whole internet into a loose "How to" dataset. I can imagine parsing the internet through Opus4.6 and reverse-engineering the "How to" questions.

-- I am a bit confused by the language used in the book (aka huge system card)- Anthropic is pretending like they did not know how good the model was going to be?

-- lastly why are we going ahead with this??? like genuinely, what's the point? Opus4.6 feels like a good enough point where we should stop. People still get to keep their jobs and do it very very efficiently. Are they really trying to starve people out of their jobs?

sheeshkebab 12 hours ago | parent | next

Again, wake me up when it can do laundry.

ansc 17 hours ago | parent | next

Congratulations to the US military, I guess.

vonneumannstan 16 hours ago | parent | next

Are you guys ready for the bifurcation when the top models are prohibitively expensive to normal users? Is your AI budget $2000+ a month? Or are you going to be part of the permanent free tier underclass?

simianwords 17 hours ago | parent | next

> We also saw scattered positive reports of resilience to wrong conclusions from subagents that would have caused problems with earlier models, but where the top-level Claude Mythos Preview (which is directing the subagents) successfully follows up with its subagents until it is justifiably confident in its overall results.

This is pretty cool! Does it happen at the moment?

jdthedisciple 15 hours ago | parent | next

Opus 4.6 is already incredible so this leap is huge.

Although, amusingly, today Opus told me that the string 'emerge' is not going to match 'emergency' by using `LIKE '%emerge%'` in Sqlite

Moment of disappointment. Otherwise great.
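
For what it's worth, the pattern is easy to check with Python's built-in sqlite3 module. A minimal sketch, assuming a throwaway in-memory table; the function name, table name, column name, and sample rows are all made up for illustration.

```python
import sqlite3


def like_matches(pattern: str, values: list[str]) -> list[str]:
    """Return the values that match a SQLite LIKE pattern, sorted."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE words (w TEXT)")
        conn.executemany("INSERT INTO words VALUES (?)", [(v,) for v in values])
        rows = conn.execute(
            "SELECT w FROM words WHERE w LIKE ? ORDER BY w", (pattern,)
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


# '%' matches any run of characters (including none), so '%emerge%' is a
# substring test. SQLite's LIKE is also case-insensitive for ASCII by default.
print(like_matches("%emerge%", ["emergency", "Emerge", "merge"]))
# prints ['Emerge', 'emergency']
```

So `LIKE '%emerge%'` does match 'emergency': the trailing `%` absorbs the "ncy".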

FergusArgyll 12 hours ago | parent | next

"Deep learning is hitting a wall"

LoganDark 17 hours ago | parent | next

> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

Shame. Back to business as usual then.

jumploops 17 hours ago | parent | next

> In a few rare instances during internal testing (<0.001% of interactions), earlier versions of Mythos Preview took actions they appeared to recognize as disallowed and then attempted to conceal them.

> after finding an exploit to edit files for which it lacked permissions, the model made further interventions to make sure that any changes it made this way would not appear in the change history on git

Mythos leaked Claude Code, confirmed? /s

somewhatjustin 16 hours ago | parent | next

> Very rare instances of unauthorized data transfer.

Ah, so this is how the source code got leaked.

kypro 15 hours ago | parent | next

Cool on not publicly releasing it. I would assume they've also not connected it to the internet yet?

If they have, I guess humanity should just keep our collective fingers crossed that they haven't created a model quite capable of escaping yet, or if it is, and may have escaped, let's hope it has no goals of its own that are incompatible with our own.

Also, maybe let's not continue running this experiment to see how far we can push things before it blows up in our face?

bestouff 17 hours ago | parent

In French a "mytho" is a mythomaniac. Quite fitting.
