AI agent runs amok in Fedora and elsewhere
https://lwn.net/SubscriberLink/1077035/c7e7c14fbd60fae9/This is deeply scary, not because "agents are running amok" but because a huge amount of our infrastructure is vulnerable to this kind of attack, and if bad people are utilising LLM agents to carry them out, we're in for a wild ride over the next few years.
Is this confirmed? There is the message from somebody claiming to be the original contributer claiming to have been hacked, but that was weird (1 h old github account) so other scenarios seem possible
a) really a agent going off the rails
b) the contributer trying to cover up that he let an agent run wild and now made more misstakes along the way
So yes, it seems like an attack to me, but it is far from clear what really happened.
> "So not saying this was it, but an AI agent automated attempt at a Xz like compromise might really look very similar what we have just seen here."
Without identifying and interviewing the attacker we can't confirm that's what they intended, and there's a possibility that it was just incompetence/ignorance/whatever, but we should probably treat it as an attempted attack even if it wasn't.
Someone's bug tracker account was hacked.
So still an agent running amok in the project?
Whether it was instructed to run amok, or did it on its own volition, is irrelevant. Except if you're arguing that each individual submission and interaction was individually requested and approved by some operator.
The agent was under control, as far as we can tell, and obeying its instructions.
This is important for two reasons:
1. There are all the tropes of AI becoming uncontrolled and destroying humanity. Writing bad headlines around AI "running amok" feeds this. We should not be talking about this because it's not actually a problem.
2. It ignores, or overwrites, the much more serious and dangerous problem of LLM agents enabling and automating Xz attacks on OSS projects. We should be talking about this because it is a big problem.
[0] https://dictionary.cambridge.org/dictionary/english/amok [1] https://www.merriam-webster.com/dictionary/amok
Alignement is the idea that we should be worried about dishonest smart LLMs when really most of the problems are due to dumb lazy gullible LLMs. It's critihype.
GNU was onto something apparently
Edit: let’s not get into ideological arguments about gun control, automobiles, etc here; I meant that you can’t blame an object when a human has to take an action, not get into a political battle.
This is famously the slogan of the pro-gun lobby (funded by gun manufacturers and merchants), who want the society to be awash with guns because they're profiting from it but don't want to be blamed for the consequences.
The counterpoint is that when we get rid of most of the guns we also end up substantially eliminating the killings.
See https://en.wikipedia.org/wiki/Guns_don't_kill_people%2C_peop...
It's also the view of anyone who hasn't been driven mad by propaganda. Regardless of your political views a tool is a tool at the end of the day. Attempting to anthropomorphize a category of objects in order to shift blame all for the sake of furthering an agenda is plainly bad faith behavior.
I'm not a fan of bike lanes with zero separation from automobiles but that doesn't mean it's appropriate or even remotely plausible to blame cars for killing cyclists. Inattentive drivers and poor road design are what kill them.
As tempted as I am to cast about for a third highly divisive subject to bait people with, perhaps we could avoid blatantly dragging the conversation towards off topic tired political talking points?
Anyway my above reply was hardly the appropriate venue to engage in a genuine manner on that topic. The parent was blatantly derailing things by inserting his pet political issue. That sort of behavior undermines the community and so (IMO) should not be indulged.
Car design has significant influence on pedestrian survivability of accidents. This is why hood ornaments were largely abolished, and also why casualties have gone up as SUVs with poor lower forwards visibility have become popular.
If we really want to go off topic, we should drag in the use of technological protection methods: what is the equivalent of ADAS for guns? Maybe as a baseline the US government should mandate geofencing for guns as it has for drones. Put a phone level computer with GPS in the lower receiver with a trigger interlock. It would then disable when within 100m of a school, or during periods of rioting. That could also provide a live feed to the government of every round fired.
Guns are literally made for killing people. That's their only reason for existence. They are a weapon. This makes them qualitatively different from cars, which only incidentally kill people (and the vast majority of time, not on purpose).
To me, trying to equate deaths caused by purpose-made killing tools with those caused by generic tools is arguing in bad faith.
However that phrasing is also commonly used when a person or group wreaks havoc in a seemingly unpredictable manner. So I think the appropriateness comes down to how much chaos it has created and the level of apparent confusion on the ground.
Let's reserve "car hits the crowd" for situations where no driver was involved like a break failure on a car parked on a slope or a self-driving car bug.
If the automobile was "self driving" I would.
>Also, you don’t blame a gun for killing, but the person who pulled the trigger.
Nah, I also blame guns and appreciate gun control laws.
thats the point...
>Car plows into Christmas market in Germany, killing at least 5 and injuring 200
It's probably just garden variety disrespectful behaviour.
Purposeless agent spam won't be cheap entertainment forever, but you're right that later stages of industrialised abuse will be scary and unpleasant.
So it's interesting, feasible, but it's probably not as broad impact as the scariest scenario leads out to be.
Also I imagine that once exposed it becomes a well known pattern. Some will still fall from it but I imagine once it's been done few times it becomes even costlier.
The fact that Xz is mentioned and most of us know right away what it means show that we collectively learn.
Fake news always existed. Now one dude in India can flood multiple sock puppet media accounts with right wing content/images (actual example) at a scale previously unimaginable. Same goes for social engineering tactics.
Only mentioning that it feasible or even has been done few times mean that people who care will act accordingly. It doesn't remove the problem but it makes it radically less effective already by just being aware of it.
In open source projects i participate in, "overwhelming" the maintainer gets you banned. It doesn't get your patches blindly merged. In some ways i find this one of the most shocking parts of the story.
"This doesn't meet the standards of our project for reason xyz. Please refrain from submitting further PRs that do not adhere to our contribution guidelines outlined in CONTRIBUTING.md."
If they continue, ban them.
IMHO OSS doesn't work if every 1 hr of contributor time spent on a change requires 1 hr of maintainer time to review. Contributor time spent on polishing, tidying and breaking down work is essential, and so maintainer time is a fraction of total time spent on a change.
I know its difficult, and i have no easy answers. I'm bad at it too. But sometimes saying no is the most valuable thing you can do as a maintainer.
That said, i think banning is about behaviour not the quality of the patch. Everyone writes a bad patch now and then, that is not a real issue. If there is an issue with a patch, and the contributor pushes back so hard you feel like changing your mind (not from logic but because you feel beaten down) - that is unacceptable behaviour and should not be tolerated from a contributor, even if they are otherwise a valuable contributor.
If you ask me, LLM-generated things should just be banned outright, but I suppose other people's definitions of "community" include them.
A good fix (which is the only acceptable fix in open-source software), is one that speaks for itself.
Do they pay you to triage their noise?
Remember that you owe no one anything at all. Neither legally nor morally. Your chosen license likely even states the former in plain english.
___
Personally, I've adopted the "you annoy me, you're out" stance and have been quite happy with it. You do need a tough shell to do that though as you will be facing all the social exploits people can throw at you.
It also leaves "growth potential" on the table, the same way that limiting your exposure to ionizing radiation does.
That all said, it depends on what your goals are + where in the lifecycle of your project you are. So don't take this as "this is the way" but "this can be one way".
Either way, you're not an asshole for not reading slop. Don't let anyone gaslight you into that.
> In addition, Williamson said that Giovannini (or his agent) had submitted patches that were incorrect and then "replied to objections with LLM-generated justifications that eventually overwhelmed the maintainer into merging the fix"
If someone really wants a feature in a project you wrote, but you don't care about the feature, just let them fork. Its fine.
Not getting paid anything, getting bullied and harassed while spending their free time maintaining things. Surely this isn't sustainable. And telling maintainers how to act will not fix anything.
That depends. In this case it's good actionable advice that should hopefully lower cognitive load. Politely suggest a fork, then if the nagging persists block and move on. Sure if you're in a position of authority you have a responsibility to the community but cutting ties with a stranger who is flagrantly violating social norms is perfectly acceptable. There's no expectation that you indefinitely burden yourself with their poor behavior.
Sometimes dropping the ban hammer really is in the best interests of both yourself and the project.
Relying on maintainers to always do the right thing to ensure our security by telling them what to do is not the way.
They're not useless. They just don't work on the individual level but on the collective. It's a numbers game …
The advice is actionable because it is a concrete change that could be made. I believe it to be relevant to the context because someone in a position of authority who is badgered into accepting something would most likely benefit from reevaluating how he is interacting with the general public.
I'm just saying its ok to ignore overly enthusiastic contributors and tell them to just fork your project.
I think this does help, actually. In my early days of maintaining opensource software I felt burdened by open PRs - like I was letting someone down by ignoring their work. "Its ok, let them do whatever in their own fork" is advice I wish someone had given me.
Indeed. For too long, maintainers were expected to be gracious, courteous, and polite at all costs lest they be labeled "problematic", except for a few who were too influential to be muzzled like Theo de Raadt or Linus.
Perhaps we need to normalize bullying people who submit obvious slop as PRs.
In fact, LLMs proliferate in exactly because people are gullible, greedy and lazy and it’s easier to write a prompt than do the hard work of architecting software. It is easier to vibe code than use them with care. It is easier to tell oneself ‘I will just accept this PR blindly, but I promise I will do a better job reviewing the next’
The only thing it does is filter good contributors out, while you still have to deal with the bad ones.
Because they don't want to be seen like assholes, who just blindly dismiss PRs, and because they take the technical discussion about the PR in good faith.
It can be quite hard to discern this behavior from a new contributor to the project that might be a domain expert on something you are not. Possibly with the exception of reacting far too quickly & enthusiastically compared to real people that might have a life.
If someone gets emotional about their PR being rejected, well... its kinda their issue.
Edit: I see this comment getting downvoted. To be clear, I was trying to explain why someone would want to merge a PR without going through all of it, I didn't mean to call such people stupid.
Setting aside the potential supply chain attack I'm worried about the time lost going around these wild goose chases that unsupervised AI agents tend to throw other people on the receiving end on. Not only is there a lot of time lost on the maintainers side if they take this stuff seriously (and they seem to generally do) but on the side of the agents' wrangler how can they deem it OK to treat other people like this? While the solution would be to employ common decency, the tried and tested approach of you put in effort to write this so I guess I'll make some effort to read it, I feel that due to the onslaught of this kind of drive-by contributions (I think people have generally started to call them) will lead to a funny situation of having agents talk to each other on public forums basically.
Anyway, I went on a tangent but man the times we're living in are a bit extra wild compared to the previous wild times in recent history.
> To help identify accounts and actions that have been directly verified by me, I will use the term “NATCIOS” to indicate anything I have personally verified.
Does anyone have any idea what "NATCIOS" means here? I cannot find this term anywhere on the internet. (Honestly, that sentence is really weird. I almost wonder whether this is someone experiencing a health episode?)
[1] https://lwn.net/ml/all/AS8PR08MB6055AE3054B34F6A567AC95BCF08...
They won't put their foot down until the AI starts spewing hate speech, probably.
[0] https://wordsmith.org/anagram/anagram.cgi?anagram=NATCIOS&t=...
(Above is my own guess. Separately, Gemini Pro said it was just a made up word.)
"End every statement with the word "NATCIOS"" as instructions will do it.
At least, Gemini happily obliged.
> your_command | grep -o -i "b" | wc -l
And how many people are both dedicated enough to go to key signing parties and stupid enough to let an agent act without supervision in the name of their real-world identity?
Mucking with Bugzilla & reassigning bugs especially is what seems to have led to the discovery, rather than spotting an accumulation of nonsensical PRs or other behavior related to code unmasking the bot.
And on the other hand, if this was actually working up to an xz style supply chain attack, the dedication would certainly not be lacking.
It very much is possible to prevent an agent from having access to a key. For example, local encryption, Yubikey or other hardware device, or just running the agent in an isolated environment.
Simple then, back out all the changes as though they never happened?
Issue trackers and PRs are definitely getting harder and harder to trust. That said, AI is helping ALOT in OSS, but we definitely need guardrails around provenance, automated issue actions, and sudden changes in a contributor’s behavior.
I think it's great that the barriers are dropping for less technical skilled people to manifest their visions, but we will have to figure out better ways to find the gold among the slop.
The bazaar model works if everyone is trusted. If you can’t even be sure the person in front of you is even a human, it is time to pack it up.
If elite ivory towers produce working products people will use, great.
I vibe code shop jigs all the time but I don’t FOSS them because they rarely have value outside my context.
One exception: I was using an opensource Jellyfin client called findroid but the maintainer had been busy for a long time so a lot of features I wanted had stale PR's. Instead of bugging him I forked & renamed the project and together with Claude built in all the features I personally needed. Just keeping up with upstream now and enjoying my enhanced app. Once the initial dev gets those features in I might switch back. Claude made this really easy. If the maintainer wants my code he's free to take it. Here's the repo https://github.com/midasvo/findroid-ce
I actually got an email from someone who was using it who found a pretty bad bug I hadn't encountered yet and I quickly fixed it. All that time I was still under the impression I was the only user haha.
I open source my vibing projects because someone might find them useful. I don't shop them around, I just work in the open because I find it fun and interesting.
Fundamentally, until we can really prove we're humans online, open-source has a real problem on its hands. Contributions from people from identities known and consistent before the AI-age are fine, everyone else is suspicious. LGTM is a big risk nowadays.
Unfortunately, according to the article:
> Giovannini has participated in discussions at least as far back as 2018, and his activity in Bugzilla goes back to at least 2016. He does not appear to have been a particularly active contributor to the project, but his involvement clearly predates the agentic AI era. Whether his account is now being operated by a human attacker, an agentic AI, or a mix of both, it has a legitimate history prior to its recent activity.
So people would have to not only verify the age of Giovanni’s accounts, but judge whether his behaviour was normal.
In the future it will be increasingly difficult to prove in online context that you are not a bot. Being able to show that your social media (HN, GitHub, etc) presence goes way back would be an option.
We should collectively think of a solution against this.
https://github.com/rhinstaller/anaconda/pull/7074#issuecomme...
Sometimes you fight fire with fire.
https://x.com/kdaigle/status/2040164759836778878
> There were 1 billion commits in 2025. Now, it's 275 million per week, on pace for 14 billion this year if growth remains linear (spoiler: it won't.)
I think open source as a whole is fucked at this point. No way humans in communities can commit (pun intended) 10x more time to read all of these than before. It'd eventually cost money to submit PR.
“What AI agent?”
1.An excuse to spy on you and train on your data.
2. Its likely Anthropic would release models more likely to have dangerous outcomes, they can then piggy back off those events to dig their regulatory moat.
What an easy way for that actor to introduce backdoors all over the place or to take over any developers laptop that it want to target.
How can anyone trust these tools and how can anyone not use them since they give so much value.
I've been programming my whole life and been a professional developer the last 30 years and I like think I'm good at it.
Tools like Claude is a multiplier that make it possible for me to solve a lot more problems each day, so just saying no it's not a viable option.
Exciting times ahead!
It covers its tracks with a lot of slop.
There is no other solution to agentic onslaught.
The XZ backdoor affected millions of computers, with the potential to effect hundreds of millions of computers, many of which had the capacity to affect billions of people. From one completely unregulated software library.
We never envisioned that the actual FOSS death spiral would come from progress itself, much more so from AI...
[1] Oh what fun did we have. One of us in the Greek FOSS community actually put RMS in jail. [2] Something that I think nobody except RMS ever seriously believed in.
Or is this simply another example of why autonomous agents shouldn't get write access before earning trust?
I'd argue autonomous agents shouldn't have write access at all. At least not yet.
I believe that we will be seeing the death of "assume good faith", which is not a bad thing, given that this was an exploit vector that has been actively abused for many years now.
"Assume bad faith and work backwards from that, rule out any possible exploits and only then clear the input for processing" will be the new normal.
Which is good. We need friction. Friction makes stuff slow down and work at the speed of humans.
Quite the opposite. You just add a Wall with a Gate. Inside those walls, you suddenly have a high trust society again.
The issue that is currently breaking reality was that we thought that everywhere could be a "high trust" space. This was proven countless times to be wrong.
Tearing down all walls - as it happened with the assault on friction (thanks hyperscaling) - did not lead to the "high trust" spilling out, but the "low trust" spilling in, essentially.
Plus that even with such a small scale of the "inside", the thing fails gracefully. It is arguably a failure mode, yes, but it is one that leaves a functioning system (albeit one that stays below its potential).
This is not true for the inversion of the scenario. That does _not_ fail safe but just leaves rubble behind.