When I want to. I like to describe it using the amusing language from a generic cardholder agreement.
At any time, at my sole discretion, I may ban you from any of my projects, for any reason or for no reason at all.
My projects exist because I enjoy working on them. My continued enjoyment is the most important aspect to the health and survival of any project. You don't owe anyone anything, you're allowed to donate your work to others, and also enjoy the privilege of setting whatever arbitrary rules you want to make sure you enjoy your time.
Imagine you're running a free ice cream shop. Some random asshole walks in and starts verbally abusing your best employee, who has done nothing but try to help. At what point do you kick them out because your employee is more important and worth more?
You should stick up for yourself, I would.
You can't be an asshole to an LLM. They can feel offended.
Would I like it to be merged? Sure would, it would stroke my ego, and I would not have to deal with any merge conflicts with whatever else they're cooking up. Does that mean they must merge it? Sure doesn't. They didn't make me any promises. For the time being, I can just use my fork.
Many open-source projects aren't passion projects run for pleasure. Think of it more like ice cream shops sharing recipes, or sharing in the work of running the factory. They just can't kick people out willy-nilly.
IMHO OSS doesn't work if every 1 hr of contributor time spent on a change requires 1 hr of maintainer time to review. Contributor time spent on polishing, tidying and breaking down work is essential, and so maintainer time is a fraction of total time spent on a change.
"This doesn't meet the standards of our project for reason xyz. Please refrain from submitting further PRs that do not adhere to our contribution guidelines outlined in CONTRIBUTING.md."
If they continue, ban them.
Unfortunately, I see the choice space here as having "developer effort" anti-correlated with "negative repercussions".
On one end of the distribution, a "hair trigger ban" strategy is low-effort for the developer but will have some fraction of false positives and some fraction of those impacted will complain to "the socials" and some fraction of those complaints will gain traction and, as we have seen, can unfairly taint the project or worse. Responding and managing the false positives also requires developer effort, unless the developers can sustain a "fsck the haters" attitude.
On the other end of the distribution, the developer can spend substantial effort engaging each submitter to ascertain and correct bad behavior, and to educate them on how they should engage other humans as a fellow human in this LLM era.
There is developer effort needed of different types along this distribution.
A divide-and-conquer strategy might go something like this:
- Rank each submission in some low dimension space (llm<-->human, malicious<-->helpful)
- When enough samples are collected, perform clustering in this space to determine stereotypes, name these clusters, and develop mitigating strategies and implementations as needed.
Mitigations from easy/extreme to hard/accommodating could include:
- Hair trigger ban button.
- Copy-paste a link to an explanation in a comment before closing and/or banning.
- Customized explanation in comment before closing and/or banning.
- Link or customized explanation of what must be done to move the sample to a more favorable category and close/ban if resistance or silence is returned.
- Ongoing engagement in the face of resistance or silence.
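To make the ranking and the mitigation ladder above a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Submission` fields, the `max_effort` name, and the idea of averaging the two axes are all made up, and it skips the clustering step entirely in favor of fixed, evenly spaced thresholds. A real project would pick its own scores and cut-offs.

```python
from dataclasses import dataclass
from enum import IntEnum


class Mitigation(IntEnum):
    # Ordered from easy/extreme to hard/accommodating, as in the list above.
    BAN = 0
    LINK_THEN_CLOSE = 1
    CUSTOM_NOTE_THEN_CLOSE = 2
    EXPLAIN_PATH_FORWARD = 3
    ONGOING_ENGAGEMENT = 4


@dataclass
class Submission:
    author: str
    human: float    # hypothetical score: 0.0 = pure LLM output, 1.0 = clearly human
    helpful: float  # hypothetical score: 0.0 = malicious, 1.0 = helpful


def max_effort(s: Submission) -> Mitigation:
    """Pick how far up the ladder a maintainer is willing to go.

    Assumption: both axes count equally and the ladder is split into
    evenly sized bands. This stands in for the clustering step.
    """
    score = (s.human + s.helpful) / 2
    # Scale the 0..1 score onto the ladder, clamping 1.0 to the top rung.
    rung = min(int(score * len(Mitigation)), len(Mitigation) - 1)
    return Mitigation(rung)


for sub in (
    Submission("drive-by-bot", human=0.0, helpful=0.0),
    Submission("newcomer", human=0.5, helpful=0.5),
    Submission("regular", human=1.0, helpful=1.0),
):
    print(sub.author, max_effort(sub).name)
```

The point of the sketch is only that the low-scoring corner gets the cheap response and the high-scoring corner earns the expensive one.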
This "meta development" program to provide such a system/facility could of course be highly automated with LLMs, fighting fire with fire.
(Despite the length of this reply, it was written entirely by a random human on the internet and not an LLM).
Which is to say, your system sounds good but I expect much more complicated defenses are needed.
I know it's difficult, and I have no easy answers. I'm bad at it too. But sometimes saying no is the most valuable thing you can do as a maintainer.
That said, I think banning is about behaviour, not the quality of the patch. Everyone writes a bad patch now and then; that is not a real issue. If there is an issue with a patch, and the contributor pushes back so hard you feel like changing your mind (not from logic but because you feel beaten down), that is unacceptable behaviour and should not be tolerated from a contributor, even if they are otherwise a valuable contributor.
(Simpler to say than practice fwiw)
If you ask me, LLM-generated things should just be banned outright, but I suppose other people's definitions of "community" include them.
Why? In the end it's a patch's quality that counts, regardless of who or what contributed it.
Bad patch from trusted contributor is still a bad patch.
Perhaps this is more a management problem: how to best use developers' time, and where to use AI (vs. blindly deploying AI to generate patches and swamp developers with them).
Or do some rate-limiting? "Sorry, we accept no more than 10KB worth of patches per week on this project! Try again next week after we've reviewed this week's batch".
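For what it's worth, a weekly size budget like that is only a few lines. This is a rough sketch, not anything a forge actually offers: the `PatchBudget` class and `try_accept` method are hypothetical names, the budget is project-wide rather than per-contributor, and "week" is assumed to mean the ISO calendar week.

```python
from collections import defaultdict
from datetime import date

WEEKLY_BUDGET_BYTES = 10 * 1024  # the "10KB per week" figure from the comment


class PatchBudget:
    """Hypothetical project-wide budget of patch bytes per ISO week."""

    def __init__(self, budget: int = WEEKLY_BUDGET_BYTES) -> None:
        self.budget = budget
        # Maps (ISO year, ISO week number) -> bytes accepted so far.
        self.used = defaultdict(int)

    def try_accept(self, patch: bytes, day: date) -> bool:
        iso = day.isocalendar()
        week = (iso[0], iso[1])
        if self.used[week] + len(patch) > self.budget:
            return False  # "Try again next week"
        self.used[week] += len(patch)
        return True


queue = PatchBudget()
print(queue.try_accept(b"x" * 6000, date(2024, 1, 1)))  # fits in this week's budget
print(queue.try_accept(b"x" * 6000, date(2024, 1, 2)))  # would exceed it
print(queue.try_accept(b"x" * 6000, date(2024, 1, 8)))  # new ISO week, fresh budget
```

Note that this caps volume only; it says nothing about whether what gets through is worth reviewing.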
LLM patches tend to be significantly harder to review. Mostly because LLMs let people who don't know what they are doing get much further.
It might be an unfair heuristic, as there are plenty of competent people who use it to good effect, but the vast majority of negative-value patches use LLMs and it can be a bit exhausting. Lowering the technical barriers to entry just means more pressure on the human ones.
You just said: The things that I think and care about matter more than the things that you care about.
Is that what you meant?
Being honest, if we're talking about the health of any given project, the patch quality doesn't matter that much. Not when you measure it against the importance of consistency and continuity of a regular contributor. A thousand perfect LLM patches are less valuable than an experienced maintainer.
If your LLM is annoying them and they quit, the perfect LLM patch just destroyed the repo.
People wasting others' time is a social problem, not a technical one. Rate limits can't prevent somebody feeling disrespected.
A good fix (which is the only acceptable fix in open-source software) is one that speaks for itself.
I disagree. Often if I'm making a PR to an open-source project I'm doing so because I have a use-case that the original author hadn't considered. So the first step in getting the PR merged is explaining my point of view and convincing the maintainer that my use-case is valid. Only when this is done can the "goodness" of the patch be evaluated.
Do they pay you to triage their noise?
Remember that you owe no one anything at all, neither legally nor morally. Your chosen license likely even states the former in plain English.
___
Personally, I've adopted the "you annoy me, you're out" stance and have been quite happy with it. You do need a tough shell to do that though as you will be facing all the social exploits people can throw at you.
It also leaves "growth potential" on the table, the same way that limiting your exposure to ionizing radiation does.
That all said, it depends on what your goals are + where in the lifecycle of your project you are. So don't take this as "this is the way" but "this can be one way".
Either way, you're not an asshole for not reading slop. Don't let anyone gaslight you into that.
I'm reminded of Zig, where a stated goal is to encourage human programmers to get involved so they learn more about coding… as compared with 'get involved to make Zig itself more fully developed at its more abstract goals'. If a primary purpose is to get human minds coding, that rules out the whole class of 'encourage human minds to prompt machines to do the coding instead'. Zig is not trying to teach people to be managers, and that's both legitimate and charming :)
When you say "yes", the worst thing that can happen is you destroy your project and the trust of every user.
If you're not sure, say no.