Hacker News new | past | comments | ask | show | jobs | submit
Well sure they didn't do the dance, but you don't have to do the dance. The reason to do it is that it's a good defense in a lawsuit. Like you say, all of this is a legal minefield.

So my understanding was that the original code was specifically not fed into Claude. But was almost certainly part of its training data, which complicates things, but if that's fair use then it's not relevant? If training's not fair use and taints the output, then new-chardet is a derivative of a lot of things, not just old-chardet...

This is all new legal ground. I'm not sure if anyone will go to court over chardet, though, but something that's an actual money-maker or an FSF flagship project like readline, on the other hand, well that's a lot more likely.

Strong agree on it all being a legal minefield / new grass.

> But was almost certainly part of its training data, which complicates things

On this point specifically, my read of the Anthropic lawsuit was one of the precedents was that if it trains on something but does not regurgitate it, its fair use? Might help the argument that it was clean-room but ¯\_(ツ)_/¯