So my understanding was that the original code was specifically not fed into Claude, but it was almost certainly part of its training data, which complicates things. Then again, if training is fair use, that's not relevant? If training isn't fair use and taints the output, then new-chardet is a derivative of a lot of things, not just old-chardet...
This is all new legal ground. I doubt anyone will go to court over chardet specifically, but something that's an actual money-maker, or an FSF flagship project like readline, is a lot more likely to end up there.
> But was almost certainly part of its training data, which complicates things
On this point specifically, my read of the Anthropic lawsuit was that one of the findings was that if a model trains on something but does not regurgitate it, it's fair use? That might help the argument that it was effectively clean-room, but ¯\_(ツ)_/¯