Story Detail of id 47451192 | Liveview Hacker News

mjg59 14 hours ago | on: FSF statement on copyright infringement lawsuit Bartz v. Anthropic

Where's the threat? The FSF was notified that as part of the settlement in Bartz v. Anthropic they were potentially entitled to money, but in this case the works in question were released under a license that allowed free duplication and distribution so no harm was caused. There's then a note that if the FSF had been involved in such a suit they'd insist on any settlement requiring that the trained model be released under a free license. But they weren't, and they're not.

(Edit: In the event of it being changed to match the actual article title, the current subject line for this thread is " FSF Threatens Anthropic over Infringed Copyright: Share Your LLMs Freel")

teiferer 13 hours ago | parent | next

> but in this case the works in question were released under a license that allowed free duplication and distribution so no harm was caused.

FSF licenses contain attribution and copyleft clauses. It's "do whatever you want with it provided that you X, Y and Z". Just taking the first part without the second part is a breach of the license.

It's like renting a car without paying and then claiming "well you said I can drive around with it for the rest of the day, so where is the harm?" while conveniently ignoring the payment clause.

You may be confusing this with a "public domain" license.

mjg59 12 hours ago | root | parent | next

If what you do with a copyrighted work is covered by fair use it doesn't matter what the license says - you can do it anyway. The GFDL imposes restrictions on distribution, not copying, so merely downloading a copy imposes no obligation on you and so isn't a copyright infringement either.

I used to be on the FSF board of directors. I have provided legal testimony regarding copyleft licenses. I am excruciatingly aware of the difference between a copyleft license and the public domain.

danlitt 11 hours ago | root | parent | next

> I am excruciatingly aware of the difference between a copyleft license and the public domain.

Then why did you say "no harm was caused"? Clearly the harm of "using our copylefted work to create proprietary software" was caused. Do you just mean economic harm? If so, I think that's where the parent comment's confusion originates.

friendzis 10 hours ago | root | parent | next

> The GFDL imposes restrictions on distribution, not copying, so merely downloading a copy imposes no obligation on you and so isn't a copyright infringement either.

The restrictions fall not only on verbatim distribution, but on derivative works too. I am not aware whether model outputs are settled to be or not to be (hehe) derivative works in a court of law, but that question is at the very least very much valid.

mcherm 9 hours ago | root | parent | next

It's the third sentence of the article:

> the district court ruled that using the books to train LLMs was fair use but left for trial the question of whether downloading them for this purpose was legal.

friendzis 9 hours ago | root | parent

No, those are separate issues.

The pipeline is something like: download material -> store material -> train models on material -> store models trained on material -> serve output generated from models.

These questions focus on the inputs to the model training; the question I have raised focuses on the outputs of the model. If [certain] outputs are considered derivative works of input material, then we have a cascade of questions about which parts of the pipeline are covered by the license requirements. Even if any of the upstream parts of this simplified pipeline are considered legal, it does not imply that the rest of the pipeline is compliant.

snovv_crash 11 hours ago | root | parent | next

Models, however, can reproduce copyleft code verbatim, and are being redistributed. Doesn't that count?

Licences like AGPL also don't have redistribution as their only restriction.

shagie 6 hours ago | root | parent

Stack Overflow has verbatim copied GPL code in some of its questions and answers. As presented by SO, that code is not under the GPL license (this also applies to other licenses - the BSD advertising clause and the original JSON license will cause similar problems).

Arguably, the use of the code in the Stack Overflow question and answer is fair use.

The problem occurs not when someone reads the Q&A with the improperly licensed code but rather when they then copy that code verbatim into their own non-GPL product and distribute that without adherence to the GPL.

It's the last step, some human distributing the improperly licensed software, that is the violation of the GPL.

This same chain of what is allowed and what is not is equally applicable to LLMs. Providing examples from GPL licensed material to answer a question isn't a license violation. The human copying that code (from any source) and pasting it into their own software is a license violation.

---

Some while back I had a discussion with a Swiss developer about the indefinite article used before "hobbit" in a text game. They used "an hobbit" and in the discussion of fixing it, I quoted the first line of The Hobbit. "In a hole in the ground there lived a hobbit." That cleared it up and my use of it in that (and this) discussion is fair use.

If someone listening to that conversation (or reading this one) thought that the bit that I quoted would be great on a T-shirt and then printed that up and distributed it - that would be a copyright violation.

Google's use of thumbnails for images was found to be fair use. https://en.wikipedia.org/wiki/Perfect_10,_Inc._v._Amazon.com...

    The Ninth Circuit did, however, overturn the district court's decision that Google's thumbnail images were unauthorized and infringing copies of Perfect 10's original images. Google claimed that these images constituted fair use, and the circuit court agreed. This was because they were "highly transformative."

If I were to then take those thumbnails from a Google image search and distribute them as an icon library, I would then be guilty of copyright infringement.

I believe that Stack Overflow, Google Images, and LLMs and their output are all examples of transformative fair use. What someone does with that output is where copyright infringement happens.

My claim isn't that AI vendors are blameless but rather that in the issue of copyright and license adherence it is the human in the process that is the one who has agency and needs to follow copyright (and for AI agents that were unleashed without oversight, it is the human that spun them up or unleashed them).

piker 10 hours ago | root | parent | next

That's really interesting. I'm a lawyer, and I had always interpreted the license like a ToS between the developers. That (in my mind) meant that the license could impose arbitrary limitations above the default common law and statutory rules and that once you touched the code you were pregnant with those limitations, but this does make sense. TIL. So, thanks.

dataflow 5 hours ago | root | parent | next

Unrelated question regarding this part, since you seem to be an expert on this:

> If what you do with a copyrighted work is covered by fair use it doesn't matter what the license says - you can do it anyway.

How is it that contracts can prohibit trial by jury but they can't prohibit fair use of copyrighted work? Is there a list of things a contract is and isn't allowed to prohibit, and explanations/reasons for them?

materialpoint 10 hours ago | root | parent | next

This means that you can ignore any part of licenses you don't want to and just copy any software you want, non-free software included.

mjg59 4 hours ago | root | parent | next

No. The GFDL grants you permission to copy the work.

mikkupikku 10 hours ago | root | parent

This is in fact how I operate.

thayne 6 hours ago | root | parent | next

But fair use is dependent on you getting the work legally. Is downloading a book with the intention of violating the GFDL a legal way of acquiring it?

jcul 13 hours ago | root | parent | next

This article is talking about a book though, not software.

"Sam Williams and Richard Stallman's Free as in freedom: Richard Stallman's crusade for free software"

"GNU Free Documentation License (GNU FDL). This is a free license allowing use of the work for any purpose without payment."

I'm not familiar with this license or how it compares to their software licenses, but it sounds closer to a public domain license.

ghighi7878 10 hours ago | root | parent | next

Telling mjg59 they are confused about a license is an audacious move. But I understand your question and I have the same question.

Dylan16807 13 hours ago | root | parent

They don't need the "do whatever" permission if everything they do is fair use. They only need the downloading permission, and it's free to download.

darkwater 11 hours ago | parent | next

I don't like the editorialized title either but I would say that the actual post title

"The FSF doesn't usually sue for copyright infringement, but when we do, we settle for freedom"

and this sentence at the end

"We are a small organization with limited resources and we have to pick our battles, but if the FSF were to participate in a lawsuit such as Bartz v. Anthropic and find our copyright and license violated, we would certainly request user freedom as compensation."

could be seen as "threatening".

lelanthran 13 hours ago | parent | next

It's just an indication to model trainers that they should take care to omit FSF software from training.

Not a nothing burger, but not totally insignificant either.

eschaton 13 hours ago | parent

[flagged]
