Although I have expressed aspects of these views in several posts over the past couple of years, I will try to consolidate my opinion as to why GAI training with protected creative works is a more problematic fair use consideration than many, even the courts, seem to believe. I acknowledge that even fellow copyright advocates will disagree with some of this analysis, but here goes:
For the sake of narrowing the focus to the question of whether training generative AI (GAI) with protected works favors a fair use exception, the following assumes that the training requires unlicensed copying of protected expression. Further, even if the GAI maker limits the product’s capacity to output infringing copies, this does not alter the fact that considering fair use for this purpose is, at best, troubling and, at worst, so disturbing to case law that the AI developers are begging the courts to articulate doctrine out of whole cloth.
A GAI’s Purpose is Not Analogous to Past Fair Use Factor One Findings
The courts have largely rejected the overbroad opinion that making “something new” is a sufficient justification for unlicensed use of protected works. Thus, it is difficult to see where any court finds authority to support the argument that making a “creator robot,” however revolutionary its developers proclaim it to be, is a transformative purpose under a factor one analysis.
Typically, a GAI’s purpose neither expresses “critical bearing” on the works used (AWF v. Goldsmith) nor provides information about the works to human readers (Authors Guild v. Google) nor fosters interoperability in computer devices (Google v. Oracle). Instead, a GAI’s most widely applied and widely promoted purpose is artificial “authorship” without authors—a purpose which forecasts myriad negative effects that may prove to dramatically overwhelm any benefits promised by the developers.
Naturally, certain GAIs (e.g., ChatGPT) can be used for various purposes, about which more below, but if the courts are distracted by the sheer novelty, scope, and hype around the “importance” of GAI and, therefore, presume transformativeness, they may be persuaded to articulate a rationale that would be tantamount to a blanket exception for GAI training. If the court adopts this carve-out in the context of fair use factor one, the result would be a reversal of its own reluctance to favor the broad “something new” argument for transformativeness that it so recently rejected in Warhol.
Notably, it is not unprecedented for the court to articulate rationales beyond the four-factor analysis. In the Google Books case, the court found that the search tool provides a “social benefit,” and a similar sentiment was articulated in Google v. Oracle regarding consumer benefit in advancing mobile products. Or looking back at the Betamax VCR case, the concept of “time shifting” the viewing schedule served the public interest by expanding flexibility in the consumption of copyrighted material that was lawfully obtained.
But if the courts look for a rationale beyond the case law (e.g., a clear social benefit of GAI), not only will they be making a wild guess, but any conclusion in favor of the developers will probably be wrong—perhaps dangerously so. While it is understandable that the courts may be reluctant to hobble technological development in principle, the available facts militate against disturbing fair use jurisprudence for the sake of nurturing GAI in general.
Put differently, if the courts are going to take a wait-and-see approach, there is ample evidence that GAIs already cause harm to individuals—from CSAM and defamation to cheating and psychological issues—to say nothing of the well-founded anxieties—social, political, economic, and environmental—associated with this multi-trillion-dollar gamble being played by the same people who unrepentantly accrued wealth and power from the darkest results of Web 2.0.
GAI as a Tool for Creators
To the extent that a given GAI product may be considered a tool for producing creative works, a fair use holding should at least find that the tool “promotes the progress” of authorship with respect to copyright’s purpose. But this is difficult because the same GAI in the hands of one skilled creator offers little insight about its ultimate purpose in the hands of 100-million unskilled users.
At the positive end of considering GAI’s purpose, my friend David Bolinsky, a medical illustrator and animator, recently made a series of 8 dozen topically and stylistically distinct ten-second animations, introducing speakers and segment topics for a scientific conference—a daunting assignment. GAI collapsed well over a year of work (if using his standard 3D animation tools) into a matter of weeks. He was surprised at the breadth and depth of creative latitude GAI enabled. Further, he explained that although these presentations allowed more creativity than his typical discrete medical and scientific educational animations, an amateur lacking his experience still could not have used the same GAI tools to achieve the same results. Consequently, Bolinsky sees GAI as an opportunity to do more and different kinds of work and not as a threat to his creativity or livelihood.
In this example, the technology is socially beneficial and arguably “promoting the progress” of authorship, which may favor a finding that the tool is transformative. That said, due to the human authorship requirement, we are years away from guidance as to the degree of copyright protection on those animations; and if GAI tools are used to produce millions of works that have no “authors” as a matter of law, it is hard to find that this “promotes progress” in regard to copyright’s purpose.
Further, the difficulty for the court in considering fair use is that Bolinsky and his colleagues who specialize in medical work are unique among professional creators, to say nothing of the many millions of non-creator customers that GAI developers need—because they are leveraged into the stratosphere—to make their products profitable. This scale implies an analysis reminiscent of Sony—i.e., a question of whether the purpose of the GAI is substantially beneficial or substantially harmful. But knowing that requires time travel.
If a court could see a few years into the future and find, for instance, that the GAI at issue will be used substantially for nonconsensual pornography, disinformation, and scams, it would presumably decline to find these purposes are social benefits that favor an expansive transformativeness finding. Instead, at the moment, the courts simply have no idea what the true “purposes” are of various GAIs, which is unprecedented in fair use jurisprudence. The VCR, Google Books, Android phones, et al. did not serve materially different purposes years after they were presented to the courts in their respective cases. By contrast, GAIs present an incomplete and dynamic set of facts; and in my view, this alone should militate against finding that factor one favors any of these products.
The Threat to Authorship Itself
As stated in other posts and in comments to the Copyright Office, one unique challenge of GAI is that it poses a potential threat to authorship (i.e., that it will shrink the number of creative workers), which is clearly destructive to the progress clause and copyright law. Although my own view is that a party who poses an existential threat to copyright’s purpose should not be allowed to invoke one of copyright law’s affirmative defenses, I recognize the difficulty in that opinion.
Under U.S. law, copyright protects authors indirectly by protecting certain exclusive rights to use their works. Consequently, there is little foundation for arguing generalized harm to authorship itself, despite the overwhelming recognition that diversity in authorship has benefitted the United States both culturally and economically for almost two centuries. In this context, GAI provokes the question as to whether U.S. policy might shift toward a “moral rights” approach akin to Europe, but that’s a discussion for a different post.
Instead, the general threat to authorship is considered, to an extent, under fair use factor four, which weighs the potential threat to the market value of the works used. The key difficulty, however, is that if the GAI does not output the song “Ordinary” but instead outputs music in the style of Alex Warren, then the output is not, strictly speaking, a threat to the market value of “Ordinary” itself. While proposals like the NO FAKES Act would prohibit unauthorized replication of Warren’s voice, copyright law does not clearly prevent a GAI that makes Warren-like music that could theoretically obviate the need for Warren himself.[1]
For now, several plaintiffs in the roughly 40 active lawsuits against GAI developers have presented evidence of outputs that are substantially similar to the works used in training, and this should disfavor fair use for the GAI developers under factor four. More broadly, plaintiffs in these cases argue that licensing works for the purpose of AI training is itself a market opportunity exclusive to the copyright owner, and therefore, the failure to license constitutes market harm under factor four.
Some courts may be reluctant to agree with the lost licensing opportunity claim, but that reluctance is unfounded—even if a developer successfully prevents its product from outputting copies of works used in training. So long as one of the exclusive copyright rights is implicated (and here, it would be the reproduction right), then a requirement to license exists. Consequently, failure to license, especially at such an extraordinary scale for an unprecedented commercial venture, is unquestionably market harm to the copyright owner.
Even where there may be a close call on factor four, because the GAI developer should lose on factor one, and because factors two and three decidedly favor creator plaintiffs, factor four should not reasonably control in many of these cases. Moreover, the courts should pay scant attention to the claim by developers that the cost of licensing is existentially prohibitive to the development of GAI.
In addition to the fact that this plea is barely tolerable from parties wildly spending billions on high-risk ventures, any claim that a license is “too costly” for any venture is no defense under copyright law. The copyright owner sets the terms for the use of her work, and the prospective user can accept those terms or not before using the work. If that rule applies to the bootstrapping indie filmmaker, surely it applies to Microsoft, Meta, Google, et al.
Conclusion
Fair use is a mixed question of fact and law, and I maintain that what should be most fatal to the developers’ fair use defense is that, like the public, the courts have insufficient facts about the ultimate purpose of GAI products. Just as with Web 2.0 in the late 1990s, we are witnessing unfounded political sentiment to once again let Big Tech do what it wants, preaching to the public that this time, the technology really will “solve the world’s problems.”
Of course, there is no rational basis for that belief beyond the self-interest of the developers and the investors losing billions every year. If past is prologue, Congress will live to regret the folly of allowing AI to run amok, just as Members of both parties now rue the unconditioned immunity of Section 230. In the meantime, while licensing copyrighted works for GAI training will not address all, or most, of the potential hazards of artificial intelligence, the courts should decline to adopt strained fair use rationales in the name of assumed progress that may turn out to be a complete disaster.
[1] I believe there are cultural reasons that militate against this result, but those predictions do not influence the fair use consideration.