AI Works Do Not “Compete” with Works of Authorship

Many arguments advocating the view that AI training does not conflict with copyright rights share a common fallacy: that AI outputs represent “competitive” works of the kind copyright law was intended to promote. This error appears in Judge Alsup’s opinion in Bartz et al. v. Anthropic AI, in a report published by AI Progress, and in an amicus brief filed by three law professors in Thomson Reuters v. Ross Intelligence.

The competition fallacy rejects the notion of “market dilution,” which may be a novel, but not unfounded, consideration under factor four of the fair use analysis. Traditionally, the fourth-factor inquiry considers whether the particular use of the work(s) in suit might harm the market value of those works. The question does not ordinarily weigh harm to, say, all sound recordings by virtue of having scraped all sound recordings to produce a machine that makes different sound recordings. Because the dilution principle would strongly disfavor AI developers, its proponents seek to portray the outputs as “competitive” works envisioned by copyright law.

As a threshold principle, although authors may be said to be in “perfect competition” or non-competition with one another, copyright’s purpose is not to promote competition but to promote as much diverse expression as authors may be inspired to create. Notwithstanding the use of AI as a tool of human expression, it is an error to refer to AI outputs in general as “works of expression,” “works of authorship,” or any term of art that seeks to portray purely machine-made outputs as an intended consequence of copyright.

The inapt use of these terms perhaps indicates a hope that courts won’t notice the omission of the human authorship doctrine. But so long as that doctrine is affirmed (and it should be), we should only refer to AI outputs by other terms—choose the pejorative “slop” or the neutral “material” as you wish—in order to place outputs in proper context to copyright law. As argued here several times, if the material at issue is not protected by copyright on the basis that it is not made by a human, then its existence cannot be described as a “work” incentivized by copyright.

Judge Alsup’s Error in Bartz et al. v. Anthropic AI

Although the Bartz case itself is settled and will not be appealed, the “competition” rationale articulated by Judge Alsup will probably be litigated again in one or more of the many active AI training lawsuits. In his opinion, he wrote…

…Authors’ complaint is no different than it would be if they complained that training schoolchildren to write well would result in an explosion of competing works. This is not the kind of competitive or creative displacement that concerns the Copyright Act.

In addition to buying into the anthropomorphic comparison between machine learning and human education, Judge Alsup offered a hypothetical “explosion of competing works” that set off an explosion of criticism, including from Judge Chhabria of the same district, ruling in Kadrey et al. v. Meta. Chhabria’s response states…

…when it comes to market effects, using books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take.

I agree with this critique, though even here I would prefer not to see the word “competing.” Competition is generally creative, whereas market dilution is generally destructive and closer to describing GAI’s effect on works of authorship and on copyright law. In fact, Judge Chhabria opines in Kadrey that, “As for the potentially winning argument—that Meta has copied their works to create a product that will likely flood the market with similar works, causing market dilution—the plaintiffs barely give this issue lip service.” This kind of signal that the market dilution theory has legal foundation is why I believe its critics rely on the competition fallacy.

The Report by AI Progress

The report titled AI Models: Addressing Misconceptions About Training and Copyright, written by Anna Chauvet and Karthik Kumar, PhD, engages in the competition fallacy, albeit in a context I tend to find baffling. I say this because the report first presents an in-depth technical argument as to why AI training does not entail infringing conduct but then devotes equal effort to arguing that model training is fair use.

If this document were a legal response in court, not presenting a fair use defense would likely be malpractice, but as an experts’ report, the fair use discussion casts doubt on the scientific rationale for non-infringement. Where there is truly no basis for infringement, there is no reason to mention fair use. Yet, in rejecting a consideration of market dilution under factor four, the authors of the report reprise the competition fallacy thus:

If a new work does not use protected expression, it does not matter whether it competes in the same genre and market as prior works. An increase in competitive creative works is precisely the growth of creative expression that the Copyright Act was intended to promote.

Notably, the authors rely on traditional fourth-factor jurisprudence in the first sentence but seek to foreclose any consideration of AI’s novelty by mischaracterizing its outputs in the second sentence. The authors err by referring to the mass outputs of a GAI as “creative works” at all, let alone as the type of works intended to be promoted by the Copyright Act. As stated in an earlier post, I believe the courts should recognize that GAI lacks any technological precedent and, therefore, should not hesitate to plow new ground in considering market dilution as a destructive consequence worthy of deep consideration.

Further, it is concerning when any party implies that the AI outputs do not matter in considering whether the training process is fair use. This is nonsensical and inconsistent with case law. The courts absolutely consider the specific utility of technologies that potentially infringe copyright rights, and it is impossible to weigh the purpose or market effect of an AI product without considering its outputs. After all, the outputs are its purpose.

The Professors’ Brief in Thomson Reuters v. Ross

Law professors Brian L. Frye, Jess Miers, and Mateusz Blaszczyk filed a brief in Thomson Reuters v. Ross, principally to argue that the headnotes copied from Westlaw are not properly subjects of copyright. Here, I will set that question aside, and frankly, whether the courts find the headnotes to be sufficiently original for protection is not particularly relevant to the challenges posed by AI.

In the latter part of the brief, though, the professors reprise the competition fallacy, stating, “The problem with the dilution theory is that producing similar, but noninfringing works is precisely the kind of competition copyright is supposed to promote.” Again, this statement is legally correct but factually misleading. If the professors want to argue, as they do, that the Federal Trade Commission et al. err by advancing a market dilution theory based on unfair competition law, perhaps that debate is worth having. But general statements that AI outputs, as non-works of authorship, inherently fulfill the intent of copyright law are flatly wrong. The brief continues…

The Act seeks to promote the creation of original works of authorship, not to protect authors against competition. Indeed, it is axiomatic that the purpose of copyright is to benefit the public by encouraging marginal authors to produce and distribute additional works of authorship.

Copyright does not protect authors against informal competition with one another, but as stated, that has nothing to do with “competing” with machines that output non-works by non-authors. As for the reference to marginal authors, this is both misstated and misguided. First, the Copyright Act is agnostic as to which authors become popular and which ones remain “marginal.” Second, as is always the case, it is the independent authors who are more likely to be marginalized into oblivion by unregulated, unethical, and unlicensed AI products.

There are several briefs filed in Thomson Reuters by many familiar names in anti-copyright circles, and no doubt, they all repeat some variation on the competition fallacy. But copyright law exists to incentivize human beings to devote time, talent, and energy to the production and distribution of creative and informative works. Copyright does not exist to mass-produce material, content, slop, or stuff by any other name that lacks creative expression by humans.

Mistakenly portraying the outputs of GAI as generally “competitive” with works of authorship produces a cascade of doctrinal errors that swirl in eddies of circular logic around the pillar of the fourth fair use factor. The courts should decline to be dragged into that vortex and, as Judge Chhabria at least implied, they should be willing to consider the diluted streams of creativity that can result from wanton use of AI.


Public Knowledge Post on AI & Fair Use Misses the Mark

Patrick Gallaher at Public Knowledge recently posted an article about AI training with protected works, proposing to distinguish between piracy and fair use. Not to begin on a pedantic note, but the article is subtitled “Words Matter” and claims that piracy is a provocative, non-legal term, so I have to respond by saying this is wrong. Although we think of “piracy” today in terms of enterprises like The Pirate Bay, courts have often used the term “piracy” to mean “copyright infringement.” For instance, the seminal fair use case Folsom v. Marsh (1841) uses the word thirteen times, as in this quote:

“….it is as clear, that if he thus cites the most important parts of the work, with a view, not to criticise, but to supersede the use of the original work, and substitute the review for it, such a use will be deemed in law a piracy.”

So, Gallaher is making a semantic fuss over nothing. If a contemporary court holds that AI training with protected works is copyright infringement, then this conduct may both legally and colloquially be called piracy.

As to the substance of the post, Gallaher asserts that AI training is inherently fair use, which is too broad a claim. The fair use doctrine defies generalization, and the facts in one case involving a particular AI and one type of work may have limited influence on the result of a case involving a different AI and different type of work. Or to put that another way, the incomplete fair use inquiry conducted in Bartz v. Anthropic, involving a class of literary works, likely predicts almost nothing about the eventual outcome in UMG et al. v. Udio or Disney et al. v. Midjourney, involving sound recordings and visual works respectively.

Gallaher states that AI training is transformative under fair use factor one (the purpose of the use). Indeed, many articles of this nature rely on the assumption that this finding should be obvious and should carry the weight of the fair use analysis. “Copying for training is transformative: it uses the works for a fundamentally different purpose from the original, much like indexing websites for search engines or scanning books for text analysis,” he writes. And that’s all he writes about one of the most vexing doctrines in fair use as applied to the most challenging technology ever confronted by copyright law.

Of course, even in one sentence, Gallaher manages to hide (or expose) the fact that the purpose of many GAI products is to produce works without authors. That purpose is readily distinguishable from the two analogies he cites and, as the courts will surely recognize, presents a novel challenge to the constitutional intent of copyright law. This is a consistent fallacy in every article of this nature—claiming that AI is the most revolutionary tech in history but that, despite this novelty, we have ample case law to conclude that training is fair use.

Perhaps the courts will not wholly agree with my view that a purpose which does not serve the goals of copyright cannot favor fair use, but in Kadrey v. Meta, Judge Chhabria stated, “Courts can’t stick their heads in the sand to an obvious way that a new technology might severely harm the incentive to create, just because the issue has not come up before.”

Although that sentence prefaces a consideration of market dilution under factor four, the words “harm the incentive to create” allude directly to copyright’s core purpose and, thus, implicate the purpose of the GAI to “create” in lieu of authors. And that goes to the question of transformativeness. So, no, it is not enough to say that a use which serves a different purpose is per se transformative, especially when that different purpose is to do exactly what creators do and, in the process, moot the utility of copyright law.

Notably, Gallaher masks the substitutional purpose of GAI by referring to it in general as technology that serves a “public good” and provides “broad benefits.” The plain fact, though, is that we do not know this to be true. The fact that a product is new, widely adopted, and/or has investors chomping at the bit is not evidence that its purpose is categorically beneficial. Far from it. We are already flooded with AI products causing serious harm, triggering liability claims for negligence and wrongful death, and prompting emotional Senate hearings.

In this regard, I have argued that the courts have no factual basis for even defining the purpose of AI training. Although we should not talk about AI as a monolith, the counterpoint to that principle is that it’s generally the same process ingesting the same creative works, whether the AI product is used for scientific research, military applications, medical diagnosis, CSAM, social engineering attacks, or addicting children to establish dangerous “friendships” with machines.

Even if the courts are unwilling to apply such a broad sweep of uncertainty in a copyright context, it is sufficient to say that we have little reason to assume that AI is generally beneficial in the world of creative and cultural production. And whether or not the folks at Public Knowledge know it, the courts are at liberty to look beyond the four factors in weighing fair use, especially when they are presented with considerations that have little or no precedent.

It is important to keep in mind that on fair use factor one, the often unwieldy transformative doctrine splits into two distinct branches of case law. The traditional purpose of fair use, dating back to English courts, is to allow new creative expression to flourish, particularly expression that comments upon the work being used. Fair use cases of this nature most often address one user of one work for one clear purpose.

The more contemporary branch of factor one considerations entails mass use of protected works for a technological purpose that can strain against the fair use doctrine. Simply put, fair use was not developed or codified into statute to provide raw materials for technological products, and as discussed in other posts, when the Second Circuit allowed scanning millions of books for Google Books, it stated that the case “tests the boundaries of fair use.” GAI products, whether used for good or ill, lie well outside those boundaries.

Articles like Gallaher’s are not really making a copyright argument but are instead drawing readers to conclude that copyright owners should be required to subsidize AI development whether they like it or not. Other than assuming that Public Knowledge is still a PR firm for Big Tech, I don’t know why an organization with that name takes such a position when countless parents, educators, artists, lawmakers, and medical experts are insisting upon guardrails and oversight for AI in recognition of social harm already being done. This same sober approach must apply to copyright rights and, at the very least, foster a licensing regime that avoids undermining foundational IP principles.


Finding Fair Use for GAI Training is Highly Problematic

Although I have expressed aspects of these views in several posts over the past couple of years, I will try to consolidate my opinion as to why GAI training with protected creative works is a more problematic fair use consideration than many, even the courts, seem to believe. I acknowledge that even fellow copyright advocates will disagree with some of this analysis, but here it goes:

For the sake of narrowing the focus to the question of whether training generative AI (GAI) with protected works favors a fair use exception, the following assumes that the training requires unlicensed copying of protected expression. Further, even if the GAI maker limits the product’s capacity to output infringing copies, this does not alter the fact that considering fair use for this purpose is, at best, troubling and, at worst, so disturbing to case law that the AI developers are begging the courts to articulate doctrine out of whole cloth.

A GAI’s Purpose is Not Analogous to Past Fair Use Factor One Findings

The courts have largely rejected the overbroad opinion that making “something new” is a sufficient justification for unlicensed use of protected works. Thus, it is difficult to see where any court finds an authority to support the argument that making a “creator robot,” however revolutionary its developers proclaim it to be, is a transformative purpose under a factor one analysis.

Typically, a GAI’s purpose neither expresses “critical bearing” on the works used (AWF v. Warhol) nor provides information about the works to human readers (Authors Guild v. Google) nor fosters interoperability in computer devices (Google v. Oracle). Instead, a GAI’s most widely applied and widely promoted purpose is artificial “authorship” without authors—a purpose which forecasts myriad negative effects that may prove to dramatically overwhelm any benefits promised by the developers.

Naturally, certain GAIs (e.g., ChatGPT) can be used for various purposes, about which more below, but if the courts are distracted by the sheer novelty, scope, and hype around the “importance” of GAI and, therefore, presume transformativeness, they may be persuaded to articulate a rationale that would be tantamount to a blanket exception for GAI training. If the courts adopt this carve-out in the context of fair use factor one, the result would be a reversal of their own reluctance to favor the broad “something new” argument for transformativeness, so recently rejected in Warhol.

Notably, it is not unprecedented for courts to articulate rationales beyond the four-factor analysis. In the Google Books case, the court found that the search tool provides a “social benefit,” and a similar sentiment was articulated in Google v. Oracle regarding consumer benefit in advancing mobile products. Or looking back at the Betamax VCR case, the concept of “time shifting” the viewing schedule served the public interest by expanding flexibility in the consumption of copyrighted material that was lawfully obtained.

But if the courts look for a rationale beyond the case law (e.g., a clear social benefit of GAI), not only will they be making a wild guess, but any conclusion in favor of the developers will probably be wrong—perhaps dangerously so. While it is understandable that the courts may be reluctant to hobble technological development in principle, the available facts militate against disturbing fair use jurisprudence for the sake of nurturing GAI in general.

Put differently, if the courts are going to take a wait-and-see approach, there is ample evidence that GAIs already cause harm to individuals—from CSAM and defamation to cheating and psychological issues—to say nothing of the well-founded anxieties—social, political, economic, and environmental—associated with this multi-trillion-dollar gamble being played by the same people who unrepentantly accrued wealth and power from the darkest results of Web 2.0.

GAI as a Tool for Creators

To the extent that a given GAI product may be considered a tool for producing creative works, a fair use holding should at least find that the tool “promotes the progress” of authorship with respect to copyright’s purpose. But this is difficult because the same GAI in the hands of one skilled creator offers little insight into its ultimate purpose in the hands of 100 million unskilled users.

At the positive end of considering GAI’s purpose, my friend David Bolinsky, a medical illustrator and animator, recently made a series of eight dozen topically and stylistically distinct ten-second animations introducing speakers and segment topics for a scientific conference, a daunting assignment. GAI collapsed well over a year of work (if using his standard 3D animation tools) into a matter of weeks. He was surprised at the breadth and depth of creative latitude GAI enabled. Further, he explained that although these presentations allowed more creativity than his typical discrete medical and scientific educational animations, an amateur lacking his experience still could not have used the same GAI tools to achieve the same results. Consequently, Bolinsky sees GAI as an opportunity to do more and different kinds of work and not as a threat to his creativity or livelihood.

In this example, the technology is socially beneficial and arguably “promoting the progress” of authorship, which may favor a finding that the tool is transformative. That said, due to the human authorship requirement, we are years away from guidance as to the degree of copyright protection on those animations; and if GAI tools are used to produce millions of works that have no “authors” as a matter of law, it is contradictory to find that this “promotes progress” in regard to copyright’s purpose.

Further, the difficulty for the court in considering fair use is that Bolinsky and his colleagues who specialize in medical work are unique among professional creators, to say nothing of the many millions of non-creator customers that GAI developers need—because they are leveraged into the stratosphere—to make their products profitable. This scale implies an analysis reminiscent of Sony—i.e., a question of whether the purpose of the GAI is substantially beneficial or substantially harmful. But knowing that requires time travel.

If a court could see a few years into the future and find, for instance, that the GAI at issue will be used substantially for nonconsensual pornography, disinformation, and scams, it would presumably decline to find these purposes are social benefits that favor an expansive transformativeness finding. Instead, at the moment, the courts simply have no idea what the true “purposes” are of various GAIs, which is unprecedented in fair use jurisprudence. The VTR, Google Books, Android phones, et al. did not serve materially different purposes years after they were presented to the courts in their respective cases. By contrast, GAIs present an incomplete and dynamic set of facts; and in my view, this alone should militate against finding that factor one favors any of these products.

The Threat to Authorship Itself

As stated in other posts and in comments to the Copyright Office, one unique challenge of GAI is that it poses a potential threat to authorship (i.e., that it will shrink the number of creative workers), which is clearly destructive to the progress clause and copyright law. Although my own view is that a party who poses an existential threat to copyright’s purpose should not be allowed to invoke one of copyright law’s affirmative defenses, I recognize the difficulty in that opinion.

Under U.S. law, copyright protects authors indirectly by protecting certain exclusive rights to use their works. Consequently, there is little foundation for arguing generalized harm to authorship itself, despite the overwhelming recognition that diversity in authorship has benefitted the United States both culturally and economically for almost two centuries. In this context, GAI provokes the question as to whether U.S. policy might shift toward a “moral rights” approach akin to Europe, but that’s a discussion for a different post.

Instead, the general threat to authorship is considered, to an extent, under fair use factor four, which weighs the potential threat to the market value of the works used. The key difficulty, however, is that if the GAI does not output the song “Ordinary” but instead outputs music in the style of Alex Warren, then the output is not, strictly speaking, a threat to the market value of “Ordinary” itself. While proposals like the NO FAKES Act would prohibit unauthorized replication of Warren’s voice, copyright law does not clearly prevent a GAI that makes Warren-like music that could theoretically obviate the need for Warren himself.[1]

For now, several plaintiffs in the roughly 40 active lawsuits against GAI developers have presented evidence of outputs that are substantially similar to the works used in training, and this should disfavor fair use for the GAI developers under factor four. More broadly, plaintiffs in these cases argue that licensing works for the purpose of AI training is itself a market opportunity exclusive to the copyright owner, and therefore, the failure to license constitutes market harm under factor four.

Some courts may be reluctant to agree with the lost licensing opportunity claim, but that reluctance is unfounded—even if a developer successfully prevents its product from outputting copies of works used in training. So long as one of the exclusive copyright rights is implicated (and here, it would be the reproduction right), then a requirement to license exists. Consequently, failure to license, especially at such an extraordinary scale for an unprecedented commercial venture, is unquestionably market harm to the copyright owner.

Even where there may be a close call on factor four, because the GAI developer should lose on factor one, and because factors two and three decidedly favor creator plaintiffs, factor four should not reasonably control in many of these cases. Moreover, the courts should pay scant attention to the claim by developers that the cost of licensing is existentially prohibitive to the development of GAI. In addition to the fact that this plea is barely tolerable from parties wildly spending billions on high-risk ventures, any claim that a license is “too costly” for any venture is no defense under copyright law. The copyright owner sets the terms for the use of her work, and the prospective user can accept those terms or not before using the work. If that rule applies to the bootstrapping indie filmmaker, surely it applies to Microsoft, Meta, Google, et al.

Conclusion

Fair use is a mixed question of fact and law, and I maintain that what should be most fatal to the developers’ fair use defense is that, like the public, the courts have insufficient facts about the ultimate purpose of GAI products. Just as with the emerging internet of the late 1990s, we are witnessing unfounded political sentiment to once again let Big Tech do what it wants, preaching to the public that this time, the technology really will “solve the world’s problems.”

Of course, there is no rational basis for that belief beyond the self-interest of the developers and the investors losing billions every year. If past is prologue, Congress will live to regret the folly of allowing AI to run amok, just as Members of both parties now rue the unconditional immunity of Section 230. In the meantime, while licensing copyrighted works for GAI training will not address all, or most, of the potential hazards of artificial intelligence, the courts should decline to adopt strained fair use rationales in the name of assumed progress that may turn out to be a complete disaster.


[1] I believe there are cultural reasons that militate against this result, but those predictions do not influence the fair use consideration.