AI Works Do Not “Compete” with Works of Authorship

Many arguments advocating the view that AI training does not conflict with copyright rights share a common fallacy, namely that AI outputs represent “competitive” works that copyright law was intended to promote. This error appears in Judge Alsup’s opinion in Bartz et al. v. Anthropic AI, in a report published by AI Progress, and in an amicus brief filed by three law professors in Thomson Reuters v. Ross Intelligence.

The competition fallacy rejects the notion of “market dilution,” which may be a novel, but not unfounded, consideration under factor four of the fair use analysis. Traditionally, the fourth factor inquiry considers whether the particular use might harm the market value of the work or works in suit. The question does not ordinarily weigh harm to, say, all sound recordings by virtue of having scraped all sound recordings to produce a machine that makes different sound recordings. Because the dilution principle would strongly disfavor AI developers, their advocates seek to portray the outputs as “competitive” works envisioned by copyright law.

As a threshold principle, although authors may be said to be in “perfect competition” or non-competition with one another, copyright’s purpose is not to promote competition but to promote as much diverse expression as authors may be inspired to create. Notwithstanding the use of AI as tools of human expression, it is an error to refer to AI outputs in general as “works of expression,” “works of authorship,” or any term of art that seeks to portray purely machine-made outputs as an intended consequence of copyright.

The inapt use of these terms perhaps indicates a hope that courts won’t notice the omission of the human authorship doctrine. But so long as that doctrine is affirmed (and it should be), we should only refer to AI outputs by other terms—choose the pejorative “slop” or the neutral “material” as you wish—in order to place outputs in proper context to copyright law. As argued here several times, if the material at issue is not protected by copyright on the basis that it is not made by a human, then its existence cannot be described as a “work” incentivized by copyright.

Judge Alsup’s Error in Bartz et al. v. Anthropic AI

Although the Bartz case itself is settled and will not be appealed, the reference to “competition” made by Judge Alsup will probably be litigated again in one or more of the many active AI training lawsuits. In his opinion, he wrote…

…Authors’ complaint is no different than it would be if they complained that training schoolchildren to write well would result in an explosion of competing works. This is not the kind of competitive or creative displacement that concerns the Copyright Act.

Judge Alsup’s hypothetical not only buys into the anthropomorphic comparison between machine learning and human education; its “explosion of competing works” also set off an explosion of criticism, including from Judge Chhabria of the same circuit, ruling in Kadrey et al. v. Meta. His response states…

…when it comes to market effects, using books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take.

I agree with this critique though, even here, would prefer not to see the word “competing.” Competition is generally creative whereas market dilution is generally destructive and closer to describing GAI’s effect on works of authorship and on copyright law. In fact, Judge Chhabria opines in Kadrey that, “As for the potentially winning argument—that Meta has copied their works to create a product that will likely flood the market with similar works, causing market dilution—the plaintiffs barely give this issue lip service.” This kind of signal that the market dilution theory has legal foundation is why I believe its critics rely on the competition fallacy.

The Report by AI Progress

The report titled AI Models: Addressing Misconceptions About Training and Copyright, written by Anna Chauvet and Karthik Kumar, PhD, engages in the competition fallacy, albeit in a context I tend to find baffling. I say this because the report first presents an in-depth technical argument as to why AI training does not entail infringing conduct but then devotes equal effort arguing that model training is fair use.

If this document were a legal response in court, not presenting a fair use defense would likely be malpractice, but as an experts’ report, the fair use discussion casts doubt on the scientific rationale for non-infringement. Where there is truly no basis for infringement, there is no reason to mention fair use. Yet, in rejecting a consideration of market dilution under factor four, the authors of the report reprise the competition fallacy thus:

If a new work does not use protected expression, it does not matter whether it competes in the same genre and market as prior works. An increase in competitive creative works is precisely the growth of creative expression that the Copyright Act was intended to promote.

Notably, the authors rely on traditional fourth factor jurisprudence in the first sentence but seek to foreclose any consideration of AI’s novelty by mischaracterizing its outputs in the second sentence. The authors err by referring to the mass outputs of a GAI as “creative works” at all, let alone as the type of works intended to be promoted by the Copyright Act. As stated in an earlier post, I believe the courts should recognize that GAI lacks any technological precedent and, therefore, should not demur to plow new ground in considering market dilution as a destructive consequence worthy of deep consideration.

Further, it is concerning when any party implies that the AI outputs do not matter in considering whether the training process is fair use. This is nonsensical and inconsistent with case law. The courts absolutely consider the specific utility of technologies that potentially infringe copyright rights, and it is impossible to weigh the purpose or market effect of an AI product without considering its outputs. After all, the outputs are its purpose.

The Professors’ Brief in Thomson Reuters v. Ross

Law professors Brian L. Frye, Jess Miers, and Mateusz Blaszczyk filed a brief in Thomson Reuters v. Ross, principally to argue that the headnotes copied from Westlaw are not properly subjects of copyright. Here, I will set that question aside, and frankly, whether the courts find the headnotes to be sufficiently original for protection is not particularly relevant to the challenges posed by AI.

In the latter part of the brief, though, the professors reprise the competition fallacy, stating, “The problem with the dilution theory is that producing similar, but noninfringing works is precisely the kind of competition copyright is supposed to promote.” Again, this statement is legally correct but factually misleading. If the professors want to argue, as they do, that the Federal Trade Commission et al. err by advancing a market dilution theory based on unfair competition law, perhaps that debate is worth having. But general statements that AI outputs, as non-works of authorship, inherently fulfill the intent of copyright law are flatly wrong. The brief continues…

The Act seeks to promote the creation of original works of authorship, not to protect authors against competition. Indeed, it is axiomatic that the purpose of copyright is to benefit the public by encouraging marginal authors to produce and distribute additional works of authorship.

Copyright does not protect authors against informal competition with one another, but as stated, that has nothing to do with “competing” with machines that output non-works by non-authors. As for the reference to marginal authors, this is both misstated and misguided. First, the Copyright Act is agnostic as to which authors become popular and which ones remain “marginal.” Second, as is always the case, it is the independent authors who are more likely to be marginalized into oblivion by unregulated, unethical, and unlicensed AI products.

There are several briefs filed in Thomson Reuters by many familiar names in anti-copyright circles, and no doubt, they all repeat some variation on the competition fallacy. But copyright law exists to incentivize human beings to devote time, talent, and energy to the production and distribution of creative and informative works. Copyright does not exist to mass-produce material, content, slop, or stuff by any other name that lacks creative expression by humans.

Mistakenly portraying the outputs of GAI as generally “competitive” with works of authorship produces a cascade of doctrinal errors that swirl in eddies of circular logic around the pillar of the fourth fair use factor. The courts should decline to be dragged into that vortex and, as Judge Chhabria at least implied, they should be willing to consider the diluted streams of creativity that can result from wanton use of AI.


Photo by Fizkes

Google v. Oracle XIII: SCOTUS Should Be Skeptical of the “Sky Will Fall” Argument

I realize the Court will rule anytime now, and that I may be gilding the proverbial lily here; but I drafted this post in early January, and then the world got a little crazy and distracting. Anyway, FWIW, below is my last observation about Google v. Oracle. At least until after the decision. 🙂


In Google v. Oracle, the Supreme Court will render opinions on two legal arguments, either of which could have profound effects for different interests. The Court’s opinion on the copyrightability of Oracle’s “declaring code” will, in one way or another, be felt throughout the software industry, while the Court’s opinion on fair use will affect the entire ecosystem of creators in every category of copyrighted works.

As discussed in other posts, the Supreme Court should reject Google’s attempt to hyperextend the purpose and character of fair use, and in doing so, it should unanimously decline to transform modern copyright doctrine from the bench. In earlier posts, I also discussed why the Court should reject Google’s claim that the code at issue is uncopyrightable under the “merger doctrine” (§102(b)), a holding that would have to affirm that the code at issue is a method rather than a form of expression. Nevertheless, the Court may feel hesitant to “upend the software industry” if it is persuaded that finding copyrightability in Oracle’s code might have this result.

The most compelling argument in this regard is presented in the amicus brief filed by eighty-three computer scientists, who include some of the most renowned names in software development over the last half century. It is hardly sensible for most of us, and certainly not for me, to debate that industry’s conduct with the likes of Steve Wozniak et al. If these experts say that “reimplementation” of software interfaces (APIs) is standard practice that the software industry has relied upon for decades, that statement must be given both deference and weight.

At the same time, we must keep in mind that “reimplementation” is not barred by copyright—that in fact much of the “open source” copying in that industry is bound by various conditions, which are defined by licensing agreements that are only enforceable under copyright law. In that regard, Java is a classic example of code that offers different tiers of licensing where, for instance, the educator may access all of Java for free, while the commercial user is subject to fees and other conditions. There is nothing remarkable or inherently stifling about these distinctions.

More specifically, as a question of law, even if we accept the computer scientists’ broad description of industry-wide reimplementation as fact, it tells us nothing about whether there is sufficient creativity in Oracle’s declaring code to qualify for copyright protection. In reviewing the various briefs filed by experts on both sides of this case, it seems clear that some declaring code is quite simple, and some is very complex—and creativity, presumably, expands with complexity. Further, there does not appear to be much if any quarrel with the premise that declaring code can be highly creative—easily creative enough for copyright to attach—and if that is correct, that should be the ballgame as a legal matter, regardless of industry practice and expectations. And Google has conceded that Oracle’s declaring code is creative.

This does not mean, however, that the Court will be wholly unsympathetic to the “standard practice” argument, or eager to disturb an entire industry if it believes this could be a consequence of its decision. So, let’s consider the argument a bit further, assuming the computer scientist amici are absolutely right on key facts, but perhaps a shade over-saturated in coloring their picture of the broader landscape relative to Google v. Oracle. For instance, I would pay attention to language in the brief that makes statements like, “Android is the most popular [mobile OS] in the world,” which is presented more than once in defense of Google’s reimplementation of the Java APIs to ultimately “transform” the mobile market.

That sentence caught my attention because the word popular implies consumer choice, which is in fact very limited in the mobile market. If the consumer is a dedicated Apple user, those phones are quite expensive. Alternatively, if the consumer needs a more affordable mobile device, she can choose among different phones that are nearly all running on one OS called Android.* And Android was not made freely and widely available as a gesture of Google’s largesse, or for the purpose of fostering competition of any kind.

While Google seeks to frame its free mobile OS as both generous and revolutionary, consumers have largely come to understand the digital-age axiom that if you’re not the customer, you’re the product. Google no more gives away Android “for free” than it does any of its other platforms. Consumers and various government agencies investigating antitrust practices fully recognize that the price of “free” has been to allow companies like Google to accumulate and manipulate data that is then used to alter consumer behavior, stifle small business in various markets, generate advertising revenue from the exploitation of often-questionable content, and, above all, to solidify their own market dominance.

This is not to say that if Google had licensed the Java code at issue, it would not still be the leading supplier in the mobile market—but that’s part of my point. The reason I homed in on this fallacy of Android’s “popularity” is that it informs a response to the claim in the computer scientists’ brief, which argues that “Uncopyrightable software interfaces address network effect barriers by enabling startups to plug into existing systems and innovate through cumulative improvements.” [Emphasis added]

While that sounds plausible as a generalization, in this particular case, the Court should be mindful that the forces buttressing Android’s market position—especially the network effects—render Google nearly immune to competition from startups. And these forces have little to do with copyright one way or another.

Android is a poor context in which to discuss “addressing network effect barriers.” Google’s market share and wealth make the company the very definition of a “network effect barrier.” As such, it seems equally possible that copyright (i.e. a mandate to license the code) is the only protection that a prospective startup has while attempting to thrive in a market presently conquered by the Googles, Amazons, and Facebooks. So, while a startup may get off the ground by copying some aspects of an already-dominant platform, the weakness Google now asks the Court to write into copyright law would allow Google to turn around and copy the innovative aspects created by the startup, thereby crowding the startup out of business.

So, when the computer scientists’ brief describes competition in the market, it seems that it is often alluding to intramural competition on a technological playing field owned by one or a few prevailing companies. For instance, there may be competition among developers writing apps for the Android platform, but there is no startup, at least not in the American market, that can feasibly challenge Android for a piece of its share in mobile. And if such a startup were to emerge, it seems farfetched to allege that licensing declaring code, for instance, would be the barrier to stifle that prospective venture. Instead, it seems more likely that the barriers to that potential competitor are much more potent market forces that have little to do with copyright law in general, and nothing to do with the copyright questions presented in this case.

Are the Generalizations Instructive?

Quite possibly, the most intriguing segment of the computer scientists’ brief is where it describes how many developers, including Sun Microsystems itself in the development of Java, have reimplemented software interfaces in the process of bringing their products to market. This section presents a very clear portrait of standard industry practice, but it also reprises those two bugaboo questions I’ve asked before: 1) If unlicensed reimplementation has been so standard for so long, why did other commercial developers license Java declaring code for various purposes? and 2) Why did Google itself almost enter into a license with Oracle that it only declined due to interoperability conditions with which it did not wish to comply?

Looking at this narrative as an outsider and giving all parties in the computer expert world their due respect, it is hard not to feel that, amid the generalizations about industry practice and innovation, some details are missing that are intrinsic to this case. Either declaring code is never the subject of copyright OR it is always the subject of copyright, OR some declaring code is properly protected while other declaring code is properly not protected. This last conclusion would depend upon the amount of originality in the work, just like every other copyright category. And again, there seems to be consensus among all software experts that some declaring code can be highly creative, or as Deputy Solicitor General Malcolm Stewart described at oral arguments:

 …the briefs talk about the practice of copying interfaces or APIs, but those terms are very vague and potentially expansive. And a lot of things that might be called interfaces would be segments of code that are so short that they don’t exhibit necessary creativity, segments of code that are necessary to preserve interoperability. It may be that in particular circumstances, particular interfaces can be copied without authorization, but that’s not a basis for a general rule.

In other words, broad statements about industry practice, no matter how many names sign an amicus brief, can obfuscate the salient details in this case, as well as countless other scenarios in the software universe where reimplementation is ably supported by licensing agreements. This raises one of the real questions at issue, which is who benefits most from the bright-line rule the Court is being asked to make on the copyrightability of computer code: the independent software developer or the entrenched giant? While Google’s computer industry amici ask the Court to imagine how StartupXYZ benefits from copying GiantXYZ’s code, they also ask the Court to ignore the inverse scenario when GiantXYZ copies StartupXYZ’s code. It is easy to forget this when neither party in this lawsuit is a startup, but it is a question that should not be lost in a river of generalities.

Computer Scientist Brief Says Fair Use is Not Enough

Interestingly, the computer scientists’ brief asserts that a finding of fair use for Google would be of insufficient value to the software industry overall because this “would create uncertainty” in the trade. Naturally, a holding that declaring code is simply never protected is far more certain than a narrow finding of fair use in this one case, which would not preempt future litigation over copying the same kind of code. Thus, the computer scientists’ brief confirms that a finding of fair use would only help Google while asserting it would do little for the industry as a whole.

That’s just as well since finding fair use in Google v. Oracle would, I believe, be an error of law that would be holistically detrimental to creators in all industries. The fact that the defendant in this case happens to be directly responsible for evangelizing an extremely broad fair use doctrine, while reaping the financial benefits of widespread online infringement (e.g. on YouTube), is at least an aggravating factor, if not a dispositive one.

Returning to the questionable proposition that “uncopyrightable APIs” necessarily spawn competition and innovation, it is very hard to ignore the background narrative in which mass copyright infringement has been integral to Google’s acquisition of market share in various lines of business, thus producing the mother of all “network effects” such that parent company Alphabet—along with Facebook, Amazon, and Apple—is facing antitrust investigations in multiple countries. Simply put, words like competition are incompatible with Google’s conduct throughout the industry, and its monopolistic presence should at least color how the Court interprets the “standard practice” argument presented in this case.

If the Supreme Court can justly hold, as a matter of law, that the declaring code at issue is uncopyrightable under §102(b), then this is the only basis on which it should arrive at that finding. As for the broader implications for technological innovation, while it is certainly difficult to dismiss an august body of computer scientists, it is equally tough to reconcile the ways in which Android so dramatically belies their premise. Speaking as a consumer who feels pretty damned locked into very limited choices in mobile, I am simply not seeing the benefits of unlicensed reimplementation in this particular example.


*Though Microsoft is a player in mobile, it presently has a very small foothold.