The Courts Should Embrace the Novelty of Generative AI in Copyright Law

Courts can’t stick their heads in the sand to an obvious way that a new technology might severely harm the incentive to create, just because the issue has not come up before. Indeed, it seems likely that market dilution will often cause plaintiffs to decisively win the fourth factor—and thus win the fair use question overall—in cases like this. – Judge Vincent Chhabria, Kadrey et al. v. Meta

In several posts, I have argued that generative AI (GAI) raises novel copyright considerations on the basis that the technology has the potential to harm authorship itself, even where it may not harm specific works of authorship under traditional fair use analysis. GAI is distinguishable from any technology with which copyright law has had to contend, and if the courts are to continue guiding the law to preserve copyright’s foundational principle, the incentive to create, they should recognize and even embrace the invitation to plow some new legal ground.

In the Copyright Office’s third report on artificial intelligence, one section introduces the notion of market dilution and cites several comments, including my own. Naturally, the AI industry rejects the premise that market dilution of all works, or even a certain type of work, is a valid consideration under copyright law. This argument, albeit self-interested, has some merit under traditional fair use analysis. Fair use factor four, which considers whether a specific use potentially threatens the market value of the work(s) in suit, may be narrowly construed to reject the kind of generalized market harm implied by GAI.

But as the quote above reveals, Judge Chhabria in Kadrey et al. v. Meta (not even one of the strongest cases against AI developers) recognizes this technology’s novel capacity to undermine the foundational purpose of copyright law. He also states, “…by training generative AI models with copyrighted works, companies are creating something that often will dramatically undermine the market for those works, and thus dramatically undermine the incentive for human beings to create things the old-fashioned way.”[1] This recognition of far-reaching harm to the “incentive” foundation for copyright addresses an even broader question than the term “market dilution” implies.

A Broader Fourth Factor Analysis

In the Copyright Office’s report, the section called Market Dilution offers guidance for a reading of fair use factor four that is broad enough to address the fact that GAI outputs can harm the overall market for the same kind of works used in training. Here, I would endorse a view that broadens the fourth factor consideration, which traditionally only looks to potential harm to the copyright owner’s exclusive right to exploit the works in suit.

As argued in other posts, and in my comments to the USCO, the courts should place considerable weight on whether the use at issue furthers the purpose of copyright. The comment of mine that the Office chose to highlight states: “[G]enerative AI—if it does not produce market substitutes—primarily represents potential harm to authors and future authorship. . . .[T]he consideration in the context of ‘training’ should be expansive and doctrinal—namely that a potential threat to ‘authorship’ cannot, by definition, ‘promote the progress’ of ‘authorship.’”

I believe that dichotomy, novel to GAI, is precisely what the courts must resolve in order to prevent the technology from swallowing copyright law itself—perhaps especially where a given AI product does not output unlawful copies of works used in training.  The one consideration that rescues GAI products as promoting the purpose of copyright is where they are demonstrably “tools” for creators, but this goes to my overarching argument that the courts likely cannot obtain sufficient facts to discover whether the “tool” is constructive, destructive, or agnostic with regard to copyright’s purpose.

An AI tool used for ideation, for example, may further the purpose of copyright by helping the creator discover a new path from idea to protectable expression, but it is impossible for the court to assume this is the general purpose of the “tool.” The same product might just as easily be used in ways that are destructive to authorship.

For example, the vast majority of AI-generated material produced and distributed will not be copyrightable because copyright attaches only to works of human authorship. Additionally, we are already seeing a large volume of AI “slop” distributed on platforms like Amazon and Spotify, and it is well established that driving traffic to garbage content is a profitable model for those willing to engage in the practice. Although a specific bucket of AI “slop,” when considered in a traditional fourth factor analysis, may not directly compete with any specific works of authorship, the courts should continue to give weight to the undeniable fact that a market flooded with “slop” does not in any sense promote copyright’s purpose and is most likely destructive to that purpose.

This view does not ignore or dismiss the creative and cultural potential of GAI as a means of expression. Many popular videos online are made with (presumably) human-authored scripts combined with AI generated AV material. That the expressions in these works will generally be unprotectable is a valid basis on which to find that the purpose of the AI product does not promote copyright’s purpose. But further, the fact that many of the creators of these works are not incentivized by copyright rights—they are motivated by the opportunity to share ad revenue with the platforms—means that these works, regardless of their qualitative value, live outside the copyright system. As such, works incentivized and enabled by a model other than copyright cannot reasonably be held to further the purpose of copyright.

In my view, these considerations look beyond the typical factor four analysis, and even beyond the ordinary concept of market dilution, to ask a fundamental question:  Can a technology built by mass copyright infringement properly make fair use of works when the product’s ultimate purpose is either destructive or irrelevant to the purpose of copyright law? I don’t think so.

Is Denial of Licensing for AI Training a Market Harm?

A recent post by Copyright Alliance CEO Keith Kupferschmid states that both Judge Alsup in Bartz and Judge Chhabria in Kadrey erred by too hastily concluding that authors are not entitled to license fees for the use of their works in AI training. On that assumption, both judges held that under factor four, the claimants could not show market harm due to the defendants’ failure to license. Kupferschmid writes:

Both judges are incorrect because they ignore the important realities that a robust emerging market for licensing of AI training material already exists. Licensing markets under the fourth factor may only be circular and non-cognizable when the market being considered is a potential licensing market and the judge is trying to determine whether that potential market is too speculative. But when there is an actual market that already exists, the circularity argument has no place and both judges were incorrect to summarily claim the argument is circular. 

Notably, Judge Chhabria, in rejecting the existence of a licensing market for AI training, cites Tresona Multimedia v. Burbank High School, but in addition to Kupferschmid’s point that a licensing market already exists for AI training, I am not sure the court’s reference to Tresona even applies. Judge Chhabria quotes from the opinion thus: “In every fair use case, the ‘plaintiff suffers a loss of a potential market if that potential [market] is defined as the theoretical market for licensing’ the use at issue in the case.” However, the next part of the opinion reads as follows:

…a copyright holder cannot prevent others from entering fair use markets merely ‘by developing or licensing a market for parody, news reporting, educational, or other transformative uses of its own creative work.’ (citation omitted)

This appears to tie the question of whether a licensing market is merely “theoretical” to a finding of whether the purpose of the use is indeed transformative. And although both the Kadrey and Bartz courts found those uses to be transformative, I believe those holdings are so tautological (i.e., lacking proper analysis) as to be ripe for significant challenge. Notably, at issue in Tresona was an educational use of small amounts of musical works, a paradigmatic fair use consideration and one that may be as far from the implications of generative AI as we might imagine. “Further, the Warhol decision calls into question whether fair use cases like Tresona are still good law,” Kupferschmid said to me by email.

The interplay between factors one and four, while inherent to the fair use analysis, reveals a vexing circularity in the context of GAI where the court is persuaded to find that the remarkable nature of the technology is transformative solely because the use appears to serve a “different purpose” than the works used. Such a finding does not fully align with Warhol, and Judge Chhabria’s well-founded instincts about authored works “competing” with voluminous GAI works under factor four cannot be comfortably harmonized with the finding that the AI product serves a different purpose under factor one. Clearly, if the purpose of the input material is to entertain and inform and the purpose of the “competing” output material is to entertain and inform, these are not different purposes.

The important difference, then, is that the input works are human authored, about which copyright law speaks volumes, while the output works are machine made, about which copyright law says almost nothing. In general, GAI no more adds to the productivity of copyright than does the sea steadily eroding stone into an aesthetically pleasing “natural sculpture.” The courts need not attempt to foresee whether GAI will be socially beneficial or harmful but only find that in the context of copyright law there are far more reasons to disfavor fair use than to favor it.


[1] I would have preferred that Judge Chhabria had not used “old-fashioned,” which may be improperly read to mean “outdated” in contrast to AI-generated works.

Not So Fast: Some Oddities in the Anthropic Fair Use Opinion

Headlines flood the feeds announcing that a California District Court sided with AI developer Anthropic, finding that LLM training with unlicensed works is fair use. While the headlines are true, I wouldn’t read the conclusions as gospel just yet. In the big picture, we are going to see a variety of fair use opinions in the more than 40 copyright cases against AI developers, and different facts with different legal arguments—and in different circuits—are likely to yield a mosaic of results. And odds are, there will be a question or two that goes to the Supreme Court. But in the meantime, in ruling on fair use for the defendant in Bartz v. Anthropic, Judge Alsup’s opinion includes some oddities that make it ripe for significant challenges in the Ninth Circuit. Here are the basics:

Anthropic acquired and ingested books in two ways to train the LLM it calls Claude. Initially, it scraped books from pirate libraries—an act the court held to be infringing, and which some legal experts have noted could result in a devastating damage award in a trial on that issue. But more significant as potential guidance to AI training overall, the court held that Anthropic engaged in fair use when it purchased printed books and “destructively scanned” those books to make digital copies it used to train Claude. The author plaintiffs did not allege that the defendant output any infringing copies of books that were fed into the model.

With that, the following oddball aspects of the decision stand out to me. The discussion is not exhaustive, and others’ opinions will vary:

Transformativeness as Near Dicta

In his opinion, Judge Alsup characterizes the purpose of the use as “spectacularly” and “quintessentially” transformative and “among the most transformative we will see in our lifetimes.” Yet, in the 12-page section weighing factor one, the opinion barely discusses a rationale for finding the purpose of the use transformative under any authorities. For instance, the opinion notes that like Google Books, Claude also does not make infringing outputs available, but the opinion does not articulate a cultural, social, intellectual, or other stated purpose of Claude as a rationale for finding it transformative. This apparently self-evident conclusion leaves one to wonder whether the novelty of the technology itself distracted the court from a proper analysis under the fair use doctrine. As Terry Hart writes on Copyhype:

In this context, a use is not transformative merely because it produces something new or technologically sophisticated. Rather, as established through decades of judicial interpretation, a transformative use is one that relates back to the original work by creating new information and insights about that work. Generative AI does not do this. It instead reappropriates the expressive content of the work to enable the generation of synthetic expressive content completely divorced from the original work.

Contradicting Warhol

The court’s apparently tautological holding that Claude is “transformative” on the sole basis that it is a high-tech achievement contradicts recent fair use precedent, including the Supreme Court’s ruling in AWF v. Warhol. As Hart notes, and as discussed in other posts, making “something new,” no matter how cool, is not a sufficient basis for finding that a use is transformative. Because Judge Alsup does not persuasively explain why Anthropic’s purpose is transformative (let alone “spectacularly” so) in the context of copyright law, we might expect this principle to be vigorously argued on appeal.

Odd Citing to First Sale Doctrine

The opinion finds it compelling that Anthropic legally obtained physical copies of books that were destroyed in the process of scanning them to make digital copies. Judge Alsup cites details like bindings being ripped off to emphasize the point that only one copy remained of each book and that the process of converting physical books into digital books served a transformative purpose…

Anthropic purchased its print copies fair and square. With each purchase came entitlement for Anthropic to “dispose[ ]” each copy as it saw fit. 17 U.S.C. § 109(a). So, Anthropic was entitled to keep the copies in its central library for all the ordinary uses.

The statutory first sale rule categorically allows you or me to “sell or otherwise dispose of” a lawfully obtained copy of a book by giving it away, reselling it, burning it, or using it as a doorstop, but it absolutely does not allow copying it for the sake of convenience, storage, or any other purpose. That the court cites §109 as permitting mass copying in a commercial venture like AI training is perplexing. Moreover, the opinion contradicts case law. As the S.D.N.Y. stated in Hachette v. Internet Archive, “IA points to no case authorizing the first recipient of a book to reproduce the entire book without permission….”

Equally perplexing, the court appears to find that the unlicensed copying serves a transformative purpose separate from the overall purpose of LLM training…

Anthropic argues that the central library use was part and parcel of the LLM training use and therefore transformative. This order disagrees. However, this order holds that the mere conversion of a print book to a digital file to save space and enable searchability was transformative for that reason alone.

I cannot think of any authority to support that finding. “Saving space” and “searchability” served internal, operational purposes for Anthropic—purposes that cannot stand alone, but which can only tilt factor one toward fair use if the overall purpose of those operations is transformative. Yet, Judge Alsup appears to say that the overall purpose (despite being “quintessentially” transformative) is immaterial, thereby implying that merely converting print books to digital copies is transformative on its own. If that is what the opinion is saying, it contradicts every court’s rejection of such claims and endorses violation of the derivative works right of authors.

Shoehorning Infringement Considerations into Fair Use

In this and any copyright infringement case, the plaintiff(s) must argue that the protectable “expression” has been copied by the defendant. Here, the court finds the following:

Yes, Claude has outputted grammar, composition, and style that the underlying LLM distilled from thousands of works. But if someone were to read all the modern-day classics because of their exceptional expression, memorize them, and then emulate a blend of their best writing, would that violate the Copyright Act? Of course not.

True but problematic. First, the court should decide whether protected expression was unlawfully copied and then, if so, whether the copying is exempted by fair use. Here, the court appears to collapse these considerations into the factor one analysis by inaptly shifting attention to Claude’s outputs, which are not at issue in this case. Taken in combination with the fact that the opinion offers scant discussion for finding transformativeness overall, this citation to what copyright does not protect (§102) tucked into the fair use analysis suggests that the court is overly sympathetic to the idea that the purpose of generative AI is analogous to human authorship. This is an error, both as a matter of cultural interest and law.

Misreading the Purpose of Copyright Law

Further validating my concern that the court is somewhat distracted by the shiny new technology, the opinion states the following:

…Authors’ complaint is no different than it would be if they complained that training schoolchildren to write well would result in an explosion of competing works. This is not the kind of competitive or creative displacement that concerns the Copyright Act. The Act seeks to advance original works of authorship, not to protect authors against competition. [emphasis added]

As discussed in other posts, that view misunderstands the nature of GAI by assuming it “advances original works of authorship” in competition with “authors.” This is unfounded. The court has absolutely no basis to assume that, merely because Claude is allegedly capable of outputting works that mimic “good writing,” those outputs will be works of “authorship” as a matter of law. On the contrary, the better Claude is at “writing” without a writer, the more it will necessarily output non-human expression that is not protected by the Copyright Act.

It cannot be reasonable for the courts to find that the purpose of our copyright law is to breed mass production of uncopyrightable works or to incentivize the poor author to pose as a great author through the mask of technology. That is a recipe not only for widespread market dilution but for cultural and intellectual dilution in direct conflict with the constitutional rationale for intellectual property law.

Regarding market dilution, Judge Chhabria, in his recent finding for Meta in the Kadrey case, criticized Judge Alsup’s inapt comparison to “training schoolchildren” and his understatement of the potential market harm of GAI. Market dilution is a novel consideration discussed in the Copyright Office’s third report on AI, and it will be the subject of a new post. In the meantime, buckle up. The AI fair use ride is just getting started.

D.C. Event Shines Light on Advertisers Supporting Social Media Harm to Children

When I was a kid in the 1970s and my father was a principal in an ad agency, they had the Ameritone paint account, and I remember him explaining that they were not allowed to show paint and food together in a commercial lest a child viewer be confused into thinking that paint might be edible. By contrast, a social media platform today is free to conflate child-focused material with illegal drug offers and numerous other conduits leading to serious harm or death. And it’s all swept under the rug of innovation and commerce.

Algorithms kill kids. Let’s just call it like it is at this point and stop pussyfooting around the rhetoric that social media platforms are neutral platforms for “information.” Never mind that information itself is almost a lost cause on social media; algorithmic manipulation, even simple recommendations, can have disastrous effects for children and teens, including depression, anxiety, suicide, and accidental death. And that was before AI.

As reported last September, the accidental death of Nylah Anderson, age 10, was the result of TikTok’s algorithm prompting her to try the “blackout challenge,” which entails making a “game” of self-asphyxiation. In the case against TikTok for its role in leading Anderson toward the “blackout challenge,” the Third Circuit Court of Appeals articulated one of the few rational reads of the Section 230 liability shield. The court stated:

TikTok reads § 230…to permit casual indifference to the death of a ten-year-old girl. It is a position that has become popular among a host of purveyors of pornography, self-mutilation, and exploitation, one that smuggles constitutional conceptions of a “free trade in ideas” into a digital “cauldron of illicit loves” that leap and boil with no oversight, no accountability, no remedy.

Brought to You by Your Favorite Brands

Add to that cauldron the major brands whose advertising dollars unconditionally support social platforms, and that was the focus of this morning’s event held at the National Press Club. “We saw a great turnout,” says cyber-analyst Eric Feinberg, who has been engaged on ad-supported toxic social media content since 2013. More than 40 attendees filled the 40-seat room for the kick-off event designed to focus the attention of major brands on the fact that their ad dollars finance platform operations that cause serious harm and death to children and teens.

The event was organized and hosted by parents who have been working to turn personal tragedy into social change through both public policy and private action. For instance, one mother who spoke was Debra Schmill, who started the Becca Schmill Foundation after losing her daughter Rebecca to fentanyl poisoning from pills obtained with the “help” of social media. Becca’s death was the culmination of a cascade of terrible events intersecting social platforms, beginning with a rape at the age of 15 that was followed by cyber-bullying and the consequent battle with depression that led to the fatal pills obtained online. Deb Schmill is one of many parents determined to prevent other children and families from suffering similar fates.

“Women make 70% to 80% of all purchasing decisions,” Feinberg explained to me by phone after the event, “and these mothers who spoke today recognize that mothers just like them are funding social media harm to their own children.” Posting his daily mantra that “Brands are buying while kids are dying,” Feinberg has recently taken swings at McDonald’s for its crossover promotion with Snapchat…

He makes a solid point. If a major brand overtly promoted the opportunity for kids to get closer to the local drug dealer, pimp, or sexual predator, parents would be outraged. But because social media is an insidious free-for-all, inhabited by good and bad actors, the worst vices are either overlooked or accepted as the cost of obtaining the virtues. But this is a false choice. Multiple defectors from these companies have made clear that the platforms bend their own rules and tweak their algorithms to promote anything that drives “engagement,” without regard to the consequences. And they assume the mainstream advertisers will keep paying without condition because they own all that engagement.

But as Meta whistleblower Sarah Wynn-Williams describes in her book Careless People, that company made an affirmative decision to target known teenage psychological vulnerabilities (e.g., body image) to promote certain products. This abuse of the technology is already unethical—a far cry from not showing paint and food on the same screen—and advertisers who knowingly exploit the “opportunity” should be held accountable by consumers. Meanwhile, as the organizers of today’s event strive to emphasize, that same algorithm exploiting the teen’s vulnerabilities will just as readily push dangerous drugs toward the child as promote a makeup product or gym membership.

By my lights, asking the advertisers to partner with their own consumers—the parents who buy their products—to pressure the platforms to adopt better practices is the very least they can do. In just a couple of months, it will be time for the ~$40 billion Back-to-School season, and as brands vie for the K-12 parents who make those purchases, they owe it to those families to pressure the digital-age media companies to stop killing kids.