Reversal in Thomson Reuters Case May Bode Well For Copyright Owners Against AI


It has already caught the attention of most copyright watchers that Judge Bibas of the District Court for the District of Delaware (3rd Circuit) reversed his own 2023 summary judgment ruling in the copyright AI case Thomson Reuters v. Ross Intelligence. Thomson Reuters, which owns the legal research database Westlaw, sued Ross for copyright infringement after the latter built its competing AI-powered search tool by copying over 2,000 headnotes from Westlaw. Headnotes contain summaries that the court found sufficiently original for copyright protection, and it also found that the material is protected under the doctrine of “selection and arrangement.”

Judge Bibas found copyright infringement of the headnotes and held that Ross’s defenses, including fair use, all failed. It is the fair use ruling that may be predictive of outcomes in other cases alleging copyright infringement for the purpose of AI training. Notably, Judge Bibas held that fair use factors one and four favored Thomson, and that Thomson prevails overall on fair use. To review, my amended summaries of the fair use factors are:

  • The purpose of the use, including whether the use is commercial.
  • The nature of the work used (i.e., whether it is more factual or creative).
  • The amount of the work used, including whether the “heart” of the work was used.
  • The potential market harm to the work used, namely whether the use substitutes for a use that the copyright owner retains the exclusive right to exploit in the market.

In Thomson, it is compelling that the court finds factors one and four go to plaintiff and that these carry the fair use finding overall when factors two and three go to defendant Ross. I say this because in other AI cases involving ingestion of entire visual, musical, and literary works, factors two and three will surely go to plaintiffs, and the AI developers can only hang their hopes on factors one and four.

Under factor one, Judge Bibas held that Ross’s use was clearly commercial and that it serves essentially the same purpose as the works used. Here, the opinion uses language that could, but will not necessarily, benefit other AI developers. It states:

Ross was using Thomson Reuters’s headnotes as AI data to create a legal research tool to compete with Westlaw. It is undisputed that Ross’s AI is not generative AI (AI that writes new content itself). Rather, when a user enters a legal question, Ross spits back relevant judicial opinions that have already been written.  

On the one hand, that parenthetical note that Ross is “not generative” could be cited to argue that generative AIs like Midjourney or Udio merit a finding of transformativeness under factor one. But several of the strongest cases against the developers present similar evidence of “spitting back” copies of the material ingested. Further, as emphasized in the cases against Udio and Suno, two AIs built on ingesting protected sound recordings, plaintiffs also present a strong argument that the GAIs serve the same purpose as the works used and, therefore, the purpose is not transformative.

Where a court finds under factor one that an infringing use serves the “same purpose” as the work used, this will often, quite logically, lead to finding market substitution under factor four. Here, Judge Bibas is forthright in his reversal about his initial instinct to leave factor four as a question of fact to be decided by the jury. Most notably, in my view, he writes…

I worried whether there was a relevant, genuine issue of material fact about whether Thomson Reuters would use its data to train AI tools or sell its headnotes as training data. And I thought a jury ought to sort out “whether the public’s interest is better served by protecting a creator or a copier.”

Those first considerations from 2023 reprise two familiar arguments presented in fair use defenses, which courts have generally found unpersuasive in recent high-profile cases. First, the argument that the plaintiff is not yet in the market being pursued by the defendant has been held erroneous because it fails to properly consider the “potential” market for the protected works. Next, the “public interest” (i.e., for innovation’s sake) argument has been held too broad in major fair use cases, with the exception of Google v. Oracle, which is an outlier for several reasons. Thus, in reversing his thinking, Judge Bibas writes…

Even taking all facts in favor of Ross, it meant to compete with Westlaw by developing a market substitute. And it does not matter whether Thomson Reuters has used the data to train its own legal search tools; the effect on a potential market for AI training data is enough. Ross bears the burden of proof. It has not put forward enough facts to show that these markets do not exist and would not be affected.

Because factor two is generally considered the least important and factor four has long been considered the most important, Judge Bibas rests on that precedent to find that fair use overall favors Thomson. What this decision could signal for many AI developers who have copied millions of creative works to train their models is that the generalized “innovation and important for society” arguments will find slippery footing when they argue fair use.

David Newhoff
David is an author, communications professional, and copyright advocate. After more than 20 years providing creative services and consulting in corporate communications, he shifted his attention to law and policy, beginning with advocacy of copyright and the value of creative professionals to America’s economy, core principles, and culture.
