Why Machine Training AI with Protected Works is Not Fair Use

As most copyright watchers already know, two lawsuits were filed at the start of the new year against AI visual works companies. In the U.S., a class-action was filed by visual artists against DeviantArt, Midjourney, and Stability AI; and in the UK, Getty Images is suing Stability AI. Both cases allege infringing use of large volumes of protected works fed into the systems to “train” the algorithms. Regardless of how these two lawsuits might unfold, I want to address the broad defense, already being argued in the blogosphere, that training generative AIs with volumes of protected works is fair use. I don’t think so.

Copyright advocates, skeptics, and even outright antagonists generally agree that the fair use exception, correctly applied, supports the broad aim of copyright law to promote more creative work. In the language of the Constitution, copyright “promotes the progress of science,” but a more accurate, modern description would be that copyright promotes new “authorship” because we do not tend to describe literature, visual arts, music, etc. as “science.”

The fair use doctrine, codified in the federal statute in 1976, originated as judge-made law, and from the seminal Folsom v. Marsh to the contemporary AWF v. Goldsmith, the courts have restated, in one way or another, their responsibility to balance the first author’s exclusive rights with a follow-on author’s interest in creating new expression. And as a matter of general principle, it is held that the public benefits from this balancing act because the result is a more diverse market of creative and cultural works.

Fair use defenses are case-by-case considerations and while there may be specific instances in which an AI purpose may be fair use, there are no blanket exceptions. More broadly, though, if the underlying goal of copyright’s exclusive rights and the fair use exception is to promote new “authorship,” this is doctrinally fatal to the proposal that training AIs on volumes of protected works favors a finding of fair use. Even if a court holds that other limiting doctrines render this activity by certain defendants to be non-infringing, a fair use defense should be rejected at summary judgment—at least for the current state of the technology, in which the schematic encompassing AI machine, AI developer, and AI user does nothing to promote new “authorship” as a matter of law.

The definition of “author” in U.S. copyright law means “human author,” and there are no exceptions to this anywhere in our history. The mere existence of a work we might describe as “creative” is not evidence of an author/owner of that work unless there is a valid nexus between a human’s vision and the resulting work fixed in a tangible medium. If you find an anonymous work of art on the street, absent further research, it has no legal author who can assert a claim of copyright in the work that would hold up in any court. And this hypothetical emphasizes the point that the legal meaning of “author” is more rigorous than the philosophical view that art without humans is oxymoronic. (Although it is plausible to find authorship in a work that combines human creativity with AI, I address that subject below.)

As a matter of law, the AI machine itself is disqualified as an “author” full stop. And although the AI owner/developer and AI user/customer are presumably both human, neither is defensibly an “author” of the expressions output by the AI. At least with the current state of technologies making headlines, nowhere in the process (from training the AI, to developing the algorithm, to entering prompts into the system) is there an essential link between those contributions and the individual expressions output by the machine. Consequently, nothing about the process of ingesting protected works to develop these systems in the first place can plausibly claim to serve the purpose of promoting new “authorship.”

But What About the Google Books Case?

Indeed. In the fair use defenses AI developers will present, we should expect to see them lean substantially on the holding in Authors Guild v. Google (the Google Books case), a decision which arguably exceeds the purpose of fair use to promote new authorship. The Second Circuit, while acknowledging that it was pushing the boundaries of fair use, found the Google Books tool to be “transformative” for its novel utility in presenting snippets of books; and because that utility necessitates scanning whole books into its database, a defendant AI developer will presumably want to make the comparison. But a fair use defense applied to training AIs with volumes of protected works should fail, even under the highly utilitarian holding in Google Books.

While people of good intent can debate the legal merits of that decision, the utility of the Google Books search engine does broadly serve the interest of new authorship with a useful research tool—one I have used many times myself. Google Books provides a new means by which one author may research the works of another author, and this is immediately distinguishable from the generative AI which may be trained to “write books” without authors. Thus, not only does the generative AI fail to promote authorship of the individual works output by the system, but it fails to promote authorship in general.

Although the technology is primitive for the moment, these AIs are expected to “learn” exponentially and grow in complexity such that AIs will presumably compete with or replace at least some human creators in various fields and disciplines. Thus, an enterprise which proposes to diminish the number of working authors, whether intentionally or unintentionally, should only be viewed as devastating to the purpose of copyright law, including the fair use exception.

AI proponents may argue that “democratizing” creativity (i.e., putting these tools in every hand) promotes authorship by making everyone an author. But aside from the cultural vacuum this illusion of more would create, the user prompting the AI has a high burden to prove authorship, and it would really depend on what he is contributing relative to the AI. As mentioned above, some AIs may evolve as tools such that the human in some way “collaborates” with the machine to produce a work of authorship. But this hypothetical points to the reason why fair use is a fact-specific, case-by-case consideration. AI Alpha, which autonomously creates, or creates mostly without human direction, should not benefit from the potential fair use defense of AI Beta, which produces a tool designed to aid, but not replace, human creativity.

Broadly Transformative? Don’t Even Go There

Returning to the constitutional purpose of copyright law to “promote science,” the argument has already been floated as a talking point that training AI systems with protected works promotes computer science in general and is, therefore, “transformative” under fair use factor one for this reason. But this argument should find no purchase in court. To the extent that one of these neural networks might eventually spawn revolutionary utility in medicine or finance etc., it would be unsuitable to ask a court to hold that such voyages of general discovery fit the purpose of copyright, to say nothing of the likelihood that the adventure strays inevitably into patent law. Even the most elastic fair use findings to date reject such a broad defense.

It may be shown that no work(s) output by a particular AI infringes (copies) any of the works that went into its training. It may also be determined that the corpus of works fed into an AI is so rapidly atomized into data that even fleeting “reproduction” is found not to exist, and, thus, the 106(1) right is not infringed. Those questions are going to be raised in court before long, and we shall see where they lead. But to presume fair use as a broad defense for AI “training” is existentially offensive to the purpose of copyright, and perhaps to law in general, because it asks the courts to vest rights in non-humans, which is itself anathema to caselaw in other areas.[1]

It is my oft-stated opinion that creative expression without humans is meaningless as a cultural enterprise, but it is a matter of law to say that copyright is meaningless without “authors” and that there is no such thing as non-human “authors.” For this reason, the argument that training AIs on protected works is inherently fair use should be denied with prejudice.


[1] Cetacean Community v. Bush, holding that animals do not have standing in court, was the basis for rejecting PETA’s complaint against photographer Slater for infringing the copyright of the monkey in the “Monkey Selfie” fiasco.

AWF v. Goldsmith: The Need for a Workable Standard of “Transformativeness”

The Supreme Court on October 12th heard oral arguments in Andy Warhol Foundation (AWF) v. Lynn Goldsmith, and presumably every copyright nerd (pro and con) was listening. In general, I would describe the Court as consistent: all justices focused on the narrow question presented with very little discussion outside those lines. The question, which badly needs an answer, is this: What is “transformativeness” under factor one of the fair use test?

Petitioner AWF’s argument is that a use of a protected work to create a follow-on work that contains “new meaning or message” is sufficient to find not only that factor one tilts toward fair use but that “new meaning and message” should be determinative of outcome in any fair use consideration. The fact that AWF narrowly asks the Court to consider this one question (and insists that the art world depends on its standard) demonstrates how much weight factor one has accumulated in the years since the decision in Campbell (1994).

Whether or not the Court opines more expansively on fair use, I think it is safe to say that it will decide whether factor one requires a more rigorous standard, namely whether a follow-on work must contain at least a modicum of comment on the work being used. While we wait, the copyright skeptics and critics, many of whom have filed briefs in this case, will say that the fair use doctrine itself, and even the speech right, are in peril unless the Court sides with the Warhol Foundation. But this is simply untrue.

AWF’s very broad interpretation of the first factor analysis is certainly what many copyright critics would like to see, but they are asking the Supreme Court to maintain confusion on the question presented—to give a nod of approval to an application of fair use circumscribed by little more than the imagination of the copier and their lawyers. But if the Court rejects this expansive view—if it clarifies the sprawling confusion as to what “transformative” means—fair use as an affirmative defense will remain a vibrant and appropriately balanced aspect of U.S. copyright law.

Comment is at the Core of Fair Use

As argued in the past, it is no coincidence that the preamble of the statute (Section 107) cites exemplary purposes for fair use that imply the presence of some discussion about the work being used. While understood not to be an exhaustive list, “criticism, comment, news reporting, teaching, scholarship, [and] research” are named because the fair use doctrine evolved as judge-made law weighing cases entailing these types of uses. And because “purpose” is not defined beyond that illustrative list in the preamble, it is reasonable to hold that factor one of the four-factor test, which identifies the “purpose” of the use at issue in the very next sentence of the statute, should not become unmoored from the spirit of that preamble. 

Campbell does not stray from this principle. Some may disagree that parody exists in the follow-on work (2 Live Crew’s “Pretty Woman”), but as long as parody was the basis for the finding, the “comment on” requirement was met, albeit in the context of a troublesome term of art called “transformativeness.” Since the popularization of that term in Judge Leval’s paper in 1990, courts and defendants have articulated factor one rationales so broad as to be undefinable. And AWF’s “new meaning and message” standard is precisely that: undefinable as a legal standard.

What is Meaning and Message?

In fact, a precedent case in the same lower court (the Second Circuit) illustrates some of the difficulties with the AWF arguments in this case. In Cariou v. Prince, follow-on artist, Richard Prince, rejected any notion of a “meaning” or “message” in the works he made using photographer Patrick Cariou’s images. What does a court do with AWF’s theory when the defendant himself will not define “meaning and message”? There, the Second Circuit held that some “new expression” was sufficient to find that factor one favored fair use—that comment on the original work is not required.

But tellingly, it was the Second Circuit’s own factor one language in the Cariou decision which provided the district court with a rationale for finding “transformativeness” in the Warhol screens, and which the Second Circuit then reversed stating:

“…the district court appears to have read Cariou as having announced such a rule…that any secondary work is necessarily transformative as a matter of law ‘[i]f looking at the works side-by-side, the secondary work has a different character, a new expression…”

Many legal practitioners have commented on the serpentine reasoning applied by the circuit court in order to square the Warhol and Cariou decisions—and all because “transformativeness” has become a doctrine without clear meaning.

Considering that a defendant artist might present any form of “message”—from baroque to minimalist to Richard Prince’s silence—and that it would be famously unwise for courts to apply a legal standard that turns on the judge or jury’s opinion of artistic merit, the Court should decline to engage in these semantic entanglements.  What is definable and identifiable is whether any comment on the original is present in the new work.  This would provide the factor one analysis with an articulated legal standard that judges and juries are able to apply in a principled way.

Necessity Implies Commentary

Several justices at oral arguments focused on the subject of “necessity,” asking whether it was essential that Warhol use Goldsmith’s photograph to make the “Prince Series” silkscreens. The rational answer to this question in this case is No. It was not necessary, in a legal sense, that Warhol use that particular photograph. Even setting aside the fact that a middleman (Vanity Fair) obtained the photograph from Goldsmith and provided it to Warhol, the necessity question is predicated on the commentary requirement.

Absent commentary on the work used, any alleged “need” for that exact work may be technical or functional but is not defensible as a matter of fair use. For instance, Goldsmith’s photo may be conducive to Warhol’s method because it’s a headshot in front of a plain background, but this kind of “need,” which facilitates the user’s process, is not a proper consideration for courts weighing fair use. Likewise, Richard Prince’s almost haphazard cut-outs of Cariou’s photos for some of his works are more suggestive of “opportunity” than “need,” and my guess is Richard Prince would confirm this assumption.

This view of necessity should focus attention on the question as to whether the expression in the original work has in some way been transformed through commentary upon that expression. By contrast, under AWF’s theory, nearly all uses of, for instance, underlying musical compositions would be fair uses merely by adding new lyrics to famous melodies. This is anathema to fair use doctrine in general and in conflict with Campbell in particular.

AWF’s Theory Did Not Exist in Warhol’s Time

Perhaps it is worth contemplating the legal landscape when Andy Warhol made the “Prince Series.” It was 1984, and most of Warhol’s career was behind him. The current Copyright Act, the first to codify fair use, had only been in effect for about six years, and the seminal application of Judge Leval’s “transformativeness” doctrine was still a decade in the future. It is understood that Warhol both appropriated and licensed photographs for his iconic works, though I doubt anyone could prove that he actively contemplated fair use, let alone considered the doctrine as it has been applied or argued since Campbell.

At present, expansive applications of “transformativeness” have resulted in holdings that treat factor one as the dispositive consideration. Several empirical studies find that a defendant who wins on the “transformative” question is almost certain to prevail on the fair use defense overall,[1] and even where factor one is reasonably the most compelling, this only emphasizes the need for clarification as to the meaning of “transformativeness.”

Far too many decisions, especially in district courts, have placed fair use in direct conflict with the derivative works right, which is little surprise when the word “transformed” appears in Section 101 defining “derivative works.” Further, district courts (e.g., Brammer v. Violent Hues) have applied untenable interpretations of “transformative” in conflict with the most fundamental licensing models under the reproduction right, possibly resulting in needless time and expense on appeals for both parties.

Rather than holding that some evidence of “new expression” almost always carries the day, the other three fair use factors should be given proper consideration by mitigating the apparently mesmerizing effect of “transformativeness.” AWF and its amici’s assumption that its test would create more certainty in the art world is a plea for the status quo in which courts will continue to misapply factor one because “meaning and message” are often undefinable to the point of capriciousness. By rejecting AWF’s over-broad standard, the Court can clarify the vagueness which, since Campbell, has caused unnecessary confusion for rightsholders, users of works, and the courts.


[1] For instance, “Is Transformative Use Eating the World?” – Asay, Clark D. et al.  https://lira.bc.edu/work/ns/5f6a0b59-6497-4457-a063-153dae3cee94


On Posting Fair Use Notices in Creative Projects

Imagine someone getting caught shoplifting while wearing a tee shirt that says: “I have no intention of committing petty larceny.” Right? So, when the store presses charges, the defendant’s attorney is probably not going to say, “But the tee shirt, Your Honor! Did you read the tee shirt?”

It’s not a perfect analogy. But this parable of the absurd is not far from the kind of legal prophylaxis I see attempted when creators place fair use notices on work they made using some amount of somebody else’s protected material. These notices, which are often based on reciting the preamble to Section 107 of the Copyright Act, appear in the credits of videos or printed along with written material or posted on websites. “These materials are used under the doctrine of fair use …”

Yeah. Don’t do that.

For one thing, if your use of a work is arguably a fair use, then posting such a notice does not make that defense any stronger. Alternatively, if your use is not likely a fair use, the notice will be no more relevant in a potential infringement claim against you than the aforementioned “no larceny intended” tee shirt would help the shoplifter.

If anything, posting a fair use notice is calling attention to the fact that you have willfully used a protected work without license and that you have not properly considered fair use (because nobody who understands fair use would post such a notice). Hence, the notice may only beg for scrutiny and invite a letter from the copyright owner’s lawyer.

Clearly, I am not seeking to advise the willful or reckless infringer to help him get away with it. But I have seen enough friends and colleagues post these fair use talismans on their creative projects, and then I cringe and rub the lucky rabbit’s foot on their behalf. So, what should you do when using works in a way that you sincerely believe is exempted by fair use?

If Possible, Get a Fair Use Analysis from Qualified Counsel

Ideally, it is best to get advice from an attorney who specializes in copyright law. They may tell you why a use you think would be fair use isn’t, or they may draft a fair use analysis for your project. These documents are not talismanic either—nothing guarantees that a party won’t try to sue another party—but if the attorney knows her stuff, the fair use analysis gives you a degree of comfort that the matter has been thoroughly considered in the proper light. That document would be central to a first response to a copyright owner who might make a claim against you; and if the analysis is well-founded, it may be sufficient to end the matter right there.

DIY Fair Use Analysis

But if getting a fair use assessment from an attorney is outside the budget—especially for a project that is speculative or non-commercial—you can do a basic analysis yourself by following the four-factor test and answering the questions honestly and literally. I stress literally to recommend not getting distracted by the complexities of high-profile fair use cases and legal theories. That way madness lies. Even for the experts. Keep it simple. What follows is my best attempt to keep it simple, though it is far from perfect.

Factor One – Purpose of the Use

To think about Factor One, ask yourself two questions and be very honest:  1) is your use commercial in that you may derive some marketable value from it, even promotional value? 2) does your new expression comment in some way upon the protected work being used?

If the answer is Yes to commercial and No to comment, you are already on somewhat shaky ground for a fair use defense. But what does comment on mean?

If you’re doing the analysis yourself, be literal. Are you criticizing, parodying, or analyzing the original work in some way, rather than criticizing, parodying, or analyzing something other than the original work? A big question here is whether the original work is doing a lot of the heavy lifting as part of the new expression you are creating. Here are two typical examples:

In the classic song-with-new-lyrics format, the pre-existing song provides a lot of the creative expression and is doing a lot of work for you. But did you select it because it’s a famous song and there was some natural wordplay between the original lyrics and your new lyrics? That would be typical, but unless your new lyrics truly comment on the original song itself, the use is likely outside the fair use exception.

A photo in a “news” context favors fair use if the writer or reporter is commenting on the photo itself, or perhaps the photographer. But if the image is an illustration to accompany a story (e.g., about the subject in the image), this is generally a use requiring a license.

A final note about Factor One: An “educational” use does not mean your blog, however informative it may be. It mainly refers to classroom learning. So, if you are not a teacher or instructor using a work for that purpose, don’t assume “educational” applies simply because you are conveying information.

Factor Two – Nature of the Work Used

This is one of the easier fair use questions, which mainly asks whether the work used is informational or creative—like a non-fiction book versus a novel. This is because copyright protects creative expression, but not facts. So largely fact-based works are more susceptible to fair use. If the work you’re using is informational, your use is more likely fair use; if the work used is creative, your use is less likely a fair use. This does not mean, of course, that you may automatically copy an extraordinary amount of an informational work, which brings us to Factor Three.

Factor Three – Amount of the Original Work Used

After Factor One, the amount used question is probably the next most misapplied concept. There is no set amount of an original work (like 10%) that is automatically fair use. The most important questions to ask yourself here are: 1) did you use the heart of the original work (e.g., the refrain in a song, or the most prominent element(s) in a visual work); and 2) did you use only as much of the original as necessary to fulfil the purpose of your new expression?[1]

So, Factor Three may refer back to Factor One and the question of how much labor the original work is really doing for you within the new expression you are making, and whether you really need to copy so much. Imagine removing the protected work you’re using and consider how your project changes. If the honest answer is that the original work is conveying a lot of the overall expression and you have used a lot of that work and you are not commenting on the original, then fair use may not be on your side.

On the other hand, if you can reasonably say that you are using only a portion of the original work, and especially if your expression in some way comments upon that original work, you are more likely protected by the fair use exception. Note: if you are using a fragment of a work, you may also be protected by the doctrine of de minimis use (lawyer for “just a tiny bit”), and fair use may not need to be considered.

Factor Four – Potential Harm to the Market for the Original

This factor has often been confused with questions like how much the unpaid license would be or how wealthy the copyright owner already is or how long the original work has been in the market. These lines of reasoning incorrectly try to measure the harm of a use relative to various assumptions about the value of the work used.

For instance, if the work being used is already worth millions, the assumption may be that it’s tough to do it much harm; or if the work is not already worth a lot in the market, the same assumption may be made for the opposite reason. But this is wrong.

In the simplest analysis, if the other three factors do not point toward fair use, then at the very least, you are likely depriving the copyright owner of a valid opportunity to charge a fee for the use you wish to make. Considering market harm can be this simple.

But more broadly, a court would ask, “What would happen to the copyright owner’s licensing opportunities, if multiple parties were to make uses just like yours?” The key word here is potential, and while it can get complicated in some cases, it is often just common sense. Especially if you are a creator, ask yourself whether you could license your work for the use you intend to make.

If the honest answer to that is at least “Maybe,” then you’re thinking about potential market harm in the right way because that maybe would probably first depend on who’s using it for what purpose and how much of your work they’re using. See what I mean? As a creator, you should be at an advantage considering fair use because you can apply the thinking as if your work is being used by somebody else.

Finally on the market harm question, the argument that your use may increase the value of the protected work(s) through exposure is not a factor in a fair use defense and is a consideration at the discretion of the copyright owner. There may be reasons other than financial why an owner will object to a use, and if a use is infringing, it’s infringing.

There is no question that fair use is a tricky subject, and the above suggestions could be peppered with caveats, disclaimers, and a variety of other opinions. Because the consideration will always be based on the specific facts of each use, nobody can offer general guidance that will necessarily apply to your next intended use of protected work. But the fact-intensive nature of the fair use doctrine is a key reason why those boilerplate notices are meaningless and potentially damaging to your good intentions. So, whatever you do, don’t do that.


[1] Views differ as to how much “purpose” may influence “amount.” But to keep it as simple as possible, the more the purpose favors fair use and the less of the original you copy, the more likely you are to have a reasonable fair use defense.
