Are Creators Aligned on Artificial Intelligence?

One of many challenges with the adoption of generative AI (GAI) tools is whether creators are willing to demonstrate a degree of solidarity on the matter—i.e., to apply the principle we generally call fair trade. If Creator A uses a GAI tool that might be harmful to Creator B in a different field, will most creators take this broader perspective in a group effort to demand ethical uses of GAI? Moreover, this question becomes intertwined with copyright because the use of GAI is the subject of evolving legal doctrine, meaning that creators who want to produce commercial content outside their core talents should be aware that the material produced may not be protectable under the law.

Two simple examples would be the self-published book author who might use an AI voice app to produce an audiobook, and the documentary filmmaker who might use an AI music generator to produce a soundtrack for a film. In both examples, creators in other fields—voice actors and composers, respectively—are potentially harmed by the development and use of these AI tools. But (1) will the author and filmmaker take that consideration into account? And (2) will the sound recordings in either case be protected by copyright?

In the case of the author using AI in lieu of hiring a narrator to produce the audiobook, I predict that under current doctrine, the sound recording would not be protected by copyright law because there is no human performance captured in that recording. Thus, remedies for any piracy of the audiobook would rely solely on the protection of the underlying literary work, which is effective—but if the sound recording is also protected and registered, that would be two works infringed instead of one.

This increases the potential damages for infringement, which puts the author/owner in a stronger position if she needs to take legal action. In this example, authors’ interests may be seen as aligned with those of professional book narrators. Hiring a narrator not only achieves better quality in the reading; capturing the human performance also provides a basis for copyright to attach to the sound recording.

Similar considerations would apply to the filmmaker with the GAI soundtrack, although there may be other factors that provide the AI music with some protection we don’t find with the AI audiobook. One factor that may become relevant is whether the filmmaker can show that he exerted sufficient creative control over the final sounds. If so, he may be able to defend a claim of copyright in the soundtrack, but we are likely several years and a few lawsuits away from clear guidance on this question.

Another consideration with the soundtrack may be the Copyright Office’s current view that material using assistive AI “within a larger work” is protected. Creators should be careful about interpreting that broad language because constituent works that stand alone—and this would apply to a soundtrack for a film—would logically not be independently protected.

Of course, there are many GAI products that allow one type of creator to avoid hiring another type of creator for a given project. Some of this is inevitable, and it is not necessarily unethical or bad for creative culture. That said, even with ethically trained and ethically used AI tools, the copyright considerations should be weighed by the individual creator (i.e., do they care about protecting what might not be protectable?), but also collectively by all creators contributing to a new ecosystem.

Since 1978, the default in the U.S. has been automatic copyright protection upon fixation, even if most rights are never enforced. But as GAI is used to produce a great deal of material that is not protected, it is hard to predict what effect this might have on copyright overall. The human authorship principle, which is even older than automatic protection under the 1976 Act, creates a new tension for creators who may wish to combine GAI and human-authored work. In response to that tension, it would be a mistake in my view to overturn the “human spark” doctrine and simply protect any material that “walks and talks” like a creative work. This isn’t just an emotional appeal to anthropocentrism but rather a conviction that copyright would become meaningless—even unconstitutional—if the incentive rationale for its existence were eroded.

Regardless of the theoretical questions addressed in this post, I believe that as a practical matter, creators should think carefully about how and when to use GAI for various projects. As an ethical consideration, perhaps if you’re opposed to “scraping” in your industry, then opposing it in others is the right view to take. But as a business consideration, if what you’re making is meant to have commercial value, AI-generated might mean not protected by copyright—and that means even if you spend money and time on it, it isn’t yours.

Guarantee of Confusion: When AI Scrapes the News

That title riffs on the term of art in trademark law known as “likelihood of confusion.” It refers to a foundational test that asks whether the average consumer is likely to confuse a particular mark (words, design, or both) with a known mark, and thereby mistake the source of a product or service that was not produced or distributed by the company associated with that known mark. Thus, beware the Rollex, the Tilynol, or even the KleanEx. But when a real trademark is used to promote a defective product, confusion is not merely likely but certain—especially when the brand is a news producer.

In a lawsuit filed today by several major news publishers against an AI developer (Advance Local Media et al. v. Cohere Inc.), we see a good example of copyright and trademark combining to serve the public interest in contrast to the extensive harm that can be done by technology developers running roughshod over IP rights. Copyright incentivizes the investment in professional journalism needed to report reliable news, and trademark identifies the source of the news we choose to trust. I know readers will be inclined these days to criticize one news organization or another, but hold that thought.

The complaint, filed in the District Court for the Southern District of New York, names as plaintiffs several well-known news publishers (e.g., Condé Nast, Los Angeles Times, The Guardian) who allege that AI developer Cohere, valued at $5.5 billion, is liable for both copyright and trademark infringement. “Cohere’s primary product is its suite of LLMs referred to as the Command Family of models…these LLMs are trained on vast amounts of text and as a result can generate text-based, natural language responses to user queries,” the complaint states.

The Copyright Allegations

On copyright infringement, the publishers intend to show that Cohere violates their exclusive rights both when it inputs protected works to train the Command products and when it outputs verbatim or substantially similar works that are reproduced, distributed, and displayed to paying customers. The two counts of alleged trademark infringement stem from use of the publishers’ registered names in conjunction with erroneous material that may be “hallucinated” by the LLM. Anyone can recognize why this would be harmful to the reputation of the named source and broadly harmful to consumers who already struggle to validate information in this miasma we call the internet.

Notably, the Publishers stress the fact that Cohere markets itself on the reliability and timeliness of the information Command provides—benefits that would be essential for its many commercial customers, but which the company allegedly chose to accomplish through unlicensed use of the works produced by news organizations. “Cohere relies heavily on trusted journalism sources to shore up the authority of its responses. As Cohere’s CEO Aidan Gomez explained in a letter to employees and shareholders, Cohere believes that a ‘key differentiator’ for its models is the ability to receive ‘verifiable answers,’” the complaint states.

Further, to support the veracity of query results, Cohere relies on “retrieval augmented generation” (RAG), which an NVIDIA blog post describes thus: “Like a good judge, large language models (LLMs) can respond to a wide variety of human queries. But to deliver authoritative answers — grounded in specific court proceedings or similar ones — the model needs to be provided that information.” This case law analogy is ironic in context because even at this very early stage, the copyright case law strongly suggests to this observer that Cohere should not have chosen the unlicensed path to build its products.
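To make the RAG concept concrete for non-technical readers, the pattern can be sketched in a few lines of code. This is a purely illustrative toy: the function names are hypothetical, and the keyword-overlap “retriever” stands in for the embedding-based search a real system like Cohere’s would run over its document store—the very store that, per the complaint, allegedly contains the publishers’ full articles.

```python
# Toy sketch of retrieval-augmented generation (RAG): retrieved source
# documents are prepended to the user's query, so the model's answer can
# be "grounded" in (and attributed to) those sources.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query and return the
    top k. Real systems use embedding similarity over an indexed corpus."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved 'snippets' to the query; the LLM would then
    be instructed to answer using only the provided context."""
    snippets = retrieve(query, documents)
    context = "\n".join(f"[source {i + 1}] {s}" for i, s in enumerate(snippets))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

# A stand-in corpus; in the scenario the complaint describes, these would
# be full-text news articles.
corpus = [
    "The central bank raised interest rates by a quarter point on Wednesday.",
    "A new species of orchid was discovered in the highlands.",
    "Markets rallied after the interest rate announcement.",
]
prompt = build_prompt("What happened to interest rates?", corpus)
print(prompt)
```

The legally salient point is visible in the sketch: whatever text the retriever returns is copied into the prompt, which is why the complaint emphasizes that Cohere’s “snippets” are allegedly the full text of the underlying sources.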

For example, a description from the complaint reminds me that the news summary product TVEyes was held to be infringing on less compelling evidence than the following: “The user can expand [the] Under the Hood [tool] to view the exact underlying documents on which Cohere relied to generate the response. Cohere refers to these sources as ‘snippets,’ but to be clear—these ‘snippets’ are generally the full text of every source on which the output was based.”

In fact, the allegations in this complaint cover so much familiar ground that it is hard to imagine how Cohere will raise a persuasive defense. For instance, just this week, I summarized the Delaware District Court finding that comparatively limited copying of Westlaw’s headnotes for an AI search product constituted a market substitute for the protected works. What Cohere is allegedly doing with news articles is similar in purpose but entails far more extensive, unlicensed use of substantially more protected expression than in Thomson Reuters v. Ross.

The Trademark Allegations

With the RAG tool switched on, Command will apparently provide reliable news by copying, distributing, and displaying unlicensed copies of Publishers’ works. But with RAG switched off, its LLM might hallucinate and then attribute the resulting misinformation to one of the named plaintiffs. For instance, the complaint cites a Cohere “article” that confuses the 2023 massacre at the Nova Music Festival with a 2020 shooting in Nova Scotia; reports that a man murdered at the latter “returns to the scene” of the former; and then attributes this whole mess to The Guardian.

The Publishers allege that Cohere is liable under the Lanham Act on two counts—trademark infringement and false designation of origin—both of which seem highly plausible based on the facts presented. We shall see whether Cohere can present compelling facts to rebut the allegations, but otherwise, as to the questions of law in this case, I predict this one easily goes to the plaintiffs.

As mentioned above, I know some readers may scoff at the premise that quality journalism is consistently the hallmark of well-established news publishers today. And to be sure, one must occasionally check the math in various articles and editorials. But I maintain that Big Tech, through its predatory model of monetizing everything it does not create—plus our willingness to believe utter nonsense online—exerts a pressure on professional journalism that borders on an existential threat. Left unchecked, the AI shenanigans like those described in this lawsuit do more than violate IP law; they undermine the efforts of any reporter who is still trying to present reality.



USCO Issues 2nd Report on Artificial Intelligence: Copyrightability

“Where AI merely assists an author in the creative process, its use does not change the copyrightability of the output. At the other extreme, if content is entirely generated by AI, it cannot be protected by copyright.” – Copyright and Artificial Intelligence Part 2, Copyrightability, USCO –

Last week, the U.S. Copyright Office released Part 2 of a planned three-part report on copyright and adjacent IP matters concerning the use of artificial intelligence. The new report expresses the Register’s views about the copyrightability of works when they are produced in some way with the use of AI. In summary, the Office reaffirmed the doctrine that human authorship is required for copyright to attach to a work at all; that copyright should not protect expression created by generative AI; and that the use of assistive AI should not disqualify a work for copyright protection.

Before proceeding, it’s important to remember that the question of copyrightability, or “authorship,” with AI tools is separate from the legality of unlicensed use of creative works for the purpose of “training” these models in the first place. As argued in other posts, most machine learning (ML) with unlicensed protected works should be held to be mass copyright infringement and should not be exempted under the fair use doctrine. Nevertheless, on the assumption that AI tools for creative work will continue to exist, the question of copyrightable authorship with these technologies involves an important and ever-evolving area of doctrine.

Generative AI (GAI) and Copyrightability

The most difficult copyright question regarding generative AI (GAI) concerns works made with a combination of human-authored and AI-generated expression. As the Office report emphasizes, the question itself defies bright-line guidance because it is inherently a case-by-case, fact-intensive consideration that can only be weighed in the courts. That said, the report expresses a general view that GAI apps do not presently allow the user sufficient control over the expressive results to claim ownership in the outputs.

While the Office recognizes that selection and arrangement of GAI material can meet the threshold for copyrightability, and it leaves open the possibility of technological advancements to enable greater “control” of GAI tools, the report argues that GAI is presently a “roll of the dice” as a creative process. “No matter how many times a prompt is revised and resubmitted, the final output reflects the user’s acceptance of the AI system’s interpretation, rather than authorship of the expression it contains,” the report states. Acceptance is described as “authorship by adoption,” which is roughly the equivalent of claiming copyright in a work one finds rather than creates.

What this means as a practical matter is that creators may claim protection of their expressive contributions to works that include GAI material, but the latter should be considered unprotected and, therefore, disclaimed in a registration application. We shall see whether the courts agree with the Office, most immediately in Allen v. Perlmutter, in which Jason Allen argues that the nature and variety of prompts he used for his visual work were not like rolling dice but were instead deliberate steps toward creating his mental conception of an image.

Regardless of how Allen is decided, it will only be the first major litigation addressing the mixed human/AI question at issue. This highly subjective consideration will remain a case-by-case matter for the foreseeable future, even if certain GAI apps provide greater “control” for users per the Office’s opinion.

Assistive AI (AAI) Does Not Limit Copyrightability

“The Office agrees that there is an important distinction between using AI as a tool to assist in the creation of works and using AI as a stand-in for human creativity.”

As a creator, I appreciate the Office distinguishing GAI from assistive AI (AAI) and stating that the latter should generally not disqualify works from copyrightability. For instance, if one uses AAI to expedite color correction in a group of photos or to more efficiently check and make grammar recommendations for a manuscript, one need not disclaim the use of AI in these contexts. Likewise, it is important that the Office recognizes that AAI used within a larger work (e.g., to fix a scene or create an effect in a motion picture) is not a basis to limit the protection of the whole work.

While some lines will inevitably be crossed (e.g., an AI suggests, and the writer copies, whole paragraphs in a text), this would arguably be a case in which AAI becomes GAI. Nevertheless, resolving protection in this gray area of authorship is likely a matter best left to the courts and not a line easily drawn by the Copyright Office. In practice, even if I did use AAI in my own work, I would not disclaim that use in a registration application, but if I allowed AI to truly write some material, I would disclaim that and not submit a fraudulent application.

Creators should remember that under Unicolors v. H&M, an innocent error on a copyright registration application is not a basis to void the registration. It is important to make a good faith effort to claim the human-made expression and disclaim the AI-generated expression, but the Supreme Court set a precedent that creators should not be penalized for an imperfect understanding of difficult questions of law when submitting an application.

It is understandable, of course, that creators want certainty, but in this report, I think the Office provides sound guidance for the moment while cases like Allen work through the courts. It would not be acceptable to simply default to protecting all GAI material while so much “AI slop” floods the market and, among other things, threatens to undermine the incentive purpose of copyright. For the author using AI in conjunction with her own talents and expressive capacity, we are at the leading edge of this discussion. For context, publishing has existed for a few centuries, but defining “publication” in U.S. copyright law still defies bright-line definition to this day. Hang in there.