AI “Training” Still an Open Copyright Question

On October 30, Judge Orrick of the Northern District of California largely granted the AI companies’ motions to dismiss the class-action complaint filed by Sarah Andersen, Karla Ortiz, and Kelly McKernan on behalf of all visual artists whose works have been used without permission for the purpose of “training” generative AI models. Several claims were dismissed with leave to amend, and without detailing every allegation, dismissal, and possible cure, a few points are noteworthy for creators watching these developments with understandable anxiety.

First, the dismissals are not surprising because several of the claims were not well founded in law. For instance, as discussed in other posts, the claim that all outputs of the AI systems are unlicensed “derivative works” of the works ingested is a football bat[1] of an argument. “I am not convinced that copyright claims based [on] a derivative theory can survive absent ‘substantial similarity’ type allegations,” states Judge Orrick. One would be hard-pressed to find a copyright advocate who would disagree with that statement because a “derivative work” must share some protectable elements derived from the originally protected work.

Also of note, as a matter of both civil procedure and enforcing one’s rights in general, the copyright claims of plaintiffs McKernan and Ortiz were dismissed with prejudice[2] for the simple reason that neither artist named works in suit that were registered with the U.S. Copyright Office. Although a class-action copyright suit can be filed on behalf of artists whose works are not registered, the named plaintiff(s) must allege infringement of registered works identified in the complaint. Registration is a prerequisite to filing an infringement lawsuit in federal court, and timely registration (generally before the infringement occurs) is a prerequisite to seeking statutory damages and attorneys’ fees.

On a more positive note, the court did not dismiss Andersen’s allegation of direct copyright infringement by Stability AI. Here, Judge Orrick finds that the plaintiff plausibly alleges that illegal copying occurs as part of Stability’s “training” process and, therefore, that triable issues of fact are presented which cannot be resolved at this stage. As indicated in older posts about these cases, this question—namely infringement of the “reproduction” right under §106(1)—will likely be the most illuminating for both creators and AI developers as to where the legal boundaries lie when it comes to “training” with protected works.

On a related note, I was reviewing the comments submitted by the Computer & Communications Industry Association (CCIA) in response to the Copyright Office’s Notice of Inquiry (NOI) on artificial intelligence. Although I do not disagree with every conclusion (e.g., on the copyrightability of AI-generated works), CCIA is so dead certain that “training” with protected works is fair use that it states, “No one should have the ‘right’ to object to an AI model being trained on their work.” Of course, this overstatement was the first sentence of an answer to an odd question from the Office, which asks the following:

9.5. In cases where the human creator does not own the copyright—for example, because they have assigned it or because the work was made for hire—should they have a right to object to an AI model being trained on their work? If so, how would such a system work?

I don’t understand the intent of this question. A work in copyright is protected until its term of protection expires. The rights attached to that work may be infringed at any point during the term, and it makes no difference whether those rights are owned by an entity under the work made for hire doctrine or have been transferred by agreement, inheritance, sale, etc. The question of whether AI “training” constitutes infringement is in no way affected by the status or nature of the copyright owner of the work(s) used.

Unfortunately, this question provided the CCIA with a basis to respond thus:

If a right to object to the use of a work for hire existed, it would belong to the employer. However, given the volume of copyrighted works owned by large employers, allowing employers to take this type of action would exclude large swaths of data that would aid in technological progress and the quality of AI systems and create significant barriers to entry for small entities wishing to develop new AI technologies.

The “right to object” to the use of works in AI “training” may be decided in instances like the surviving claim in Andersen. Meanwhile, CCIA’s broader argument appears to be that the potential cost of doing business should inform the threshold question of copyright infringement. No doubt, AI developers would like unlimited access to free materials, but this “don’t stop the innovation” argument is not a legal question; it is a hackneyed retread of the utopian claim that copyright enforcement online will stifle the “free flow of information.”

Well, whatever is freely flowin’ out there, I wouldn’t necessarily call it information, and against that background, I see no reason to give AI developers carte blanche to exploit creators (again) for the sake of innovation that may not be progress.


[1] Football bat is a military expression for something absurd or nonsensical.

[2] i.e., the claims cannot be amended and refiled.

David Newhoff
David is an author, communications professional, and copyright advocate. After more than 20 years providing creative services and consulting in corporate communications, he shifted his attention to law and policy, beginning with advocacy of copyright and the value of creative professionals to America’s economy, core principles, and culture.
