Let’s Stop Analogizing Human Creators to Machines

Just as it is folly to anthropomorphize computers and robots, it is also unhelpful to discuss the implications of generative AI in copyright law by analogizing machines to authors.[1] In 2019, I explored the idea that “machine learning” could be analogous to human reading if the human happens to have an eidetic memory. But this was a thought exercise, and in that post, I also imagined machine training that serves a computer science or research purpose, not necessarily generative AIs that are trained on protected works and designed to produce works without authors.

In the present discussion, however, certain parties weighing in on AI and copyright seem to advocate policy that is premised on the language and principles of existing doctrine as applicable to the technological processes of both the input and output sides of the generative AI equation. Of course, policy discussions usually begin with the existing framework, but in this instance, it can be a shaky starting place because generative AI presents some unique challenges—and not just for the practice of copyright law.

We should be wary of analogizing machine functions to human activity for the simple reason that copyright law (indeed all law) has never been anything but anthropocentric. Although it is difficult to avoid speaking in terms of machines “learning” or “creating,” it is essential that we either constantly remind ourselves that these are weak, inaccurate metaphors or adopt a new glossary to describe what certain AIs may be doing in the world of creative production.

On the input (training) side of the equation, the moment someone says something like, “Humans learn to make art by looking at art, and generative AIs do the same thing,” the speaker should be directed to the break-out session on sci-fi and excused from any serious conversation about applicable copyright law. Likewise, on the output side, comparisons of AI to other technological developments—from the printing press to Photoshop—should be presumed irrelevant unless the AI at issue can plausibly be described as a tool of the author rather than the primary maker of a work of creative expression.

Copyright Office Guidance Highlights Some Key Difficulties

To emphasize the exceptional nature of this discussion, even experts are somewhat confused by both the doctrinal and administrative aspects of the new guidelines published by the U.S. Copyright Office directing authors how to disclaim AI-generated material in a registration application. The confusion is hardly surprising because generative AI has prompted the Office to ask an unprecedented question, namely, How was this work made?

As noted in several posts, copyrightability has always been agnostic with regard to the creative process. Copyright rights attach to works that show a modicum of originality, and the Copyright Office does not generally ask what tools, methods, etc. the author used to make a work.[2] But this historic practice was then confronted by the now widely reported applications submitted by Stephen Thaler and Kris Kashtanova, both claiming copyright in visual works made with generative AI.

In both cases, the Copyright Office rejected registration applications for the visual works based on the longstanding, bright-line doctrine that copyright rights can only attach to works made by human beings. In Thaler’s case, the consideration is straightforward because the claimant affirmed that the image was produced entirely by a machine. Kashtanova, on the other hand, asserts more than de minimis authorship (i.e., using AI as a tool) in producing the visual elements of a comic book.

Whether in response to Kashtanova or, more certainly, in anticipation of applications yet to come, the muddiness of the Office guidelines reflects an attempt to address the difficult question of whether copyright attaches to a work that combines authorship and AI generation, and how to draw distinctions between the two. This is not only new territory for the Office as a doctrinal matter but is a potential mess as an administrative one.

The Copyright Office has never been tasked with separating the protectable expression attributable to a human from the unprotectable expression attributable to a machine. Even if it could be said that photography has always provoked this tension (a discussion on its own), the analysis has never been an issue for the Office when registering works, but only for the courts in resolving claims of infringement. In fact, Warhol v. Goldsmith, although before SCOTUS as a fair use case, is a prime example of how tricky it can be to separate the factual elements of a photograph from the expressive elements.

But now the Copyright Office is potentially tasked with a copyrightability question that, in practice, would ask both the author and the examiner to engage in a version of the idea/expression dichotomy analysis: first separating the machine-generated material from the author’s material and then considering whether the author has a valid claim in the protectable expression.

This is not so easy to accomplish in a work that combines author-made and machine-made elements in a manner that may be subtly intertwined; it raises new questions about what the AI “contributed” to a given work; and the inquiry is further complicated by the variety of AI tools in the market or in development. Then, because neither the author/claimant nor the Office examiner is likely a copyright attorney (let alone a court), the inquiry is fraught with difficulty as an administrative process, and that’s if the author makes a good-faith effort to disclaim the AI-generated material in the first place.

Many independent authors are confused enough by the Limit of Claim in a registration application or the concept of “published” versus “unpublished.” Asking these same creators to delve into the metaphysics implied by the AI/Author distinction seems like a dubious enterprise, and one that is not likely to foster more faith in the copyright system than the average indie creator has right now.

Copyrightability Could Remain Blind But …

It is understandable that some creators (e.g., filmmakers using certain plug-ins) may be concerned that the Copyright Office has already taken too broad a view, one connoting a per se rule that denies copyrightability for any work generated with any AI technology. This concern is a reminder that AI should not be discussed as a monolithic topic because not all AI-enhanced products do the same thing. And again, this may imply a need for some new terms rather than the words we use to describe human activities.

In this light, one could follow a different line of reasoning and argue that the agnosticism of copyrightability vis-à-vis process has always implied a presumption of human authorship where other factors—from technological enhancements to dumb luck—invisibly contribute to the protectable expression. Relatedly, a photographer can add a filter or plug-in that changes the expressive qualities of her image, but doing so is considered part of the selection and arrangement aspect of her authorship and does not dilute the copyrightability of the image.

Some extraordinary visual work has already been produced by professional artists using AI to yield results that are too strikingly well-crafted to believe that the author has not exerted considerable influence over the final image. In this regard, then, perhaps the copyrightability question at the registration stage, no matter how sophisticated the “filter” becomes, should remain blind to process. The Copyright Office could continue to register works submitted by valid claimants without asking the novel How question.

But the more that works may be generated with little or no human spark, the more this agnostic, status-quo approach could unravel the foundation of copyright rights altogether. And it would not be the first time that major tech companies have sought to do exactly that. It is no surprise that an AI developer or a producer using AI would seek the financial benefits of copyright protection; but without a defensible presence of human expression in the work, the exclusive rights of copyright cannot vest in a person with the standing to defend those rights. Nowhere in U.S. law do non-humans have rights of any kind, and this foundational principle reminds us that although machine activity can be compared to human activity as an allegorical construct, this is too whimsical for a serious policy discussion.

Again, I highlight this tangle of administrative and doctrinal factors to emphasize the point that generative AI does not merely present new variations on old questions (e.g., photography), but raises novel questions that cannot easily be answered by analogies to the past. If the challenges presented by generative AI are to be resolved sensibly, and in a way that will serve independent creators, policymakers and thought leaders on copyright law should be skeptical of arguments that too earnestly attempt to transpose centuries of doctrine for human activity into principles applied to machine activity.

[1] I do not distinguish “human” authors, because there is no other kind.

[2] I say “generally” only because I cannot account for every conversation among claimants and examiners.

Image by boom15th931

Controlled Digital Lending is a Dubious Proposal in Every Sense

On March 24, the court in Hachette et al. v. Internet Archive wholly rejected IA’s fair use defense constructed on the theory called Controlled Digital Lending (CDL). Prior to and since that ruling, various parties have tried to characterize this case as an attack by the publishers against the core function of libraries, alleging that libraries either already depend, or will come to depend, on CDL to meet the needs of communities in the digital age.

It is easy to promote a message that says Library good. Publisher bad. And I get why various people, including policymakers and librarians, might respond to the slogan. But the populist message obscures what a convoluted, if not insidious, proposal CDL truly is. While it may be true that select libraries engage in limited activities, long exempted by statute, which certain vested interests now describe as akin to CDL, it is erroneous to suggest that CDL, as envisioned by its proponents, is inherent to library operations. On the contrary, it is a complicated and expensive proposal—even if it were legal.

The CDL theory, based on ideas first proposed by Professor Michelle Wu (Georgetown University), is fleshed out and advocated in a 2018 white paper written by Kyle Courtney (Library Futures Chair) and David R. Hansen (Authors Alliance Executive Director). According to their reading of the fair use doctrine in conjunction with first sale doctrine,[1] Courtney and Hansen argue that libraries are legally permitted to erect their own ebook lending models by digitizing and then loaning digital books based on the number of legally obtained physical copies in the collection.

On its face, the concept sounds fair-minded and progressive, hypothetically adding new digital access while allowing the library to bypass (i.e., not pay for) current ebook licensing/lending regimes like OverDrive. And according to the theory, CDL will not disrupt the authors’ interests because it purports to maintain, rather than alter, longstanding copyright doctrine. Who wouldn’t endorse that from the sound of it? Candidly, only someone who is well-versed in copyright law or who has contemplated the practical implications of the CDL model.

Sparing readers a detailed breakdown of the legal constructs in the 42-page white paper, suffice it to say that the keystone argument (a fair use defense riding on the first sale doctrine) was unequivocally rejected by the court in Hachette last month because the central points had already been made and rejected by this same circuit in contemporary cases.[2] In fact, CDL proponents may not be thrilled that Internet Archive was the first (and perhaps the last) institution to represent their theory in court because, even with millions in revenue, IA failed to implement the “controlled” part of the model.[3]

This raises an important question for libraries: if IA is their Galahad in the quest for CDL, why does it fail operationally to implement the model? That the underlying legal theory would fail was hardly in doubt, and this alone should doom CDL as a consideration for any library. But it is further notable that, even if CDL were legal in some form, implementing it would likely be more costly than the current ebook lending regime the library would be circumventing.

CDL Would Not be Free or Liability Free

Launching a CDL model, as set forth in the white paper, implies considerable expense, requiring either a library-developed system or paying to use a system developed by a third party. Presumably, the CDL folks imagined that Internet Archive would be that third party, but as that organization failed to adhere to the controls in the model, this should prompt librarians to consider what it would cost to adopt “real” CDL, and for what purpose.

Without addressing the practical implications of a holistic, auditable CDL system, proponents appear to recommend that libraries invest substantial resources in a new, complex model to manage physical and digital book lending and then wait to see if they get sued. Because, astoundingly, the white paper contains a whole section advising libraries as to how they might limit risk when implementing CDL. It must be nice to sit in an office at an elite law school, devise a hypothesis that some proscribed conduct is “legal,” and then suggest somebody else try it to find out. And all this fuss, cost, and opportunity cost is to circumvent existing models that make ebooks available for about a dollar or less per loan?

The Future of Libraries is Not About eBooks

Finally, it cannot be ignored that the sustainability of libraries does not lie in providing more access to digital books and other materials via websites. Libraries are physical spaces that play important and diverse roles in each community, and their future depends on maintaining relevance as physical spaces operated by professionals with certain skills and sensitivities to local needs. Whether that means story time for children or hosting career counselors for adults or a thousand other initiatives, digital book lending is not a community-connecting activity any more than shopping on Amazon is a social experience.

If ebook loans become too prominent a feature of a library system, those physical spaces and professional librarians will no longer be needed (i.e., funded). And in case it isn’t obvious by now, digital platforms tend to swallow independent institutions. Much as internet consolidation has nearly exterminated the local and independent newspaper, a similar consolidation of reading material into a more centralized, globally accessible network (as envisioned by Internet Archive’s Brewster Kahle) would be fatal to the local library as a lending institution.

Libraries should spend their limited resources on building and maintaining personal relationships with communities rather than waste time with complicated and erroneous workarounds to copyright rights. Frankly, the well-funded academics and organizations peddling CDL would do more good for libraries if they just hosted a damn bake sale.

[1] Specifically, the paper argues that factor one of the fair use test favors CDL because its “purpose” is to fulfill the intent of the first sale doctrine—and then, they argue this is further bolstered because libraries are not commercial entities.

[2] e.g., ReDigi, TVEyes.

[3] For instance, the CDL paper does not envision an unaccountable system whereby physical books are stored in shipping containers as the basis for digital copy loans. Internet Archive does this.

Photo by: JackF

Podcast – Tech Designer Carla Diana

This year’s World IP Day theme celebrates Women and IP: Accelerating Innovation and Creativity, and for that reason as well as the fact that artificial intelligence dominates all topics these days, my guest for this episode is the highly innovative Carla Diana, whom I first interviewed in 2014.

Carla is a tech designer, author, and educator. She runs the 4D design program at the Cranbrook Academy of Art in Michigan; she is the lead designer at Diligent Robotics in Austin, Texas; and she is the author of dozens of articles and essays about technology and design. Her most recent book, published in 2021 by Harvard Business Review Press, is My Robot Gets Me: How Social Design Can Make New Products More Human. And we’ll talk about what that means, plus generative AI, driverless cars, ethics in technology, and at least one product I had not imagined was a thing.

Show Contents

00:01:24 – Carla’s background.
00:05:57 – Why good design is social.
00:11:55 – Design modalities & thinking about consumers with disabilities.
00:20:27 – That tech should not mimic human behavior.
00:28:57 – On avoiding innovation for its own sake.
00:36:07 – On ethics in technology.
00:45:51 – Generative AI and the arts.
01:00:55 – Tech solutions for tech problems (e.g. Glaze for visual artists).
01:05:32 – Self-driving vehicles.
01:09:30 – Economic & social implications of a driverless world.
01:15:26 – Combining design and ethics.

The Illusion of More
