Recent AI Copyright Lawsuits Are About More than Compensation for Authors

Last week, writer and broadcaster Andrew Keen invited me to his podcast Keen On to talk (of course) about artificial intelligence. When we got to the subject of the New York Times lawsuit against OpenAI and Microsoft, I noted that 1) it is arguably the strongest copyright case presented to date against an AI developer; 2) it would likely result in a substantial licensing deal between the parties; and 3) it is hard to say what any of this means for journalism going forward. On that same subject, nonfiction authors Nicholas Basbanes and Nicholas Gage filed a class action suit against OpenAI and Microsoft on January 5, just over a week after the Times suit was filed.

As discussed in other posts, although generative AI unequivocally poses a threat to authors and authorship, U.S. copyright law is, oddly enough, not quite designed to address the full scope of the social, economic, and cultural challenge of that threat. While this seems counterintuitive, the difficulty lies in the fact that copyright promotes authorship by protecting works against specific means of infringement, and the nail-biting question of the moment is whether “machine learning” (ML) with the use of protected works violates the reproduction right (§106(1)) of the Copyright Act.

Here, the Times case is strong because the news organization presents compelling, side-by-side evidence that its published stories are being output by ChatGPT almost verbatim. This is evidence not only that reproduction is occurring in the AI model, but also that the outputs provided to users serve as a substitute for legal access to the Times’s material. The evidence of reproduction establishes a solid claim of infringement, while the evidence of substitution weighs against OpenAI’s putative fair use defense. In fact, it was the same circuit (the Second) that held that a news service called TVEyes was “slightly transformative” but that it made so much of Fox News’s material available, even in segments, that the substitutional purpose doomed its fair use defense.

Unlike the Times, the nonfiction book authors do not present side-by-side evidence of verbatim copying of their published writings, and this is consistent with some of the other class-action suits. These are the real nail-biter cases, in my view, because the plaintiffs’ cause is just, but their proof of copyright infringement is less demonstrable than that of the Times (or of the plaintiffs in Concord v. Anthropic, for that matter). But this focus on both The New York Times and nonfiction authors raises a serious question as to whether AI will exacerbate the already dismal state of information in the information age.

When the early work of this blog started in 2011, one of the issues of concern was the volume of mediocre, careless, or inaccurate reporting and commentary being promulgated under brands normally associated with quality journalism. Here, it must be said that the Gray Lady herself has not always been immune to the digital-age forces of volume and speed that can drive reporters and editors to engage the market on the lowest rungs. But if the stodgy algorithms of social media have animated a new era of yellow journalism, isn’t it reasonable to assume that certain generative AIs will make matters worse? The internet has already fostered more misinformation than a democratic society can safely endure.

If we consider the possible outcomes of the Times lawsuit, one would be that OpenAI changes the model to avoid infringing reproduction. While this may suffice from a copyright perspective, one wonders about the quality and/or purpose of the information being provided by a tool like ChatGPT. The output of an LLM is the result of probability. The user asks a question (a prompt), and the AI responds that in all likelihood, based on the information fed into an algorithm, this is what you want to know.

It is no wonder the system to date reproduces material verbatim from a major news organization, but if it doesn’t do that, what should it do? Or what can it do that can be called “progress” with regard to news and information? Take a multi-faceted, extremely emotional topic like Israel and Palestine, train an AI on all the solid reporting, all the mediocre editorials, and the cacophony of opinions on social media, and the user of the LLM gets…what? Why would the results be more informative or thoughtful than the veteran journalist doing her best?

Why won’t an AI be worse than “recommendation algorithms”? If YouTube and Facebook foster confirmation bias and shepherd people onto the wild grazing fields of organically grown conspiracies, it seems rational and prudent to assume that an LLM will do the same thing more efficiently. Why have an old-school search engine point you toward a bogus article linking vaccines to autism when you can have a “dialogue” with an ersatz intelligence on the same topic?

Although the nonfiction book authors do not present the kind of evidence of copyright infringement the Times exhibits in its complaint, the facts presented about the authors’ investment of time, expertise, and money make a point that should be read as more than a mere plea for sympathy. This is not just about job loss for future historians but quite possibly about the loss of history itself. From the Basbanes et al. complaint:

The archive of primary research materials assembled by Mr. Basbanes in support of his work over a period of forty years, when acquired by Texas A&M University in 2015, filled 365 packing boxes with documents, transcriptions, drafts, field notebooks, photographic negatives, and the like, all acquired by Mr. Basbanes in pursuit of his literary activities, and at his expense and initiative.

It is more than a legal (i.e., fair use) question whether the purpose of a model like ChatGPT is to make new and relevant use of all that work, or whether its purpose is to supplant the historian and the reporter by “feeding off the sere remains of the past,”[1] until it eventually starves. In the former case, licensing and collaborating with authors and journalists seems reasonable; in the latter case, allowing certain generative AIs to die on the vine seems imperative.


[1] From Ralph Waldo Emerson’s speech at Harvard calling for an American literary independence, August 31, 1837.

Photo by: Antonio83

Things We Don’t Need: Generative AI

When I was planning to start The Illusion of More, I contemplated a category of posts under the heading We Don’t Need This. Although abandoned, I thought it might be an editorial framework for articles about innovations that really aren’t innovative, and the low-tech invention that originally inspired the idea was the kiddie-car/shopping-cart hybrid. In case you haven’t had the pleasure, this vehicle enables a small child to “drive” a plastic car attached to the basket one pushes through the supermarket. As the parent of a small child (at the time IOM was launched), I found this innovation was a terrible idea—one that demanded use the moment the child laid eyes upon it, but which mostly offered poor maneuverability through the aisles and unnecessary geometric struggle at check-out.

There is, of course, nothing connecting the kiddie-car/shopping-cart to generative AI except, in my view, the fact that we don’t need either one. Or at least, we don’t need most of what generative AI appears to be doing, and this is perhaps the most maddening aspect of the most prominent generative AI tools making the headlines—that they serve no purpose and, if we’re getting all IP about it, promote no progress. I’ve said it, and I’ll keep saying it:  we do not need computers to make artistic works.

This month, the Federal Trade Commission (FTC) issued a report describing its early findings about AI’s potential harms that may be addressable under the agency’s purview. Charged with enforcing prohibitions against unfair, anticompetitive business practices and protecting consumers, the FTC hosted a roundtable discussion with members of the creative community to hear their concerns about both the development and public deployment of generative AIs. As the report states:

Various competition and consumer protection concerns may arise when AI is deployed in the creative professions. Conduct–such as training an AI tool on protected expression without the creator’s consent or selling output generated from such an AI tool, including by mimicking the creator’s writing style, vocal or instrumental performance, or likeness—may constitute an unfair method of competition or an unfair or deceptive practice.

In response to the report, and specifically to the passage quoted above, three well-known copyright critics, Pamela Samuelson, Matthew Sag, and Christopher Sprigman (SS&S), criticized the FTC “both for its opacity and for the ways in which it may be interpreted (or misinterpreted) to chill innovation and restrict competition in the markets for AI technologies.” Before responding to that allegation, I must indulge in a little gallows humor and mention that the economic and global-security leader of the free world is in danger of shredding its Constitution, going full-tilt authoritarian, and spiraling into a death roll of ignorance and cruelty. And yet, we’re going to talk about “chilling innovation” in generative AI as if it’s a matter of urgency. The world is in crisis, and billions have been invested to see who can do the best job getting a computer to write a poem or make a picture? Talk about whimpers instead of bangs.

There are two reasons that sentiment is not raw Ludditism. The first is that it does not dismiss all AI development in the creative industry as useless; and the second is that the “copyright stifles innovation” bullet point is a generalization that should never be uttered again—especially in light of its direct role in fostering the above-mentioned prospect of democracy’s collapse. We’ve heard all this before—specifically from SS&S and their colleagues in academia and the “digital rights” organizations. We’ve been told that copyright stifles the free and open internet, access to information, and the speech right.

But in addition to the fact that the premise itself was false, the grand social media experiment in the “democratization of everything” must be recognized as an abysmal failure, and its cheerleaders should muster the humility to stifle their tiresome and dangerous refrains in the context of AI. Social media companies and their friends in academia (and here, I must include President Obama’s Google-friendly administration) share considerable blame for the heedless, tech-enabled populism that has fostered so many social hazards, including a literal seditionist now leading one of America’s two political parties.

Notably, the FTC report does not mention copyright very much, and in fact, many of the creative professionals who participated in the discussions acknowledged that because they are not copyright owners (e.g., voice actors and screenwriters for hire were among the representatives), they do not have rights currently protecting them against generative AI producing the kinds of unfair outcomes that the FTC is charged with mitigating. It would take too long a post to respond to all the critiques presented by SS&S, but I want to focus on this statement:

We are concerned especially about the suggestion in the FTC’s Comments that AI training might be a Section 5 violation where it “diminishes the value of [a creator’s] existing or future works.” A hallmark of competition is that it diminishes the returns that producers are likely to garner relative to a less competitive marketplace. This is just as likely to be true in markets for creative goods, such as novels and paintings, as it is in markets for ordinary tangible goods like automobiles and groceries. AI agents that produce outputs that are not substantially similar to any work on which the AI agent was trained, and are thus not infringing on any particular copyright owner’s rights, are lawful competition for the works on which they are trained.  Surely the FTC does not plan to have Section 5 displace the judgments of copyright law on what is and what is not lawful competition?

To summarize, that paragraph declares that it does not matter if generative AI displaces human authors, that in fact, it is a threshold we should be eager to cross. Notwithstanding the fact that two of the high-profile lawsuits present compelling evidence of substantially similar outputs,[1] the more concerning implication of that paragraph is that SS&S endorse the inevitability that generative AI will devalue human creators and/or eliminate them altogether. Moreover, calling this eventuality a form of “competition” reveals an unsettling perspective consistent with every anti-copyright paper I have ever read—namely, that the production of creative works is no different than the production of any other product or service.

I’ve said many times that copyright critics don’t understand artists, and here, the inapt word competition demonstrates why this axiom endures. For instance, publishers are in competition with one another to an extent, but authors are not—at least not in the sense that the concept applies in other industries—least of all Big Tech. No novelist, for instance, wants to hold the undivided and exclusive attention of all readers the way Meta wants eyeballs never to stray for long from its platforms. Artists thrive in a diverse market of other artists, consumers benefit as a result, and copyright is an engine of that diversity, not a barrier to it. Artists may feel competitive or jealous at times, or even behave in a competitive manner (because they’re human), but the reality is that they need one another to exist at a scale that is not comparable to other “businesses.” True to form, copyright critics like to cite the interdependence of authors to highlight copyright’s limitations but then ignore the same principle in support of tech giants swallowing all creative enterprise whole.

The primary concern expressed by SS&S appears to be that the FTC alleges that AI training with copyrighted works is an act of infringement. Unsurprisingly, this same trio submitted comments to the Copyright Office arguing that AI training with protected works is fair use, but as that very question is already presented in several court cases, I assume SS&S are primarily concerned with optics here. The trio states, “The FTC has no authority to determine what is and what is not copyright infringement, or what is or is not fair use. Under governing law, that is a judicial function.”

Exactly. And the question is now before the courts. So, what’s the problem? That the FTC should not even raise the issue? In tweets, Samuelson and Sprigman argue that the FTC’s report is one-sided, that it is too creator-focused and does not account for the testimony or opinions of the technology companies developing AI. But while I certainly agree that multistakeholder hearings and the like are the proper approach to developing new policy, it is impossible to tolerate a complaint about lack of balance coming from the anti-copyright crowd at all, and from these individuals in particular. For instance, readers may not remember the American Law Institute Restatement of Copyright, initiated by Samuelson and led by Sprigman, but critics of the project, including some of the most prominent names in copyright scholarship, specifically cite the opacity of the restatement process and the deafness of its managers to the concerns and recommendations of their colleagues.

More broadly, it must be said that if, indeed, the FTC lately gave more attention to the creators than it did to the tech companies, then this was a long overdue anomaly. Between at least the mid-to-late 1990s and 2016, the tech companies were treated with kid gloves, handed the keys to Washington, and feted like the economic and democratic engines they claimed to be. Since 2016, sentiment has swung in the other direction, as many Americans have begun to see how disinformation plus data manipulation can become a wrecking ball for a whole society.

If Big Tech lost the previously undeserved benefit of the doubt, good. AI has the potential to exacerbate many of the same Web 2.0 harms at unprecedented speed and scale, and if the FTC, the USCO, the courts, or Congress look askance at the developers, then it is a mistrust well earned. And again, at least with regard to generative AI designed to make creative works, none of the parties empowered to write policy in this area should forget the bottom line:  that when it comes to producing creative work, we truly do not need generative AI.


[1] Concord et al. v. Anthropic and NYT v. OpenAI et al.

SEE ALSO: The Washington Post reported this month that Big Tech continues to significantly fund and influence academia in these policy areas.

Photo by: Jollier

The Age of the Mouse is Nigh!

And the fairy that is called Tinkerbell said, come and see. And I saw, and behold, a Mouse with large black ears. And the name that sat on him was Mirth. And Joy followed with him.

And to those who may feel anxious about the coming year, I say unto thee, fear not. Whatever your concerns for the fate of the world—however well-founded—take comfort. For a new era begins at Midnight on the first day of the year 2024. Of course, I speak not of the Lamb of Revelations, but of the Mouse of Disney, and the day when the Copyright seal will be broken, and Mickey, in the form of Steamboat Willie, will rise (not fall) into the Public Domain. Rejoice! Let the Angels sound their trumpets! For the Age of the Mouse is upon us. And Mankind shall be saved.

But perhaps you think I exaggerate. Verily, you say, the passing of a cartoon character out of Copyright cannot bring about an era of new enlightenment and goodwill. Indeed, I was tempted to believe as you do. But like most Men of ordinary sense, I was not blessed with the vision of the Prophets—those sages, who read from the Book of Lessig, and proclaim that since the year 1998, Man has robbed himself of his own Culture, keeping the most sacred expressions in a Babylonian bondage called the Copyright Term Extension Act. Yea, though the Prophets bore false witness and beguiled the People, saying that the Act of Sonny Bono and the other Philistines was made unto law for the wicked purpose of keeping the Mouse in bondage, let us not quibble over the petty facts of History.

For on this New Year’s Day, the Mouse shall be set free, and the People will speak His name, and he shall say unto the first of his Disciples, Come on, Pluto! And Pluto will go on. He will follow the Mouse. And more Disciples will come and see. And the People will see and hear. And again, the Mouse will say, Come on! And the People will go on. For the Mouse shall then belong to all the People. Or at least, in a limited sense, this will be so.

And the days of the New Year, and all the years that follow, shall be known as the era of Mickey Remix. And the Remix will sweep across the Earth as like a gentle breeze, and Man shall come to know his own folly, and he shall be as though reborn. He will lay down the arms of war and abandon the politics of hate. And reason, compassion, and knowledge will, at last, be the hallmarks of civilization. Verily, these things must come to pass. For if the events foretold do not transpire with the ascension of the Mouse to the Public Domain, then the Prophets are indeed false and deserving of scorn.

If the Mouse merely passes into the realm beyond the copyright term, and Man remains in the same state of ignorance and peril as on this day, then it shall be known that the Prophets are deceivers. If on the second day of the New Year, and all the days that follow, Man is much the same as before, then the Prophets have dissembled and have wasted more than twenty-five times three hundred sixty-five days peddling mere trivia as wisdom. We shall know soon. For the day of reckoning is nigh. The grace of the Mouse be with you all.