Copyright and AI in a World of Whiplash Public Policy

copyright

I have not added a copyright post here since March 19, when the DC Circuit Court of Appeals affirmed in Thaler v. Perlmutter that works produced autonomously by generative AI (GAI) are not protected under U.S. copyright law. Although it is good to see the human authorship doctrine in copyright left undisturbed, it is a fleeting moment of sanity within a warped national reality.

As reported earlier, OpenAI appealed to the administration’s focus on China as a basis to argue that “beating China” requires ignoring the copyright claims of authors whose works are used to train AI models. Not only is that claim wrong on its face, but the conduct of the current administration vis-à-vis civil rights forces millions of Americans to ask whether China is an adversary or a role model.

One mirror in the funhouse reveals a compelling bipartisan hearing held by the Senate Judiciary Committee, Subcommittee on Crime and Counterterrorism, where Chairman Hawley and colleagues from both parties offered strong endorsements for the courageous testimony of Facebook whistleblower Sarah Wynn-Williams. Focused primarily on Meta’s engagements with the Chinese Communist Party (CCP)—and Zuckerberg’s lying to Congress about that very issue—the committee cited other abuses described in Wynn-Williams’s book, like the company intentionally targeting vulnerable teens. (More about the book Careless People in another post.)

Ordinarily, I compartmentalize copyright matters from other criticisms of Big Tech, but here, the stories overlap, even if Meta is the only target of the committee’s investigation at this time. First, throughout her testimony, Wynn-Williams repeats the theme that Meta used the “but China will win” argument to oppose Congress taking any meaningful regulatory action. This alone should cast doubt upon OpenAI et al. making the same argument as a rationale for mass copyright infringement for model training. As Senator Klobuchar noted, there was no basis for prior claims that enforcing various consumer safeguards (e.g., the Kids Online Safety Act) would be counter-productive to national security, and in that light, Congress should decline to believe the same story in regard to copyright infringement.

Meta may be unique—or uniquely situated—as a clandestine partner to the CCP, but it is also notable that the committee mentioned the role of Meta’s Llama AI and heard Wynn-Williams’s testimony that the product was used by the CCP for “AI weapons” and for the development of the Chinese LLM DeepSeek. Further, Wynn-Williams offers a theory about the open source versus closed model AI competition in the marketplace. “There’s a lot of money on the line,” she says. “In some ways you could say, if you want open source to prevail, it helps to have a strong threat from a Chinese model so you can say that it’s really important that America wins, and we’re the American open-source option. And I think you can see the way that strategically plays out.”

“But China will win” is pretty much what OpenAI told the Office of Science and Technology Policy in its letter arguing that machine training with copyrighted works is per se fair use. But looking at Meta (which is currently being sued in the Kadrey case), consider the perspective: in developing Llama, not only did Meta scrape the literary works of millions of authors and journalists, and not only did it source pirate libraries for that purpose, but it also deployed that same AI power in the interests of a nation that brutally kills freedom of expression. Yes, of course, I’m thinking the same thing because it’s unavoidable. The current U.S. administration has engaged in multiple First Amendment and other constitutional violations, including assaults on the free press, and thus, the policy whiplash.

Couple these optics with the volume of evidence that the real power behind the destruction of the administrative state is a small group of tech billionaires pushing an anti-democracy ideology called the neo-reactionary movement (NRx), and the idea of advocating creators’ rights seems all but futile. After all, is it remotely sane to think that an administration of semi-literate, 1A-infringing book banners will care about the rights of authors—let alone reject the tech-bros who wrote the destruction manual for the United States?

Setting aside the copyright questions raised by GAI training, Big Tech’s wanton harvest of artistic and intellectual works as lifeless raw material is perhaps the ultimate expression of the cyberlibertarian’s disdain for human beings as mere repositories of data to be exploited and manipulated. The rhetoric of Big Tech ideology—from 4Chan to the halls of academia—is the authoritarian principle that individuals must be sacrificed for the sake of the collective. All rights are a nuisance to the tech oligarch, and authors are the last people any authoritarian wants to empower.

OpenAI’s claim that mass copyright infringement is necessary to “beat China” is paradoxical—either willfully or naively blind to the fact that when we treat works of authorship as mere fodder for the machine, we don’t beat the CCP; we emulate it. Further, not only is the claim overstated that GAI development is a matter of national security, but again, what does “national security” even mean at present? Concepts like American interests, values, innovation, global security, etc. are all diminished, if not wholly swallowed, by the reckless destruction of the principles and institutions that distinguish America as a leader among democratic nations. And copyright rights are in those same crosshairs.

In response to copyright’s critics, especially those in academia with Big Tech funding their work, I have argued that the diversity and scope of America’s creative output has been essential to its strength as a democracy. Whether one looks at the economic value of the core copyright industries, the cultural value of diverse creative expression, or both, the rationale for intellectual property is to incentivize useful innovation and legitimate greatness.

American authors—from historians to rockstars—are the legacy of an aspiration expressed by Noah Webster, the father of American English and of American copyright. In 1783, advocating the first state copyright law in Connecticut, Webster argued that “America must be as independent in literature as she is in politics—as famous for arts as for arms.” By contrast, the “greatness” proclaimed by Trump is tautological and brittle, just as Big Tech’s claims to “innovation” are vague and misleading.

As proposed in my book, the inclusion of copyright in Article I was one of the more egalitarian and democratic choices made by the founders, even if they did not wholly grasp its potential. At the most basic level, copyright incentivizes creative expression by any citizen anywhere, and the American model largely fulfilled that traditional Republican principle that the market, not the government, decides what is successful.

The copyright questions presented in roughly 40 cases are difficult and novel. Moreover, the facts presented vary, and thus, the outcomes will vary, especially on questions of fair use. In the meantime, it is clear that at least some of the major AI developers are engaged in a campaign to appeal to the current administration to treat copyright rights much as it is treating other constitutional rights—as principles to trample in a march toward something very un-American.

Big Tech Tells Trump Admin that Copyright is a Barrier to AI Development

copyright

Last week, in response to the Executive Order referred to as the “AI Action Plan,” various stakeholders submitted comments to the Office of Science and Technology Policy (OSTP). OpenAI, for its part, submitted one of the finest examples of tech-bro bombast we have seen in some time. Not even Google’s comments, which name copyright, privacy, and patents as barriers to AI development, come close to OpenAI in serving up so much high-octane, tech-utopian gibberish, including this gem in the preamble:

As our CEO Sam Altman has written, we are at the doorstep of the next leap in prosperity: the Intelligence Age. But we must ensure that people have freedom of intelligence, by which we mean the freedom to access and benefit from AGI, protected from both autocratic powers that would take people’s freedoms away, and layers of laws and bureaucracy that would prevent our realizing them.

Fewer than half of all Americans trust either the current administration or Big Tech when it comes to “freedoms” or “intelligence,” but does anyone believe that AI development inexorably leads to the kind of prosperity OpenAI projects in its comments? Like most technologies, AI can be used for good or evil. In theory, it can be used to diagnose and treat disease, but in practice, it could be used to “solve” disease by more efficiently automating denial of treatment. It can be used to enhance or improve productive work, but it might be used to shed jobs across multiple sectors without considering the implications of doing so.

“Innovation” is a meaningless word until it is defined by the values and principles of the innovators and/or the government with which the industry partners. In OpenAI’s effort to distinguish American AI development from that of the People’s Republic of China (PRC), it recommends, at least in its comments on copyright, that we should emulate the anti-democratic, piratical conduct of this adversary. It even goes so far as to allege without foundation that machine learning (ML) with unlicensed copyrighted works is a matter of national security.

Under the heading “Freedom to Learn,” OpenAI’s comments about copyright—especially the emphasis on fair use doctrine—are incoherent to the point that one wonders whom the company is addressing. But before speculating about that question, here are a few quotes with responses:

American copyright law, including the longstanding fair use doctrine, protects the transformative uses of existing works, ensuring that innovators have a balanced and predictable framework for experimentation and entrepreneurship.

The judge-made fair use doctrine applies a four-factor test, of which one part of the first factor considers whether a “transformative use” has been made of a protected work. There is no direct precedent applicable to mass copying of creative works for the purpose of ML to build artificial intelligence, which is why about thirty active lawsuits present this novel question to the courts. Further, because fair use is a case-by-case, affirmative defense to a claim of infringement, it defies the “balanced and predictable framework” for which OpenAI claims to be asking.

This approach has underpinned American success through earlier phases of technological progress and is even more critical to continued American leadership on AI in the wake of recent events in the PRC.

This says, “American innovation is great, but the Chinese kicked our asses with DeepSeek, and we’re grumpy about it.” Kudos to OpenAI for playing to the audience, but it is incoherent as a statement about the fair use defense “underpinning American success.” The core copyright industries account for an estimated 7.66% of U.S. GDP, and this proven prosperity should not be radically disturbed for the sake of undefined “innovation,” some of which will inevitably flop.

As for history, American copyright law has typically adapted to technological change by ensuring the protection of authors’ rights from the exigencies of technology developers. In the best cases, this fosters a symbiotic relationship between new technology and creators, but that is not what OpenAI advocates here. Instead, OpenAI says, “American creators be damned. AI is too important to worry about their rights.”

OpenAI’s models are trained to not replicate works for consumption by the public. Instead, they learn from the works and extract patterns, linguistic structures, and contextual insights. This means our AI model training aligns with the core objectives of copyright and the fair use doctrine, using existing works to create something wholly new and different without eroding the commercial value of those existing works.

This attempt to litigate questions of fact and law in comments to the OSTP is as contradictory as it is misplaced. First, it asserts that OpenAI’s ML process does not violate any copyright rights and is, therefore, non-infringing. But that assertion conflicts with the inapt argument that their ML is exempted under factors one and four of the fair use test. Where there is no basis for a claim of infringement, there is no rationale for arguing a fair use defense.

Applying the fair use doctrine to AI is not only a matter of American competitiveness—it’s a matter of national security. Given concerted state support for critical industries and infrastructure projects, there’s little doubt that the PRC’s AI developers will enjoy unfettered access to data—including copyrighted data—that will improve their models. If the PRC’s developers have unfettered access to data and American companies are left without fair use access, the race for AI is effectively over.

Here, OpenAI argues that American policy should emulate the PRC by disregarding the rights of creators, thereby disqualifying any claim by Altman & Co. to promote democratic values. Further, OpenAI not only invents the term “fair use access” but then erroneously implies that U.S. national security operations need the “freedom to learn” from unlicensed creative works in order to do their jobs.

For Whose Eyes?

The combination of misstatements and emphasis on fair use prompts the question as to what policy OpenAI hopes to achieve. If OpenAI et al. want a statutory exception for ML, any rational petition to Congress for that change to the Copyright Act would not address fair use or suggest amendment to that part of the statute. Instead, we must assume that this message is aimed at the courts, who will decide whether and to what extent ML is exempted by fair use, including in cases where OpenAI is a defendant.

Presumably, one hope is to say the words “national security” enough times that (a) some party in the administration echoes this talking point; and/or (b) the courts feel reluctant to rule against AI developers on copyright infringement claims. In either case, AI is not one product. Development of security-related products or AI agents for the intelligence community does not rely upon the development of those generative AI models that are built substantially on ingesting millions of creative works without license for the purpose of producing artificial “creative” works.

More broadly, it is a tad rich to say that copyright rights are a barrier in the AI arms race while DOGE is assigned to hack its way through educational funding and shed experts in nearly every field. If America loses to China in this contest, it will most likely be attributable to our national retreat from excellence and fostering a culture where people refuse to see the difference between a Ford F-150 and a plastic piece of shit. If that’s the kind of public/private environment in which Americans are going to develop AI, don’t blame the artists and their copyright rights when it fails.


Gen AI & the Hubris of Data

data

In almost every discussion I’ve had with creators about generative AI (GAI), I have said that we should not overlook Big Tech’s capacity for exaggeration and total flops; it is possible that AI products may be the next Google Glass, rejected by cultural and/or economic forces hostile to their business models. For instance, last week, Digital Music News (DMN) announced a partnership between Amazon and the AI music product Suno for the next generation of Alexa+. DMN quotes Amazon’s Panos Panay, SVP of Devices and Services, thus:

Using Alexa’s integration with Suno, you can turn simple, creative requests into complete songs, including vocals, lyrics, and instrumentation. Looking to delight your partner with a personalized song for their birthday based on their love of cats, or surprise your kid by creating a rap using their favorite cartoon characters? Alexa+ has you covered.

The first time I read about Suno, it struck me as a gimmick that may not attract or sustain enough market interest to be profitable. Just the example cited above of making personalized birthday songs seems like the kind of thing a household can only do a few times before it gets stale. “Surprise your kid by creating a rap…” sounds like what the kids call “cringy.” But the broader question posed by Suno is whether consumers want “personalized” music, or whether the whole concept is just another hubristic statement about the power of data in the arts.

There have been many arguments presented by theorists and scholars that consumer data either obviates the need for creators’ rights (copyrights) or justifies substantially limiting those rights. The general premise is that if consumer data informs creators about what audiences want, this insight lowers the risk of investing in production. Lowering that risk, say the theorists, implies rethinking copyright protection—or even rethinking the nature and value of creators, as Professors Sprigman and Raustiala proposed in a paper I critiqued in 2018.

As argued in that criticism and elsewhere on this blog, the goal of artists and creators is not necessarily to give audiences what they want. While one cannot dispute the market value of certain “formulas,” there is substantial evidence that when producers strive too hard to meet audience expectations, audiences are often disappointed. In short, risk is inherent to creative expression and audience experience.

In every medium and every genre, consumers want to be surprised by artists, and shifting modes of expression reflect artists’ personal responses to contemporary events. In general, the most successful (i.e., meaningful) works are the ones we didn’t know we wanted until we had them. And once these works become part of the vernacular of our lives, we cannot imagine living without them.

By contrast, theories about the power of data as a predictor of creative success are founded in a techno-centric arrogance that, to me, is exemplified in a product like Suno. The idea that the consumer wants music tailored from a few instructions—“Alexa, make me a punk rock song about a guy who lost his job”—is typical of the kind of “innovation” many technologists would develop by ignoring the fundamental reasons we enjoy music in the first place.

As explored in this post about opera, I agree that music, and other expressive media, can be replicated by an AI to provoke emotional responses in human observers. Simply put, if a composer knows that minor chords have a certain effect on the Western listener, then an AI can follow the same rule to produce a “melancholy” tune. But the science of music and human psychology only explains our instinctive, animal-like responses to combinations of sounds while leaving out the rest of the experience.

We cherish our playlists for reasons that transcend the sounds’ effects on our brains—i.e., transcend mere taste. We relate and return to artists or their messages; we store and recall memories in the songs we replay; and we connect to friends and family through songs we have in common. Suno, outputting a bespoke song like a tepid cocktail, cannot provide any of that. On the contrary, it omits all those aspects of music that make us care about it, suggesting that its outputs are indeed gimmicks destined to become as dull as they are disposable when the short-lived novelty wears off. At least that’s my prediction.

There is, of course, a more insidious question worth asking—namely whether a product like Suno, especially when paired with Amazon, is less significant as a custom jukebox than it is as a new surveillance device. The use of personal data to micro-target and manipulate people and alter the course of major world events is not science fiction anymore. In that light, is it not conceivable that, say, 100 million people expressing their sentiments to an AI “music composer” will add color to data that will only exacerbate surveillance capitalism? That would be one hell of a way to pervert music.
