Gen AI & the Hubris of Data


In almost every discussion I’ve had with creators about generative AI (GAI), I have said that we should not overlook Big Tech’s capacity for exaggeration and total flops. It is possible that AI products may be the next Google Glass due to cultural and/or economic forces that reject their business models. For instance, last week, Digital Music News (DMN) announced a partnership between Amazon and the AI music product Suno for the next generation of Alexa+. DMN quotes Amazon’s Panos Panay, SVP of Devices and Services, thus:

Using Alexa’s integration with Suno, you can turn simple, creative requests into complete songs, including vocals, lyrics, and instrumentation. Looking to delight your partner with a personalized song for their birthday based on their love of cats, or surprise your kid by creating a rap using their favorite cartoon characters? Alexa+ has you covered.

The first time I read about Suno, it struck me as a gimmick that may not attract or sustain enough market interest to be profitable. Just the example cited above of making personalized birthday songs seems like the kind of thing a household can only do a few times before it gets stale. “Surprise your kid by creating a rap…” sounds like what the kids call “cringy.” But the broader question posed by Suno is whether consumers want “personalized” music, or whether the whole concept is just another hubristic statement about the power of data in the arts.

There have been many arguments presented by theorists and scholars that consumer data either obviates the need for creators’ rights (copyrights) or justifies substantially limiting those rights. The general premise is that if consumer data informs creators about what audiences want, this insight lowers the risk of investing in production. Lowering that risk, say the theorists, implies rethinking copyright protection, or even rethinking the nature and value of creators, as Professors Sprigman and Raustiala proposed in a paper I critiqued in 2018.

As argued in that criticism and elsewhere on this blog, the goal of artists and creators is not necessarily to give audiences what they want. While one cannot dispute the market value of certain “formulas,” there is substantial evidence that when producers strive too hard to meet audience expectations, audiences are often disappointed. In short, risk is inherent to creative expression and audience experience.

In every medium and every genre, consumers want to be surprised by artists, and shifting modes of expression reflect artists’ personal responses to contemporary events. In general, the most successful (i.e., meaningful) works are the ones we didn’t know we wanted until we had them. And once these works become part of the vernacular of our lives, we cannot imagine living without them.

By contrast, theories about the power of data as a predictor of creative success are founded in a techno-centric arrogance that, to me, is exemplified in a product like Suno. The idea that the consumer wants music to be tailored from a few instructions (“Alexa, make me a punk rock song about a guy who lost his job.”) is typical of the kind of “innovation” many technologists would develop by ignoring fundamental reasons we enjoy music in the first place.

As explored in this post about opera, I agree that music, and other expressive media, can be replicated by an AI to provoke emotional responses in human observers. Simply put, if a composer knows that minor chords have a certain effect on the Western listener, then an AI can follow the same rule to produce a “melancholy” tune. But the science of music and human psychology only explains our instinctive, animal-like responses to combinations of sounds while leaving out the rest of the experience.

We cherish our playlists for reasons that transcend the sounds’ effects on our brains, i.e., that transcend mere taste. We relate and return to artists or their messages; we store and recall memories in the songs we replay; and we connect to friends and family through songs we have in common. Suno, outputting a bespoke song like a tepid cocktail, cannot provide any of that. On the contrary, it omits all those aspects of music that make us care about it, suggesting that its outputs are indeed gimmicks destined to become as dull as they are disposable when the short-lived novelty wears off. At least that’s my prediction.

There is, of course, a more insidious question worth asking, namely whether a product like Suno, especially when paired with Amazon, is less significant as a custom jukebox than it is as a new surveillance device. The use of personal data to micro-target and manipulate people and alter the course of major world events is not science fiction anymore. In that light, is it not conceivable that, say, 100 million people expressing their sentiments to an AI “music composer” will add color to data that will only exacerbate surveillance capitalism? That would be one hell of a way to pervert music.


Photo by: Cm2012

Music Making Gen AI: A Deeper Dive into Fair Use


In February 2023, I argued that using copyrighted works for the purpose of training generative artificial intelligence (GAI) products is not fair use. My view in that post was, and remains, that because the purpose of copyright law is to promote authorship, and authorship is human as a matter of doctrine, a purpose which replaces authorship is facially antithetical to copyright’s constitutional foundation. So, because a finding of fair use should, as a matter of law, further copyright’s purpose, the GAI developer’s defense under that exception is invalid.

That said, I assume the courts will not rule on this threshold, constitutional question at summary judgment and will instead conduct fair use analyses in the first cases that proceed to at least a bench trial. After one or two outcomes, if favorable to plaintiffs, we will likely see a lot of settlements because so many of the AI cases alleging mass infringement for the purpose of machine learning (ML) present the same legal questions.

Turning to the recent lawsuits filed by record labels UMG, et al. against GAI developers Udio and Suno, both complaints anticipate the fair use responses to come and, to an extent, imply the doctrinal view articulated above. Because the two complaints are nearly identical in substance, I’ll stick to Udio because the suit is filed in the Southern District of New York (SDNY), and the Second Circuit is where nearly all—if not all—of the relevant fair use case law has been active or decided. Odds are, the court in the First Circuit, which has a comparatively thin copyright record, will follow the Second Circuit’s lead in resolving Suno.

As discussed in my first post about these cases, the defendants seem unable to present a plausible claim of non-infringement and even signaled that they have planned to argue fair use in the lawsuits they knew were coming. All the action will be focused on factor one, part one (whether the use is “transformative”) and on factor four (potential harm to the market for the works used).

To dispense with factors two and three, the nature of the work(s) used and the amount of the work(s) used respectively, these clearly tilt in favor of the plaintiffs. The sound recordings used for ML are highly expressive in nature (factor two); and by all evidence and inferences presented, it seems clear that the defendant copied whole works (and many of them) into the AI model (factor three). One way the use of whole works could swing back to favor the defendant would be a finding that the purpose of the use under factor one is, indeed, transformative.

To get there, I believe the court would have to find transformativeness under its precedent in Google Books, but in addition to the court itself describing that decision as the outer boundary of fair use, the Supreme Court in Warhol may have at least sharpened, if not narrowed, that boundary. As a factual matter, a GAI like Udio is nothing like Google Books. The latter feeds whole books into a system for the purpose of creating a research tool, while the former feeds whole sound recordings into a system for the purpose of producing other sound recordings—several of which have been presented in evidence as substantially similar to famous sound recordings.

Considering Udio Under Fair Use

As mentioned, the focus will be on factors one and four, which is not uncommon, but these cases highlight the interplay between the two factors. Factor one asks the purpose of the use, including whether that purpose is commercial; and factor four asks whether the use threatens the market value for the work(s) used. Thus, if a court finds under factor one that a use serves a “substitutional” purpose, this suggests that the use will unavoidably cause harm to the market value of the works used under factor four. This is what the labels argue, but a product like Udio does imply new territory for a fair use consideration.

Factor one asks two interrelated questions: whether the purpose of the use is transformative, and whether that purpose is commercial in nature. Commercial use tilts away from fair use but is not determinative, and transformativeness tilts toward fair use, but is also not determinative. In fact, the Supreme Court decision in Warhol reversed a trend whereby transformativeness too often carried the entire fair use analysis. For instance, Udio’s failure to license the works used for ML is itself a potential market harm under factor four. Thus, even if Udio’s purpose were held to be transformative, its commercial purpose would split factor one, and the rest of the fair use factors would likely still favor the plaintiffs.

But, as the labels’ complaint states, “[Udio is] far from transformative, as there is no functional purpose for Udio’s AI model to ingest the Copyrighted Recordings other than to spit out new, competing music files.” This is properly framed in the context of what “transformative” means in copyright law. Transformativeness is not about technological novelty or even innovation that promises to “make the world better” and so on. Notwithstanding the hyperbole in many such claims by various developers, the transformative question in fair use focuses on distinction of purpose from the works used.

As the quote above indicates, Udio ingests (i.e., copies) sound recordings for the purpose of making other sound recordings. And the purpose of both sets of sound recordings is, generally and presumably, listening pleasure for consumers. Even if none of the music produced by Udio were substantially similar to any of the music that went in, the labels contend that the overall purpose is holistically substitutional for all the recordings used to create the product. Udio used the music of human artists to “make” music without human artists, which is a purpose far beyond the Google Books boundary of providing a research tool to humans, including some who will be authors of works.

In response, Udio can argue that the purpose of its product is to produce a plethora of “new” music, which may indeed threaten to replace artists, but which is no more a substitute for the works used than a Beyoncé song is a substitute for a Taylor Swift song. This is a tricky moment for copyright, which protects authors’ rights by protecting the use of their property. For instance, if no song ever comes out of Udio that sounds like a copy of an existing song, or if Udio can show that the majority of songs output are “new,” it could argue that its purpose is not substitutional under factor one.

Additionally, if Udio could show that its purpose is substantially providing a tool for would-be music creators, its claim to being “transformative” would be stronger under Google Books. But based on the reported function and market objectives of both Udio and Suno—i.e., mass-market products enabling any consumer to “make music” with a few basic prompts—the “tool” claim, if it were made, seems unpersuasive.

Let me interject that tech developers and copyright antagonists often conflate the economic concept of “creative destruction” with transformativeness, arguing that “copyright stifles progress.” While I personally question whether Udio et al. necessarily represent progress as a cultural matter, even if Joe Schumpeter himself would agree that technological replacement of human music makers is “creative destruction,” that prospect anticipates the nullification of copyright law as a relic of impliedly obsolete human authorship. As such, it would seem preposterous for a court to find that an affirmative defense to infringement should be applied in a manner that would cause copyright law itself to implode.

None of this is to say, as indeed the complaint makes clear, that peaceful coexistence between human authors and GAI cannot come to pass. Where GAI may be used by the human creator to make an expressive work of her own mental conception, the AI product has a much stronger claim to promoting the progress of authorship. But in the case of these music making products, that does not appear to be the intent—either by design or business model. And so, to reprise the doctrinal assertion I advocate, the Google Books opinion itself states:

Courts thus developed the doctrine, eventually named fair use, which permits unauthorized copying in some circumstances, so as to further “copyright’s very purpose, ‘[t]o promote the Progress of Science and useful Arts.’” [Emphasis added]

On that basis, the Second Circuit should find that a use of protected works which is holistically substitutional for human authorship does not further the purpose of copyright and is, therefore, barred from presenting a valid fair use defense.



Major Record Labels Sue Gen AI Devs Suno and Udio

The most prominent copyright lawsuit against Generative AI (GAI) to date dropped yesterday when the major record labels filed complaints against developers Suno and Udio in the District of Massachusetts and the Southern District of New York respectively. This is going to be one to watch, not just because of the size of the plaintiffs and the potential for significant damages, but because the complaints, in my view, present an intriguing combination of the legal questions addressed in most, if not all, of the other lawsuits filed against GAI companies.

For instance, in NY Times v. OpenAI and Concord et al. v. Anthropic, both plaintiffs make a compelling prima facie case for copyright infringement by presenting large bodies of evidence showing either literal copies or substantially similar material output by the defendants’ systems. This is distinct from some of the visual artists’ lawsuits against Gen AIs like Midjourney and DALL-E, where the allegations of infringement entail more inference than direct evidence of specific works copied. Not that the visual GAIs don’t output literal copies of protected works (they do), but I do not believe a plaintiff has yet filed suit with a body of that kind of evidence.

Interestingly, the evidence presented by the record labels to show that their protected sound recordings were used to train Suno and Udio encompasses a combination of substantially similar copies in the outputs, a measure of inference, and a number of self-incriminating statements by the defendants themselves. This includes the unwise assertion made by every GAI developer that machine learning (ML) is fair use, but I’ll come back to that.

Regarding direct evidence, both complaints cite several examples whereby, with a few general prompts, the systems will output music that is substantially similar to famous songs. “These similarities are further reflected in the side-by-side transcriptions of the musical scores for the Suno file and the original recording. These similarities are only possible because Suno copied the Copyrighted Recordings that contain these musical elements,” the Suno complaint states.

See cover image from plaintiffs’ transcriptions. “Red markings in the transcriptions indicate notes that are the same as the original in both pitch and rhythm, where orange markings indicate notes that use either the pitch or the rhythm of the original, but not both.”

Akin to the NYT and Anthropic cases, the logic holds that if this material comes out of the system, then it was obviously fed into the system. More broadly, inference tells us that millions of sound recordings were used in ML to enable Suno and Udio to so effectively produce a wide variety of music in so many styles. And that’s where the self-incriminating comments come into play.

As has been reported elsewhere, Suno investor Antonio Rodriguez is quoted in the complaint as saying, “…honestly, if we had deals with labels when this company got started, I probably wouldn’t have invested in it. I think they needed to make this product without the constraints.” Yikes. Notwithstanding the questionable claim that copyright infringement is necessary for GAI development, Rodriguez’s statement reads as an admission that of course they willfully infringed copyrights, and that he went into the venture knowing he would help finance litigation.

Similarly, Udio’s CEO David Ding is quoted saying that his system needs to “train on a large amount of publicly-available and high-quality music…[the] best quality music that’s out there…obtained from the internet.” As the complaints note, “publicly-available” is a term the GAI companies like to use in PR statements, but this is not synonymous with the “public domain.” Most in-copyright works are publicly available, and Ding’s statement that sound recordings were “obtained from the internet” is, again, acknowledging that unlicensed copying—and a lot of it—occurred for the purpose of training the Udio model.

All Eyes on Fair Use

When the first Gen AI lawsuits dropped, I thought the developers might try harder to claim that no copyright infringement occurs on the basis that what’s happening inside their machines does not “copy” protected works. All that nonsense about machines “learning” the same way human artists learn, when combined with an invisible or complex process, seemed to be leading toward that argument in court. Instead, whether the evidence of copying is too obvious, or the developers are too hubristic, it appears—certainly in this case—that the Gen AI companies are stipulating to a valid infringement claim and jumping straight to a presumption that they will be rescued by a fair use defense.

As mentioned above, and as the complaints note, the assertion of fair use is itself a tacit admission that a prima facie claim of copyright infringement exists. While it will only be fun to unpack the real fair use responses when Suno and Udio submit those documents to the courts, the labels’ complaints already present rationales as to why all four factors disfavor a finding of fair use. Going forward, the fair use discussion will emphasize factors one and four—the purpose of the use and the potential market harm to the works used, respectively.

The most compelling discussion will address the extent to which the courts find that Suno and Udio’s use of the works serves a “transformative” purpose under factor one. Not only will this consideration have major implications for every Gen AI developer, but it will also be the ideological hill on which the pro- and anti-copyright forces will clash. The ongoing (if repetitive) debate that pits alleged progress against allegedly outdated copyright law may be won or lost on the transformative test in these cases.

On that subject, both complaints use the language “far from transformative” to describe Suno and Udio, and I agree. That Gen AI is novel, or even impressive, does not mean these products make transformative use of protected works in a manner that furthers the purpose of copyright law, which is to foster, not replace, human authorship. This essential consideration for finding transformativeness is tacitly acknowledged by the Gen AI lobbyists and cheerleaders who insist that “copyright law must change” for the sake of Gen AI. If the law “has to change,” then clearly, the law does not support the conduct at issue. These and other contradictions will be exciting to follow as these cases proceed.