Decoder podcast: AI could go extinct because fair use is whimsical

AI extinction

It was hard not to dismiss the headline posted by The Verge: “How AI copyright lawsuits could make the whole industry go extinct.” The article summarizes a new Decoder podcast hosted by Nilay Patel, joined by Sarah Jeong to discuss the copyright lawsuits filed against generative AI developers. Most of the program is devoted to a discussion of fair use, which is reasonable because that’s likely how these cases will be decided. It’s clear that Patel and Jeong view copyright as a barrier to technological innovation, but when people trained in the law misrepresent the law as purely whimsical, it is counterproductive to the conversation.

I could critique nearly every segment in the podcast, but as that would be both long and tedious, I selected a few highlights for this post. Setting the more-hip-than-helpful tone of the program, Patel (who went to law school) describes fair use as a “vibes based” doctrine. Jeong (also law school) echoes the sentiment when she says that litigation against generative AI has “Napster vibes to it,” teeing up her thesis statement: “When Napster happened to the law, companies went bust; entire industries went bust; copyright changed forever in a way that was not great; it was an extinction level event; and AI has a similar thing going on there.” Patel then asserts that Napster went to the Supreme Court—it did not—and that the Court “made some changes to copyright law.” Seriously? “Made some changes” is not how people with legal training talk about court rulings, even when they disagree with the outcome.

The next comment that caught my attention was Patel saying that “fair use is not deterministic” as a doctrine. He’s right, but in context, the listener will take him to mean that fair use is unpredictable to the point of capriciousness. Although a good attorney will decline to predict the outcome of any case, a thoughtful copyright expert is unlikely to agree that fair use findings are a “coin toss,” as Patel puts it. In fact, the choice of the word deterministic provokes the rebuttal that anticipating a fair use outcome is more accurately described as probabilistic, which is funny because that’s also how generative AI works.

If a defendant asks an attorney to handicap the likelihood of prevailing on fair use, the attorney’s response should be a reasonable prediction based on how closely the facts of the present case resemble fair use findings in the circuit of jurisdiction. Although Patel alludes to this analysis, he overlooks the fact that counsel could describe a probability outcome, which is precisely how a generative AI produces its outputs. If one prompts a visual AI to generate an image of a dolphin drinking a Slurpee, the output is the machine saying, “Based on the available data, this image is probably a dolphin drinking a Slurpee.” So, of all defendants, AI developers should grasp the nature of fair use case law.

Jeong echoes the idea that fair use considerations are erratic by alleging that the “Court changed copyright law after Napster,” referring to the Ninth Circuit’s 2001 finding that the P2P music filesharing platform was not shielded by fair use. Here, she argues that the Supreme Court’s fair use finding in the Sony “Betamax” case (1984) expressed a philosophical adaptation of copyright law to foster new technology but that this general view was reversed when the Ninth Circuit decided against Napster—and then when the Supreme Court ruled in 2005 that the filesharing platform Grokster could be liable for copyright infringement.

Although one cannot reasonably argue that ideology never skews the courts, Jeong elides the many factual and legal distinctions between the VCR and filesharing platforms and, by extension, the distinctions between those technologies and generative AI. Her declaration that “copyright was changed” after Napster and Grokster is unfounded, as the Court itself notes that Grokster was its second case considering contributory liability for copyright infringement—Sony being the first. Two cases, twenty-one years apart, addressing the same legal question presented by substantially different technologies is not a basis for claiming that the law was “changed forever” by the outcome in the latter case.

Holding the opinion that copyright stifles technological innovation does not excuse misrepresenting the courts as rolling dice to rule on fair use. For instance, in Grokster, the Court directly addresses the balance between copyright and technological innovation thus:

The more artistic protection is favored, the more technological innovation may be discouraged; the administration of copyright law is an exercise in managing the trade-off…. The tension between the two values is the subject of this case, with its claim that digital distribution of copyrighted material threatens copyright holders as never before, because every copy is identical to the original, copying is easy, and many people (especially the young) use filesharing software to download copyrighted works.

Does that describe the technological function of the VCR? For those who’ve never used a VCR, the answer is No. The home video tape recorder, a relic of pre-internet life, functioned nothing like a filesharing platform, which facilitates mass copyright infringement on a global scale. Fair use is a fact-intensive inquiry, and “technology” is not a monolith. The leap from the VCR to generative AI is roughly the distance between the telegraph and the iPhone, and it is unhelpful, even irresponsible, to obscure so much factual detail behind a conversation about the courts’ alleged randomness on copyright and fair use.

Everything cited above was expressed in the first 5-6 minutes of the podcast. Tempted not to listen any further, I winced as both Patel and Jeong proceeded to make some astonishing remarks about the four-factor fair use test in regard to generative AI. Again, a couple of highlights stand out.

On factor two, nature of the work used, Patel says, “Factor two is whatever the judge thinks it is.” Then, a few seconds later, he suggests that “[i]f the judge decides they don’t like the New York Times that day,” this will determine whether factor two tilts in the Times’s favor. NYT v. OpenAI is before the Southern District of New York in the Second Circuit, which holds the largest trove of copyright case law of any circuit in the country—including several major fair use cases. If Patel or Jeong wants to handicap the court’s findings based on that case law and then offer their own views of what they think is right, fine. But the implication that the court is just going to wing it is ridiculous.

Jeong does not push back on Patel’s coin-toss implication but says the “dial is in the middle” on factor two, which she reasonably (if not very clearly) argues because the Times contains both protectable expression and unprotectable factual material. But then comes the biggest spit-take in the program, when Jeong predicts that factor four, potential market harm to the work used, weighs against the AI developers because of the Supreme Court decision in Warhol. She states, “We have not seen that heavy an emphasis on factor four before.” Notwithstanding the fact that prior to the Campbell decision (1994), many experts would say that factor four was the most determinative factor in fair use jurisprudence, Warhol was unequivocally NOT a factor four case. As the opinion states:

In this Court, the sole question presented is whether the first fair use factor, “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes,” §107(1), weighs in favor of AWF’s recent commercial licensing to Condé Nast. [emphasis added]

From her comments about Warhol, Jeong confuses the question of “substitutional purpose” with the question of “market harm substitution,” which are weighed under factors one and four respectively. It is true that where the court finds substitutional purpose, market harm substitution is more likely to be found, but a footnote in Warhol explains the distinction: “While the first factor considers whether and to what extent an original work and secondary use have substitutable purposes, the fourth factor focuses on actual or potential market substitution. They are two separate, albeit interdependent, questions.” I do not know whence Jeong gets the idea that Warhol was a factor four case, let alone an unprecedented outcome in its emphasis on that factor.

Fair Use is a Fact-Intensive Inquiry

Generalizations like those articulated in the Decoder podcast sidestep the relevant facts about a given technology, what it does in the context of specific legal questions, and why the technology may or may not be socially valuable. “Artificial intelligence” encompasses a wide range of development, some of which is promising, some of which is questionable, and all of which has been identified as potentially dangerous without proper oversight. As for generative AI in the creative industries, if Jeong is right that the copyright lawsuits pose an existential threat to those companies, so what? It is not clear that the world needs machines to make images of dolphins drinking Slurpees.

As discussed in this post, AI developers may have taken a gambler’s approach to fair use, and if their business plan included liability at the scale of mass copyright infringement, that’s a risk they chose to take. If any of those companies fail because of that liability, it will not be the result of whimsically applied or tech-hostile copyright law, or indeed the fault of the creators whose rights are infringed in the process of machine learning. Moreover, it is certainly not incumbent upon creators to abdicate their rights and get out of the way because “innovation” is happening. Fair use considerations in generative AI lawsuits may result in some novel opinions, but if influencers like Patel and Jeong are going to misstate case law and describe the courts as casinos, then one must wonder why they mention their legal credentials in the first place. After all, anyone can flip a coin.

NYT tech editor Jeong sticking copyright criticism where it doesn’t belong.

Holy whiplash segues, Batman.  There I was reading a perfectly interesting article by Sarah Jeong on the potential hazards of selling one’s personal data, when she took an incomprehensible—if mercifully brief—detour into the realm of copyright law.  She presents a reasonable enough case that the companies now offering to help us “broker” our private data (e.g. health information) may be counting on the fact that, “There’s no legal property right to personal data.  Once personal data is gathered, it’s out there for anyone to buy and sell. At the moment, there are no legal grounds to demand compensation for use,” Jeong writes.

Fair enough.  It is certainly true that the whole prospect of selling private data, even if it were a good idea, does implicate a relatively novel legal framework.  And while I am personally inclined to agree with Jeong that the whole notion is fraught with hazards, I am at a loss to understand where she is going with this interjection …

“In any case, we already know what happens when property rights get slapped on information, because we’ve already done it, to some degree, in copyright law. 

“Giving people ownership of their creative expressions means they can buy and sell them on the open market. The risk is that an artist will wind up, like Taylor Swift, alienated from her own work because she no longer possesses the masters of some of her earlier recordings.”

Swift in late June stated publicly that she was very disappointed to learn that mega-star manager Scooter Braun will be acquiring Big Machine Label Group, which still owns her master recordings dating back to the start of her career.  Swift calls the prospect of being under contract to Braun her “worst nightmare,” and for the sake of this post, we will take her word that he is an “incessant manipulative bully” because digging into that backstory could not matter less to Jeong’s ham-fisted allusion to the supposed problem with copyright.  

Even more bizarrely, Jeong happened to pick an artist who has adamantly defended both her own rights and those of much smaller artists, and who told Rolling Stone in 2014, “Important, rare things are valuable. Valuable things should be paid for. It’s my opinion that music should not be free, and my prediction is that individual artists and their labels will someday decide what an album’s price point is. I hope they don’t underestimate themselves or undervalue their art.”  So, I’m just spitballing here, but maybe Swift did not recently do an about-face on the purpose of copyright, or even abandon all prospect of working with labels, so much as she was just saying she really does not like Scooter Braun.  

Turning to Jeong’s implications about the nature of copyright, it is clear that she should refrain from the topic altogether.  For one thing, copyright does not “slap property rights onto information.”  Quite the contrary.  There is in fact a long history of statutory development and case law making it very clear that information is not the subject of copyright.  Expression is the subject of copyright, but the way Jeong slaps these two sentences together makes it seem as though information and expression are the same thing—especially in the context of an editorial that is all about data, which bears no resemblance to expression.

At that point, I guess what Jeong is trying to say is that if we can own and sell our data, then, like Taylor Swift and her masters, we could wind up very unhappy about the party that buys the data.  I think that disappointment is almost a guarantee and that we should be shoring up statutes against privacy invasion rather than looking for ways to market our DNA profiles and whatnot.  But, that said, what in blazes does the unprecedented challenge of mass data collection and its privacy implications have to do with about three centuries (though I would argue more) of constructing a legal framework for authorial rights?  Not a damn thing.

Interestingly enough, the paper written by Samuel Warren and Louis Brandeis in 1890, which is widely considered the seminal American work articulating a right of privacy, actually turns to copyright law as a starting point.  Because there is no constitutional declaration of a right to privacy, Warren and Brandeis begin with the already long pedigree of copyright in unpublished works when they write, “From corporeal property arose the incorporeal rights issuing out of it; and then there opened the wide realm of intangible property, in the products and processes of the mind.”

Not only do most people, and certainly most creators, still feel that the products of the mind are a form of personal property, but this was the exact point of reference chosen by a pair of legal lions to make the case that a right of privacy actually exists.  Consequently, Jeong might want to consider the possibility that copyright law provides guidance for the protection of our personal data rather than a warning of what can happen if we become the “owners” of that data.  Or, if we’re looking for warning signs in historic property rights regimes, my friend Neil Turkewitz observes

“If property rights are the model, then Silicon Valley’s dismal track record on intellectual property rights is a giant red flag that simply vesting property rights is of little consequence to the extent that such property rights are essentially unenforceable — particularly for individuals. Since the dawn of the internet, notwithstanding their legal rights, creators and innovators have had to endure an avalanche of illegally available copies of their works online.”

So, maybe, as Warren and Brandeis noted, copyright does have something to teach us about privacy that is quite different from Jeong’s misguided assumptions. But what do I know?  I’m just spitballing.

In the News: Sarah Jeong, “Fake News”, & Fair Use

It’s another one of those weeks when there’s stuff happening faster than I can write about any one thing. So, here’s a summary of a few items of note …

Anti-Copyright Ideologue Named Tech Writer at NYT

Twitter lit up yesterday with accusations that The New York Times has named a “racist” to its editorial board, citing anti-white tweets made by technology writer Sarah Jeong, who is Asian. These complaints read like a lot of whinging nonsense, taking Jeong’s comments out of the context in which she was apparently responding (albeit ill-advisedly) to racist or sexist remarks directed at her. (God, I love Twitter for the way it brings out our better angels.)

What is notable about Jeong as the Times’s new “lead writer on technology” is that she is an anti-copyright ideologue, who has written various articles and posts in a familiar, ill-informed style akin to Cory Doctorow’s. In February of 2016, I wrote a fairly extensive response to several errors she made in a Motherboard editorial predicting that copyright law might enable the Chinese government to disappear the famous “Tank Man” photograph from the internet.  It’s still online, of course.

So, while I truly doubt Sarah Jeong is a racist and think the people labeling her as one should get a grip, I am equally skeptical that future NYT editorials on the intersection of technology and copyright will be well-balanced—or even accurate.

New Paper on Why People Share “Fake News”

Related to the above, I notice that the National Review site has two top stories featuring Sarah Jeong, the second of which is headlined “Yes, Anti-White Racism Exists.” This dumb and bogus narrative is what academic Alice E. Marwick would identify as a “deep story” in her new paper titled Why Do People Share Fake News? A Sociotechnical Model of Media Effects. Unable to fully answer that question yet, Marwick provides a complex, nuanced framework for further discussion, identifying socio-cultural factors that cannot be overpowered by solutions like fact-checking.

Although the volume of what Marwick calls problematic information is at present greater among the contemporary “right,” the contemporary “left” is by no means immune to the underlying reasons why people are apt to believe and spread “fake news,” hoaxes, and other forms of disinformation. I’m working on a longer post summarizing Marwick’s paper, but for those interested, her full paper is here.

TVEyes Files for Cert at Supreme Court

Filing a petition for Supreme Court hearing in its ongoing litigation with FOX News, TVEyes hopes to get another shot at presenting arguments that failed in the Second Circuit in February of this year. Eriq Gardner for The Hollywood Reporter writes, “TVEyes’ attorney tells the Supreme Court that the 2nd Circuit decision conflicts with precedent and ‘creates a circuit split over a question of exceptional importance, including the proper balance under copyright law between the interests of a copyright holder and the First Amendment right to criticize and comment upon the copyright holder.’”

There is no brief to review yet, but that statement alone, taken from a request for an extension to file, does not seem to bode well for the Supreme Court granting cert for a couple of reasons. The first, as detailed in this post, is that the same appellate court that ruled in favor of Google Books also drew sharp distinctions between that case and TVEyes (ergo, maybe not so much of a split). The second reason is that it is consistent with precedent to hold that the First Amendment rights of users of a service do not automatically make the service itself non-infringing. This is a chronic argument made by tech-industry players, and as described in this post, courts generally take a dim view of corporations that attempt to “stand in the shoes” of their customers.

I’ll be surprised if SCOTUS agrees to review this case, but if it does grant cert, expect a storm of amicus briefs to follow.

EFF Honors Itself With Its Own Award

In a July 30 announcement, the Electronic Frontier Foundation named Stephanie Lenz, creator of the “Dancing Baby” video, among the recipients of this year’s Pioneer Award. According to the announcement, “Stephanie Lenz’s activism over a home video posted online helped strengthen fair use law and brought nationwide attention to copyright controversies stemming from new, easy-to-use digital movie-making and sharing technologies.” Many of us will never experience the injustice of having a video removed and then restored to YouTube, but in that silent interval, when people could not watch Lenz’s baby boy dancing in the kitchen, her world—indeed the whole world—was just a little bit darker.

I wrote a post in October of 2016 summarizing the narrative of this decade-long EFFishing expedition; but suffice it to say this award-earning “activism” did not even begin as a fair use case; “Fair-Use Champion” Stephanie Lenz stated her own ambivalence about the video remaining on YouTube; the fair use/DMCA argument itself is razor thin; and I would bet anything that, beyond us copyright watchers, “nationwide attention” sounds something like this: Oh yeah, didn’t Prince sue some mom? And that didn’t even happen.

So, in the same way that Stephen Carlisle described Stephanie Lenz as the “nominal plaintiff” in Lenz v. UMG, it seems reasonable to call her the nominal recipient of this award, which should rightly go to the EFF’s own Corynne McSherry for Outstanding Achievement in PR Through Boondoggle Litigation.