Is Site Blocking Finally Within Sight?

With all the talk about AI, one might think the problem of old-school media piracy has abated, but this week, the House Judiciary Committee held a hearing entitled Digital Copyright Piracy: Protecting American Consumers, Workers, and Creators. Although much of the conversation was familiar territory (i.e., the economic value of the creative industries and the cost of piracy), the legislative question in the room was whether the United States will finally adopt site blocking provisions as many other nations have done. In her testimony, Motion Picture Association (MPA) general counsel Karyn Temple stated:

…over the past decade, more than 40 countries, including leading democracies such as the U.K., much of Western Europe, Canada, Australia, India, Brazil, South Korea, and Israel, have enacted no-fault injunctive relief regimes that expressly authorize courts or administrative agencies to issue orders directing internet service providers (“ISPs”) and other online intermediaries to disable access to websites dedicated to piracy. Pursuant to these laws, courts and administrative agencies have disabled access to more than 90,000 domains used by over 27,000 websites engaged in blatant piracy after affording full due process.

“No-fault injunctive relief” and “full due process” are key phrases to keep in mind as Congress re-opens this discussion and the self-appointed defenders of the internet respond like Sauron’s orcs to the battle cry. After all, things got a bit heated “twelve years ago,” as noted by Rep. Zoe Lofgren in reference to the SOPA/PIPA legislation that was doomed by an extraordinary disinformation and fear-mongering campaign coordinated and funded by the internet industry. And although that story ought to be old news, the testimony of Matt Schruers, president of the Computer and Communications Industry Association (CCIA), rang the “Stop-SOPA” bell with statements like the following:

Content filtering by automation is not always effective or accurate. In particular, “off-the-shelf” filtering technologies tend to be focused only on specific classes of works, and cannot necessarily provide meaningful protection to content on sites whose users can create many different types of works. Automated tools are also unable to take into account context or nuance of individual uses, so may result in over-removal of non-infringing, fair uses. These false positives merit particular attention because any unjustified content filtering or takedown may suppress lawful expression.

That commentary is dog-whistling because it has nothing to do with the purpose of, or mechanisms inherent to, site blocking. Schruers is referring to imperfections in the DMCA notice-and-takedown provisions, exaggerating their effects on protected speech, and eliding the fact that a distinguishing aspect of a site blocking provision is that it requires a party to present evidence to obtain a court order and provides ample opportunity for both service providers and the allegedly infringing website to rebut the evidence. No party would be empowered to “automate” site blocking the way that, for instance, copyright owners can automate DMCA takedown notices.

Homing in on Schruers’s rhetoric, the highlight of the hearing was arguably Rep. Ted Lieu, who used his phone to access the pirate site F Movies, which he confirmed with Ms. Temple cannot be accessed in most of Europe. Emphasizing the fact that the F Movies site has been available to Americans since 2016, Lieu stated, “We’re trying to be reasonable here. This is such an unreasonable case. This is so clearly online piracy, copyright infringement, and you don’t want your organization, your members, defending something so blatantly unlawful and unreasonable. I just ask your members to block that site today.”

In response, Schruers first noted that the broadband providers were not testifying, but Lieu pressed on, “You cannot defend this. This is not defensible.” Schruers stated that his members are also content creators, that piracy is a shared concern with other content creators, and then reiterated the argument that the best remedy to piracy is more widespread legal availability of more content.

This rhetoric, dating back to Napster (1999), has not aged well in a time when, if anything, consumers often feel that there are too many channels requiring too many subscriptions. But that is a business narrative still evolving in the streaming market, and not one that justifies access to pirate sites. More to the point, the “more access” argument completely ignores the myriad reasons to finally adopt site blocking, even if the harm to content creators were minimal.

For instance, Rep. Lofgren resurfaced the prospect of prohibiting payment processors (i.e., credit card companies) from doing business with the pirate sites, but as film producer Richard Gladstein noted, the pirates’ revenue is not derived solely, if at all, from traditional credit card transactions. Although Mr. Gladstein did not go into much detail, he did mention the use of cryptocurrency in illegal trade of this nature. For her part, Rep. Lofgren failed to note that voluntary initiatives between copyright owners and payment processors to prevent known infringing sites from accessing payment networks have existed for years and only do so much to stifle piracy.

Moreover, as reported on this blog in several posts, Digital Citizens Alliance has provided extensive reports on the complex, malware-based, dark web market for which pirated media is merely used as bait. Thus, even if not a single professional in media production were financially harmed by piracy, the use of media piracy as a conduit to more dangerous forms of cybercrime is reason alone for Congress to finally block these sites from access to the U.S. market.

Of course, piracy is a threat not only to creators, but to everyone involved in bringing entertainment, including live broadcasts of sporting events, to fans. As Riché McKnight, general counsel for the Ultimate Fighting Championship, states in his written testimony, “UFC estimates that within hours of a single UFC event, hundreds of thousands of viewers may have already seen infringing versions of the event…UFC further estimates that due to piracy, multiple millions of dollars are diverted from legitimate purchases of UFC content each year.”

McKnight’s testimony also highlights a major problem with the DMCA: while it calls for service providers to take down infringing content “expeditiously,” there is no clear definition of that term. This is extremely problematic for industries broadcasting live sporting events, where the value of the broadcast may last minutes or seconds and then diminish greatly once the event concludes.

What About Felony Streaming?

In 2020, against the objections of the usual anti-copyright parties, the Protecting Lawful Streaming Act (PLSA) was passed, which made enterprise-scale piracy by means of streaming a felony rather than a misdemeanor. The question as to how effectively the Justice Department has used this provision was raised in the hearing, perhaps as a distraction from site blocking, but there are at least two reasons why the PLSA is not a complete remedy for piracy. One is of course the limited resources of the DOJ, and the other is that site blocking provisions exist to prevent access to the domestic market by sites operating outside U.S. jurisdiction.

As Chairman Darrell Issa noted at the end of the hearing, U.S. Customs and the International Trade Commission are empowered to stop the importation of physical goods that violate intellectual property law. As such, he asked, “Today, aren’t we just talking about finding the equivalent of what for two-hundred plus years, our Customs and other agencies have done when there is due process and entities such as Article III courts have reached a decision, the execution of that protection is done by our government, or on behalf of our government, by orders to those who participate in bringing things into the United States?”

Perhaps not the most concisely worded question, but it is exactly right. The U.S. bars illegal goods from overseas from entering the country, and there is no threat to constitutional principles for doing likewise when the means of “importation” is digital transmission. Moreover, as stated here many times, an infringing digital transmission of a work can cause immensely more damage than even thousands of physical bootlegs. Assuming the HJC proceeds toward site blocking legislation, I imagine we’ll hear some SOPA-like noise begin to rumble online. But based on my read of that hearing and the market overall, I wouldn’t expect that noise to make much difference this time.

Generative AI Is a Lot Like a Video Tape Recorder, No?

In my last post, I focused on the hypothetical fair use defense of generative AI under the principles articulated in the Google Books decision of 2015. In this post, I want to address another claim that has arisen, both on social media and in comments to the Copyright Office, namely that generative AI companies should be shielded against secondary liability for copyright infringement under the “Sony Safe Harbor.”

This refers to the 1984 Supreme Court decision in Sony v. Universal (the “Sony Betamax” case), holding that the video tape recorder (VTR) is legal based on two interrelated findings: 1) the fair use opinion that consumers had a right to “time-shift” the viewing of televised material; and 2) therefore, the VTR would be used for substantially non-infringing purposes. Thus, although some parties would inevitably use the VTR for infringing purposes, Sony Corporation could not be liable for contributory infringement in such instances.

Clearly, there are some bright, shining distinctions between the VTR and a generative AI. The VTR was not designed by inputting millions of AV works into a computer model, and its purpose was not to generate “new” AV works. Instead, those obsolete machines performed two very basic functions: they made videotape copies of AV material, and they displayed copies of AV material for a specific type of personal use.[1] As noted in the post about Google Books, the Court in Sony also had a fully developed product and a clearly defined purpose in the VTR. And again, this is not so with respect to understanding the purpose of a given generative AI.

I believe the novelty (and even the uncertainty) of the AI’s purpose is fatal to the argument that generative AI companies are necessarily shielded by the “Sony Safe Harbor.” This is because in Sony, the anticipation of substantially non-infringing use rests on the novel “time-shifting” notion introduced into the fact-intensive fair use finding. In other words, “time-shifting” was a principle specific to the technology at issue, and no analogous concept lurks anywhere in the purpose of a given AI, let alone all AIs still in development. Imagine if Sony Corp. had walked into court with a box of assembled electronic parts, declared that they were not quite sure what the box could or would do yet (though it might distribute homemade copies into the market!), but that they would really like a fair use decision and liability ruling in their favor.

Non-Infringing Use Under Different Rationales

To be clear, it is plausible, even reasonable, to expect that the majority of outputs by a generative AI are, or will be, non-infringing. In fact, I believe this is one of the pitfalls when it comes to hoping that copyright can address the presumed threat of AI outputs: the substantial similarity test, used to find that Work A infringes Work B, is thrown into a doctrinal tailspin. For example, when a person knowingly copies a work, this fosters a strong claim of infringement, but independent creation is a non-infringing act. And then, there are shades in between willful infringement, innocent infringement, and non-infringement, depending on the facts of a particular case.

In addition to copyright’s limiting doctrines, which allow myriad “similar” works to coexist without legal conflict, I predict that generative AI has the potential to warp the evidentiary foundations necessary to a substantial similarity test to prove infringement. If that is correct, it may be one rationale for predicting widespread non-infringing use, but it is highly distinguishable from the foundations for the “Sony Safe Harbor.” Meanwhile, the consideration of secondary liability (as with fair use) depends substantially on the purpose of the technology at issue—and that purpose remains unclear.

The mundane, mechanical VTR only potentially threatened the “making available” rights for works produced and owned by creators. This is not remotely comparable to a computer model “trained” with millions of protected works for the purpose of enabling that computer model to produce new “works.” To paraphrase my brief comments to the Copyright Office, if a particular work goes into the machine and a potentially infringing copy of that work comes out of the machine, I do not believe there is any authority which broadly shields the developer from liability.

With that example in mind, though, it is worth noting that a code-based service, unlike a physical electronic device, can be revised concurrent with delivery to the market. Thus, unlike Sony and its Betamax, the AI developer looking to limit liability for copyright infringement has the opportunity (dare we say obligation?) to make every effort to design and continually update a system to avoid copyright infringement. This may entail licensing materials used to “train” a generative AI and/or ongoing tweaking of the algorithm to avoid infringing outputs. Either way, if the developers don’t want to build these kinds of safeguards for the most revolutionary tech of 2023, surely they cannot be allowed to hide behind a liability shield established in 1984 for a box now collecting dust in the attic.


[1] They also frustrated many consumers who tried to set the clocks, but that’s another matter.


The Generative AI Fair Use Defense Under Google Books

After the Supreme Court’s decision in AWF v. Goldsmith restored what many of us view as common sense to the fair use doctrine of transformativeness, the flurry of litigation against AI developers will test the same principle in a different light. As discussed on this blog and elsewhere, caselaw has produced two frameworks for considering whether the “purpose and character” of a use is transformative. One focuses on differences in expressive elements, like the use of Goldsmith’s photograph to make Warhol’s silkscreen; and the other considers a use made for a unique purpose, like the millions of scanned books used to produce the Google Books search tool.

In Warhol, the Court affirmed that transformative expression must contain some element of “critical bearing” (i.e., comment) upon the work(s) used. This concept, tied to the different character of a work, is distinguished from the use of copyrightable works to create a tool or product that may be considered transformative because it is novel and beneficial for society. Notwithstanding the possibility that generative AI may prove to be harmful to society, the copyright question of the moment is whether the use of many millions of protected works to “train” these models is transformative under the same reasoning applied in Authors Guild v. Google (2015), the Google Books case.

Because the Google Books search tool could only be developed by inputting millions of digitized books into the database, the argument being made is that this is obviously analogous to ingesting millions of protected works for AI training. And certainly, no one could doubt that generative AIs are novel, even revolutionary. But this may be where the comparisons end under fair use factor one, which considers the purpose of a use, inherent to which is a “justification for the taking.”[1]

The factor one decision in Google Books turns substantially on the court’s finding that the search tool provides information about the works used. “…Google’s claim of transformative purpose for copying from the works of others is to provide otherwise unavailable information about the originals,” the opinion states. While Google Books “test[ed] the boundaries of fair use,” the court held that the search tool furthered the interests of copyright law by providing various new ways to research the contents of books that would otherwise be impossible. Although unstated (because it would have been absurd), the recipients of the information provided by Google Books were/are human beings. And especially if some of those human beings use the information obtained to produce and/or engage with expressive works, the finding of fair use fulfills copyright’s constitutional purpose to “promote progress.”

Generative AI developers may try to argue that the use of creative works for training serves an “informational” purpose, but unlike Google Books, the information obtained from the ingested works only “informs” the machine itself. A generative AI does not, for instance, provide the human user with new ways to learn about Renaissance painting (or point to Renaissance works) but instead trains itself how to make images that look like works from the Renaissance.[2] Setting aside the cultural debate about the value of such tools, the purpose of the generative AI is clearly distinguishable from the reasoning applied in Google Books.

As discussed in an earlier post, a consideration of AI under fair use should turn on the question of promoting “authorship,” lest the courts become distracted by the broadly innovative nature of these systems—especially for any purpose outside the scope of copyright.[3] In that post, I argued that generative AIs do not promote “authorship,” and I would die on that hill, if the developers’ expectation is that these tools will autonomously generate “creative” works without any human involvement.

For instance, if “singer/songwriter” Anna Indiana is a primitive example of what’s to come—and my understanding is that this is exactly what the AI models are designed to do—then the “purpose” of these systems is not to promote authorship, but to obliterate authorship by removing humans from the “creative” process. As such, the fair use defense cannot apply because without the element of authorship, the consideration is no longer a copyright matter.

On the other hand, as stated in my comments to the Copyright Office, it is conceivable that a human author might “collaborate” with an AI tool to produce a work that meets the “authorship” threshold. For instance, by using a set of prompts that articulate sufficient creative choices in the production of a visual work (or by uploading one’s own work and using an AI tool to modify it), one can make a reasonable argument that this constitutes “authorship” under copyright law. This is one potential purpose of generative AI, and one which could favor a finding of transformativeness under similar principles articulated in Google Books.

But Google Books did not present the court with so many unknown, relevant questions of fact.

The purpose of the Google Books search tool was clearly defined and fully developed when that case was decided in 2015. By contrast, fair use defenses of AI today are presented on behalf of technologies whose development is nascent and exponentially dynamic. Simply put, we do not know yet whether a particular generative AI will promote authorship or become a substitute for authorship—the former being favorable to a finding of fair use, the latter being fatal to such a finding. Here, proponents may argue that so long as there is a mix of uses, resulting in both authored and un-authored outputs, this is sufficient to find the purpose of a given AI transformative, but it seems likely that the current docket of cases will be decided before enough determinative facts can be known.

For now, it is worth remembering that sweeping statements alleging that generative AI training is “inherently fair use” are anathema to a doctrine that rejects such generalizations. Fair use remains a fact-intensive, case-by-case consideration, and one of the many difficulties with AI is that relevant facts are not only evolving, but they describe technologies unlike anything that has been examined under the fair use doctrine to date.


[1] Citing Campbell, informing both Google Books and Warhol.

[2] I recognize that this is an oversimplification of what the AI can do.

[3] i.e., AI’s potential applications in areas like medicine or security should be dismissed as irrelevant to a fair use consideration of generative AIs that make “creative” works.
