Why Internet Archive is in Legal Trouble and Deserves to Be

My last post about the case Hachette et al. v. Internet Archive was angry. Moved by the compelling testimony author Sandra Cisneros wrote to the court, I was and remain pissed off at those who justify what amounts to enterprise-scale book piracy by dressing it up in progressive lingo and academic theory. Many amicus briefs, authored by familiar names in anti-copyright academia, have been filed in support of Internet Archive. I could pore over every one of those documents, but the only reason to do so would be my admittedly morbid and nerdy fascination with the way each author will try to argue that what IA is doing is already exempted by the Copyright Act. But I ain’t got that kinda time. And it ain’t necessary. Because it ain’t so.

The reason I opined in my last post that this case should be short work for the court is that IA’s arguments boil down to two defenses, both of which should be overwhelmed by the facts and relevant case law. Defense Number One is that the IA lending model called “Controlled Digital Lending” (CDL), a model of its own invention, falls within the exceptions already carved out for libraries by statute. And Defense Number Two is, of course, that the CDL model is fair use, beginning with the claim that it is “transformative” under factor one of the analysis.

CDL: It Ain’t on the Page

Regarding the first defense, the CDL theory may sound reasonable on the surface. A library buys or legally obtains a print copy of a book and scans it to make a digital copy (its own ebook). Then, in principle, the library loans the digital copy to one patron at a time and does not loan more digital copies than it has physical copies in its collection. Additionally, the CDL model asserts that the library should not loan physical and digital copies at the same time.
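For readers who think better in code, the owned-to-loaned rule described above can be sketched in a few lines. This is only an illustration of the theory as summarized here, not anything IA or any library actually runs; the class name `TitleLedger` and all of its fields and methods are hypothetical, invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class TitleLedger:
    """Hypothetical per-title ledger illustrating the CDL rule (an assumption,
    not any real system): total loans may never exceed print copies owned."""

    owned_print_copies: int
    print_loans: int = 0
    digital_loans: int = 0

    def copies_available(self) -> int:
        # A print copy out on loan and a digital loan each use up one owned copy.
        return self.owned_print_copies - self.print_loans - self.digital_loans

    def lend_digital(self) -> bool:
        # Refuse the loan when every owned copy is already accounted for.
        if self.copies_available() <= 0:
            return False
        self.digital_loans += 1
        return True

    def return_digital(self) -> None:
        # Checking the scan back in frees one owned copy again.
        if self.digital_loans > 0:
            self.digital_loans -= 1


ledger = TitleLedger(owned_print_copies=1)
first_loan = ledger.lend_digital()   # allowed: one copy owned, none on loan
second_loan = ledger.lend_digital()  # refused: the only copy is already out
ledger.return_digital()
third_loan = ledger.lend_digital()   # allowed again after the return
```

The point of the sketch is that the "controlled" part depends entirely on an honest count of owned copies and on refusing loans once that count is used up.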

But IA’s difficulty in defending CDL is twofold. First, there is nothing in the copyright statute that allows the practice; and second, it appears that IA does not even adhere to the boundaries of CDL, if it were allowed. Remember that what triggered this lawsuit was IA’s “National Emergency Library,” when it released over one million titles without restriction, using the pandemic as a rationale for doing so.

IA and its amici will take circuitous paths through Sections 108 and 109 and try to stitch together a rationale for a lending regime that was never anticipated by these sections in the law. In the first part (§108), carve-outs for libraries specifically exempt limited conduct like preservation, inter-library loan, certain research activities, etc. But nowhere do any of the exceptions even imply that a library may produce and distribute its own trade-pub ebooks for the sole purpose of bypassing the licensing models under which ebooks are currently loaned. Even more damning is the fact that Internet Archive is not a library under the terms of the statute, and the court may find that it does not even qualify for §108 exceptions, let alone that those exceptions encompass CDL.

As for Section 109, IA and its amici will try to argue that because the original, legal purchase of a physical copy extinguishes the rightsholder’s interest in that copy under the “first sale” doctrine, this somehow extinguishes the copyright rights prohibiting the reproduction and distribution of a digital book made from the same physical copy. This is fantasy. Pull a book off your shelf, scan it, and make it available to the public, and you will violate the reproduction right, the derivative works right, and the distribution right of the copyright owner. Engaging in precisely this activity at scale, as IA does, is normally described as enterprise piracy, not library lending.

But even if CDL were permissible by law, the allegations in the publishers’ motion for summary judgment (MSJ) about IA’s operations suggest that the organization is not even complying with the “controlled” part of the regime. The publishers’ MSJ avers …

Ironically, the many thousands of hard copy books IA obtains from defunct colleges or libraries will likely end up in “archive facilities in Richmond, CA,” which consist of large shipping containers owned by IA. Once locked away, upon information and belief, IA will make no effort to make the print books available to be read, like books in actual library collections. Instead, the print copies primarily exist to rationalize, or provide the predicate for, IA’s argument that there is a one-to-one correlation between print copies legitimately owned and their illegitimate ebook scanned copies.

So, is there any “control” proving that containers filled with books serve as the foundation for IA’s proper accounting of its digital lending? Even more bizarre is that IA allegedly asserts that its “ownership” of print books includes books on the shelves of “partner libraries.” Again, from the MSJ…

With respect to the Website’s titles for which the corresponding print books are allegedly stored at partner libraries, it defies reason that the partner libraries will have the wherewithal to faithfully and consistently remove a book from circulation each time it is borrowed on the Website, and put it back on the shelf when the Website version is checked back in.

You see where that’s going, right? If IA “partners” with enough libraries, it could then justify mass distribution of the ebooks it makes and, apparently, without any control whatsoever. So, even if a court held that the CDL concept falls within the exceptions in the Copyright Act (and this is highly unlikely), these are just two of the facts presented in the publishers’ MSJ indicating that IA is not in compliance with its own theoretical practice. And, naturally, the implications do not stop with books. After Brewster Kahle’s “New Library of Alexandria” swallows the entire commercial market for ebooks, then music, motion pictures, video games, etc. would quickly follow. So, assuming the argument that “CDL is legal” winds up skidding hard against the language and intent of the statutes, let’s talk fair use.

Stop Me if You’ve Heard This One

Internet Archive alleges that its conduct is fair use, which is amusing and should easily be denied based on more than one precedent in the same circuit. I say amusing because Internet Archive’s mission is a crusade predicated on the devout certainty that this part of its operation is permitted within the statutory exceptions for libraries. But just in case that argument fails, IA will plead fair use. I mean, yeah. It would be bad lawyering if they didn’t. But it’s still funny.

In the simplest analysis, IA’s conduct exceeds the boundaries established by the Second Circuit in Google Books and reaffirmed in ReDigi and TVEyes. The Google Books opinion begins with the statement, “This copyright dispute tests the boundaries of fair use.” And that was Judge Leval, who wrote the paper introducing the doctrine of “transformativeness” to the fair use analysis. There, the Google Books search engine, which necessitated digitizing millions of books, was held to be “transformative” under factor one of the analysis because it “added new utility” (i.e., a research tool the world did not have). But essential to that holding was the fact that Google Books does not make whole, in-copyright books available to the public. As the opinion states …

With respect to the first factor test, it favors a finding of fair use (unless the value of its transformative purpose is overcome by its providing text in a manner that offers a competing substitute for Plaintiffs’ books …

That “competing substitute” language is fatal to IA’s argument that it is “transformative” under the same “new utility” doctrine. In fact, because IA clearly provides a substitute for licensed ebooks, it cannot reasonably argue that it provides a new utility at all. It simply provides unlicensed digital books in lieu of licensed digital books. Indeed, enterprises more innovative than IA (e.g., TVEyes) have tried to argue “transformativeness” under the Google Books utility doctrine, and they have failed by the light of the same market substitute boundary.

When your only innovation is giving away for free that which the copyright owner intends to sell, there is nothing fair use offers as a defense. In the fair use analysis, there is always a strong interplay between factor one (purpose of the use) and factor four (potential market harm), but here the questions are almost identical because IA’s purpose is nothing more than market substitution.

So, Internet Archive will continue to make noise on Twitter and elsewhere. It will continue to portray itself as the underdog, standing in the shoes of all librarians against the juggernaut of the publishing industry. And it will continue to elide or distort the authors’ interest in the narrative. But as a legal matter, for the reasons stated, I think IA should lose, and lose big. And that will be just fine for real libraries because real libraries do not engage in the conduct alleged in this case. And that will be the subject of a future post.


Photo source by: Janpietruszka

What Kind of Writer Indeed?

In a recent post entitled What Kind of Writer Accuses Libraries of Stealing?, Maria Bustillos stakes out a wide swath of moral high ground in defense of Controlled Digital Lending (CDL). CDL is a theory that libraries are allowed, within the boundaries of U.S. copyright law, to scan physical copies of legally obtained books and then loan the digital copies to one reader at a time, controlled by technical measures to prevent theft or unlicensed distribution.

Conceived by legal scholar and librarian Michelle Wu (and advocated by library associations and anti-copyright ideologues alike), CDL looks reasonable on the surface but is actually more complicated than Bustillos et al. either recognize or are willing to admit. Nothing wrong with having an opinion, of course, but to pretend that the ebook market is not distinctive and then call anyone who points to the complexities “greedy and unethical” is just foot-stomping.[1]

In that spirit, Bustillos’s post is a response to a Twitter squabble that began with some pushback by Neil Turkewitz to her tweet praising the Internet Archive and defending CDL. Turkewitz tagged authors John Degen and T. J. Stiles along with the Authors Guild, which Bustillos refers to as summoning “a brigade,” and after describing her interactions with Stiles and the AG, she writes …

As a lifelong fan and beneficiary of libraries, as well as a working writer, I find the suggestion that libraries are trying to steal from writers very very offensive. I see no evidence for it. CDL doesn’t “devalue the labor of working authors” in the slightest. It protects and helps us, by codifying simple rules for preserving our work, and making it legally available to the public to try out through libraries.

Based on that paragraph, I would assume that Bustillos is unaware of, rather than intentionally obfuscating, the much broader copyright narrative in which CDL is a small fragment. Certainly, she reveals more attitude than understanding when she writes that the Authors Guild litigation against Hathi Trust (2013) is “at heart” the same issue as the lawsuit filed by the publishers against Internet Archive (2020). Because the cases are not comparable.

Hathi Trust created a searchable database and made certain works accessible to persons with disabilities but did not make whole works under copyright available to the general public. By contrast, IA is being sued because it arbitrarily distributed over a million in-copyright books without license or even the controls called for in CDL. The irony here is that if Bustillos, or anyone else, wants to assert that CDL is narrow and reasonable, IA is the last organization to cite as an ally because it did not even respect the boundaries of CDL—and because IA founder Brewster Kahle’s anti-copyright vision is expansive. But Bustillos reveals that perhaps her sights look beyond CDL as well when she writes …

The trend started with software—you used to be able to own Photoshop and Office, but now you have to rent them—and has spread to movies, music and other media. The perpetual annuity model, needless to say, is very popular with Wall Street. Available evidence suggests that the endgame here, too, is eventually to go over entirely to a books-for-rent model.

Here again, Bustillos expresses more attitude than cogent argument that has much, if anything, to do with CDL. It’s true that we now license, for instance, Microsoft Office month-to-month instead of purchasing the software, but price-wise, it’s about the same or less than it used to be, and overall convenience and security are generally better than in the days when we had to buy upgrades delivered in boxes full of disks.

More to the point, ebooks are not comparable to software vis-a-vis upgrades, etc., but that’s why I highlighted the paragraph—because Bustillos is making a loose comparison for emotional impact rather than presenting a serious case for her position on CDL. Moreover, she endorses, perhaps inadvertently, an enthusiasm for CDL which is not limited to the mechanisms in that proposal but is intertwined with a broader criticism of licensing regimes throughout the digital market.

Speaking of apples and oranges, Bustillos inscrutably contrasts Neil Gaiman’s 2011 observations that piracy led to discovery and sales of his books against comments by Degen and Stiles about CDL in 2022. She cites Gaiman to make the point that lending books, especially by libraries, should not be seen as lost sales. This is generally true but is also a misdirection away from the crux of the debate over the mechanisms proposed by CDL, to say nothing of the broader anti-copyright strategy of which CDL is one prong. Further, it shows poor taste to cherry-pick an unrelated comment made by a multimillionaire author (one who has greatly benefitted from the copyright system) in order to disparage authors of more modest income, who are intimately engaged with the copyright narrative nearly every day.

Perhaps Bustillos is unaware of the broader agenda being pushed by the scholars, ideologues, and lobbyists with whom she is breaking bread in her post. Even if CDL were a modest and simple proposal on its own, it almost doesn’t matter at this point because the library associations are engaged in a multi-level campaign against core principles of copyright law, a campaign that would affect more than ebooks.

As discussed recently, the library associations have lobbied for legislation in six states proposing compulsory licenses for ebooks in a manner that is so clearly preempted by federal law that New York’s governor already vetoed that state’s bill on that basis alone. So, why are these groups spending millions to pass legislation that is doomed to fail on constitutional grounds? Probably because failing in the states is a well-known path to lobbying Congress to change the federal law.

So, as long as we’re fighting over the moral high ground, let’s consider the cost to state taxpayers of passing and defending ill-fated legislation and then compare that to the cost of ebook licensing from which the taxpayer is allegedly being rescued. Quick math: 400 titles x $32 per title/year x 25 library systems = $320,000/year per state. [2] What will Maryland spend to lose the lawsuit it now faces with the publishers over enforcement of its ebook bill?
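If you want to check that back-of-the-envelope figure or swap in your own numbers, here is a minimal sketch. The function name `annual_licensing_cost` is made up for illustration, and the inputs are just the round numbers from the quick math above (plus the 23 systems noted for New York in footnote 2), not audited data.

```python
def annual_licensing_cost(titles: int, price_per_title_year: int, library_systems: int) -> int:
    """Rough statewide ebook licensing cost per year (illustrative helper only)."""
    return titles * price_per_title_year * library_systems


# The post's round-number estimate: 400 titles, $32 per title per year, 25 systems.
estimate = annual_licensing_cost(400, 32, 25)

# Same assumptions, but with the 23 library systems footnoted for New York.
nys_estimate = annual_licensing_cost(400, 32, 23)

print(f"Round-number estimate: ${estimate:,}/year")
print(f"With 23 systems:       ${nys_estimate:,}/year")
```

Either way, the figure lands in the low hundreds of thousands of dollars per year, which is the number to hold up against what a state spends passing and then litigating a preempted bill.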

I’m not saying I know exactly how the numbers shake out, but the library associations et al. don’t present their economic complaint in economic terms in the first place. Like Bustillos, they generally vilify publishers, ignore the complexity of a system that includes many kinds of authors, and pretty much make a hash of copyright law in the process. The one thing Bustillos said with which I do agree is that Twitter fights are generally useless, but then I don’t know why she said that as a prelude to writing a long Twitter rant expressing more dudgeon than knowledge regarding these issues.


[1] Read Section 108 of the Copyright Act sometime, and if you don’t fall asleep, you will notice the strict and narrow conditions under which libraries are allowed to make or distribute copies of certain types of works.

[2] For reference, NYS has 23 library systems.