Why Internet Archive is in Legal Trouble and Deserves to Be

My last post about the case Hachette et al., v. Internet Archive was angry. Moved by the compelling testimony author Sandra Cisneros wrote to the court, I was and remain pissed off at those who justify what amounts to enterprise-scale book piracy by dressing it up in the rhetoric of progressive lingo and academic theory. Many amicus briefs, authored by familiar names in anti-copyright academia, have been filed in support of Internet Archive.  I could pore over every one of those documents, but the only reason to do so is my admittedly morbid and nerdy fascination with the way each author will try to argue that what IA is doing is already exempted by the Copyright Act. But I ain’t got that kinda time. And it ain’t necessary. Because it ain’t so.

The reason I opined in my last post that this case should be short work for the court is that IA’s arguments boil down to two defenses, both of which should be overwhelmed by the facts and relevant case law. Defense Number One is that the IA lending model called “Controlled Digital Lending” (CDL), a model of its own invention, falls within the exceptions already carved out for libraries by statute. And Defense Number Two is, of course, that the CDL model is fair use, beginning with the claim that it is “transformative” under factor one of the analysis.

CDL: It Ain’t on the Page

Regarding the first defense, the CDL theory may sound reasonable on the surface. A library buys or legally obtains a print copy of a book and scans it to make a digital copy (its own ebook). Then, in principle, the library loans the digital copy to one patron at a time and does not loan more digital copies than it has physical copies in its collection. Additionally, the CDL model asserts that the library should not loan physical and digital copies at the same time.

But IA’s difficulty in defending CDL is twofold. First, there is nothing in the copyright statute that allows the practice; and second, it appears that IA does not even adhere to the boundaries of CDL, if it were allowed. Remember that what triggered this lawsuit was IA’s “National Emergency Library,” when it released over one million titles without restriction, using the pandemic as a rationale for doing so.

IA and its amici will take circuitous paths through Sections 108 and 109 and try to stitch together a rationale for a lending regime that was never anticipated by these sections in the law. In the first part (§108), carve-outs for libraries specifically exempt limited conduct like preservation, inter-library loan, certain research activities, etc. But nowhere do any of the exceptions even imply that a library may produce and distribute its own trade-pub ebooks for the sole purpose of bypassing the licensing models under which ebooks are currently loaned. Even more damning is the fact that Internet Archive is not a library under the terms of the statute, and the court may find that it does not even qualify for §108 exceptions, let alone that those exceptions encompass CDL.

As for Section 109, IA and its amici will try to argue that because the original, legal purchase of a physical copy extinguishes the rightsholder’s interest in that copy under the “first sale” doctrine, this somehow extinguishes the copyright rights prohibiting the reproduction and distribution of a digital book made from the same physical copy. This is fantasy. Pull a book off your shelf, scan it, and make it available to the public, and you will violate the reproduction right, the derivative works right, and the distribution right of the copyright owner. That IA is engaged in precisely this activity at scale is normally described as enterprise piracy, not library lending.

But even if CDL were permissible by law, the allegations in the publishers’ motion for summary judgment (MSJ) about IA’s operations suggest that the organization is not even complying with the “controlled” part of the regime. The publishers’ MSJ avers …

Ironically, the many thousands of hard copy books IA obtains from defunct colleges or libraries will likely end up in “archive facilities in Richmond, CA,” which consist of large shipping containers owned by IA. Once locked away, upon information and belief, IA will make no effort to make the print books available to be read, like books in actual library collections. Instead, the print copies primarily exist to rationalize, or provide the predicate for, IA’s argument that there is a one-to-one correlation between print copies legitimately owned and their illegitimate ebook scanned copies.

So, is there any “control” proving that containers filled with books serve as the foundation for IA’s proper accounting of its digital lending? Even more bizarre is that IA allegedly asserts that its “ownership” of print books includes books on the shelves of “partner libraries.” Again, from the MSJ…

With respect to the Website’s titles for which the corresponding print books are allegedly stored at partner libraries, it defies reason that the partner libraries will have the wherewithal to faithfully and consistently remove a book from circulation each time it is borrowed on the Website, and put it back on the shelf when the Website version is checked back in.

You see where that’s going, right? If IA “partners” with enough libraries, it would then justify mass distribution of the ebooks it makes and, apparently, without any control whatsoever. So, even if a court held that the CDL concept falls within the exceptions in the Copyright Act (and this is highly unlikely), these are just two of the facts presented in the publishers’ MSJ indicating that IA is not in compliance with its own theoretical practice. And, naturally, the implications do not stop with books. After Brewster Kahle’s “New Library of Alexandria” swallows the entire commercial market for ebooks, the markets for music, motion pictures, video games, etc. would quickly follow. So, assuming the argument that “CDL is legal” winds up skidding hard against the language and intent of the statutes, let’s talk fair use.

Stop Me if You’ve Heard This One

Internet Archive alleges that its conduct is fair use, which is amusing and should easily be denied based on more than one precedent in the same circuit. I say amusing because Internet Archive’s mission is a crusade predicated on the devout certainty that this part of its operation is permitted within the statutory exceptions for libraries. But just in case that argument fails, IA will plead fair use. I mean, yeah. It would be bad lawyering if they didn’t. But it’s still funny.

In the simplest analysis, IA’s conduct exceeds the boundaries established by the Second Circuit in Google Books and reaffirmed in ReDigi and TVEyes. The Google Books opinion begins with the statement, “This copyright dispute tests the boundaries of fair use.” And that was Judge Leval, who wrote the paper introducing the doctrine of “transformativeness” to the fair use analysis. There, the Google Books search engine, which necessitated digitizing millions of books, was held to be “transformative” under factor one of the analysis because it “added new utility” (i.e., a research tool the world did not have). But essential to that holding was the fact that Google Books does not make whole, in-copyright books available to the public. As the opinion states …

With respect to the first factor test, it favors a finding of fair use (unless the value of its transformative purpose is overcome by its providing text in a manner that offers a competing substitute for Plaintiffs’ books …

That “competing substitute” language is fatal to IA’s argument that it is “transformative” under the same “new utility” doctrine. In fact, because IA clearly provides a substitute for licensed ebooks, it cannot reasonably argue that it provides a new utility at all. It simply provides unlicensed digital books in lieu of licensed digital books. Indeed, enterprises more innovative than IA (e.g., TVEyes) have tried to argue “transformativeness” under the Google Books utility doctrine, and they have failed by the light of the same market substitute boundary.

When your only innovation is giving away for free that which the copyright owner intends to sell, there is nothing fair use offers as a defense. In the fair use analysis, there is always a strong interplay between factor one (purpose of the use) and factor four (potential market harm), but here the questions are almost identical because IA’s purpose is nothing more than market substitution.

So, Internet Archive will continue to make noise on Twitter and elsewhere. It will continue to portray itself as the underdog, standing in the shoes of all librarians against the juggernaut of the publishing industry. And it will continue to elide or distort the authors’ interest in the narrative. But as a legal matter, for the reasons stated, I think IA should lose, and lose big. And that will be just fine for real libraries because real libraries do not engage in the conduct alleged in this case. And that will be the subject of a future post.


Photo source by: Janpietruszka

What Problem Do Those eBook Bills Address Anyway?

In late December, New York Governor Kathy Hochul vetoed the state’s library ebook bill, acknowledging that the law would be preempted by the Copyright Act. In mid-February, a district court in the State of Maryland, responding to a lawsuit filed by the Association of American Publishers (AAP), ordered a preliminary injunction suspending that state’s ebook law, also on preemption grounds. Recognizing which way the wind was blowing, Kyle Courtney of Library Futures Foundation drafted a letter on February 1 to the House Committee on Corporations of the Rhode Island State legislature proposing amendment to that state’s bill, writing:

…we are advising, based on the current landscape involving litigation and vetoes of similar eBooks laws in other states, that you consider friendly amendments below that will effectuate enough changes in H7113 to help avoid running afoul of the challenges documented below with respect to activities in other states.

What follows is a recommendation that Rhode Island remove one paragraph demanding that publishers license to libraries et al., which the footnote describes as the language in direct conflict with federal law. However, the remaining provisions of the bill still invite a preemption challenge because they presume to dictate terms and pricing models to publishers in conflict with the principle that copyright protects the author/owner’s right to decide the manner in which a work is made available. Hence, the provisions that would remain in the RI bill, as well as nearly identical bills in five other states, may still be construed as unconstitutional state compulsory licenses.

As Courtney’s letter emphasizes, the strategic approach taken by the various lobbying organizations pushing for these bills is to present the subject in the context of state contracts while seeking to remedy a consumer protection problem—namely, the alleged “unconscionability in licensing” practices by the publishers. But so far, the organizations lobbying for these bills have yet to support the accusation that current ebook licensing regimes are extortionate and/or that they are causing a disruption in a library system’s ordinary capacity to serve its community. And that’s to say nothing of presenting a compelling case in every state in which these bills have been introduced.

It is no surprise the American Library Association (ALA) et al. have not presented a thorough argument, because it would be a hell of a lot of work. To assess whether a given market is underserved (in any context) requires a considerable amount of research and evidence, including counterfactuals, polling, budget analysis, etc. In this instance, it would be a rather large data-science project to manage and model all the relevant inputs, like overall reading trends, library-use trends, preferences for digital vs. physical materials, and cultural and economic data, to determine whether, and where, the ebook borrowing market is underserved and conclude that the licensing models are the cause.

Instead of doing any of that homework, associations like LFF and the ALA have compared the consumer price of an ebook purchase (e.g., $18) to a library price of an ebook license (e.g., $55 for 2 years), then cried foul and drafted legislation to resolve this apparent injustice. But if state lawmakers are going to accuse the publishers of unfair practices to justify a law that flies in the face of the Copyright Act, they should demand more evidence than these two numbers alone. Or if state lawmakers are going to elide all complexity in favor of blunt metrics, then why not simply recognize that three times the price to make an ebook available to fifty times the readers hardly sounds like extortion by any reasonable definition?

The Mid-Hudson Library System

Although I certainly do not have the resources or data-science chops to do the kind of research mentioned above, I did a little digging into the Mid-Hudson Library System (MHLS), which serves my home region, just to see what I could learn.

One of 23 systems in New York State, MHLS comprises 76 small-town and public-school libraries in five counties with a total population of more than 686,000 (~ 258,000 households) earning a median income of about $76,000/year. The 2021 budget for the library system was just under $4 million, a little more than half of which comes from statewide and local taxpayers. In 2021, MHLS spent about $90,000 (2.25% of its budget) on digital lending materials, through a few different marketplaces, and presumably using more than one licensing model.

For example, OverDrive, one of the major marketplace platforms where librarians license digital materials, makes ebooks available under three different licensing models. Through Simultaneous Access, certain publishers offer package deals for multiple titles up to a certain number of loans. In the One Customer One Use model, presumably for back catalog or less popular books, the licenses never expire. And the model most often used by the major publishers for the most popular books is Metered Lending, which offers one or two-year licenses and/or limits the number of loans per license.

In 2021, MHLS ebook circulation was ~ 314,000, and the first three months of 2022 are tracking toward a similar total. Even at the unrealistic frequency of one book per unique patron, that would be less than half of the total population in the system, which likely says more about demand than it does about supply. In fact, at the national level, although ebooks and audiobooks continue to occupy a greater percentage of a library’s collection, print book borrowing is still 518.92% higher than ebook borrowing as of 2019.

Looking at the catalog, it appears that MHLS offers about 10,000 ebooks (70% fiction/30% nonfiction), presumably under more than one licensing model. But even if all 10,000 were licensed under Metered Lending at a rate of $55 for two years, this amounts to a cost of about $1.07/year per household in the system. Alternatively, we can estimate that a two-year license of $55, at a maximum rate of one loan every two weeks ($55 / 52 readers), is a Cost Per Loan (CPL) of about $1.06.
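For anyone who wants to check that back-of-the-envelope arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above. The variable and function names are my own, and the premise that every title sits on a $55 two-year metered license with one loan every two weeks is the same simplifying assumption made in the paragraph above, not a description of how MHLS actually licenses its catalog.

```python
# Figures quoted in the post; all names here are illustrative.
TITLES = 10_000          # approximate ebooks in the MHLS catalog
LICENSE_PRICE = 55.00    # dollars per title (assumed metered license)
LICENSE_YEARS = 2        # license term in years
HOUSEHOLDS = 258_000     # approximate households in the system
LOAN_PERIOD_WEEKS = 2    # assumed turnover: one loan every two weeks


def annual_cost_per_household(titles, price, years, households):
    """Yearly licensing cost spread across every household in the system."""
    return titles * price / years / households


def cost_per_loan(price, years, loan_period_weeks):
    """License price divided by the most loans one license could support."""
    max_loans = years * 52 // loan_period_weeks
    return price / max_loans


per_household = annual_cost_per_household(
    TITLES, LICENSE_PRICE, LICENSE_YEARS, HOUSEHOLDS
)
per_loan = cost_per_loan(LICENSE_PRICE, LICENSE_YEARS, LOAN_PERIOD_WEEKS)

print(f"Cost per household per year: ${per_household:.2f}")
print(f"Cost per loan at full turnover: ${per_loan:.2f}")
```

Changing the constants at the top shows how sensitive the estimate is to the assumptions. A cheaper average license or a smaller catalog pushes both numbers lower still.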

So, the numbers available do not seem to justify even a hypothesis that ebook licensing is unduly burdensome or is resulting in underserving the MHLS community. And the overall demand nationwide for borrowed ebooks hardly justifies the rhetoric of the lobbyists, who would have us believe that a literature-starved public is suffering on the libraries’ virtual steps at the mercy of the big publishers. When an expenditure is just over two percent of the operating budget, one must step back and look more holistically at the question presented.

Collections Are a Fraction of a Library’s Expense

The data collected in the Institute of Museum and Library Services (IMLS) Public Library Survey reveals that libraries’ costs are increasing for personnel and general operating expenses while costs are trending downward for collection materials—especially the cost of ebooks and audiobooks. Noting that most libraries spend an average 10% of their annual budgets on their collections overall, an article in Wordsrated summarizing the IMLS Survey states, “The drop in price per item is due to library collections becoming increasingly digital. This is because the price per digital item has declined significantly. All while the average cost per book increased 10% since 2003.”

The statistical trends in the IMLS Survey suggest that libraries are going through a lot of transition these days—as collections become more digital, as physical spaces are adapted to provide more programs and services, and as overall reading and borrowing habits continue to shift in the market. Change in any system presents both opportunities and challenges, and it is a safe bet that not every local library will, or can, adapt in the same way. But if the data show that ebooks are, as of 2019, “the cheapest material in a library’s collection,” then why on Earth is this the moment to lobby for these ebook bills in the states?

The answer to that cannot be, “Well, if the prices were even lower, we could do more.” Yeah. That’s how everything in life works. But for one thing, as much as publishers and authors care quite a bit about library patrons, it is not incumbent upon them to outright subsidize the libraries as they navigate the changing landscape—let alone by mandating that the publishers remain bound by old models so that libraries can adapt to new ones. That’s not a symbiotic relationship.

Looking forward, neither the libraries nor the publishers can say what the trends will be in five or ten years, but the libraries should be cautious about putting too many eggs in the ebooks basket. What happens to the relevance of the seventy or so local libraries in MHLS if the system plays an outsized role as a conduit for ebook lending? Don’t at least some taxpayers or prospective donors in each town begin to wonder why they need to keep paying the librarians and maintaining the buildings? Perhaps the local librarians should look at the data and ask whether ALA, LFF et al are doing them any favors.

Of course, knowing the track records of the people behind these ebook bills, it is fair to doubt that they are trying to solve a problem at all but are instead pursuing a broad, anti-copyright agenda. The tone of Courtney’s letter, for instance, makes clear that he (and his colleagues) object to the legal doctrines on which the NY and MD bills were opposed and that his recommendations to RI are a begrudging pivot in strategy to achieve the same ends by a slightly amended rationale.

But to oblige any copyright owner to make a work available under terms mandated by state law invites substantial conflict with federal law and the authority of Congress alone to amend that law. Consequently, no state legislature should embark on such an adventure without a compelling and thorough analysis of the problem allegedly being solved. And so far, the lobbyists for these ebook bills have presented little more than a melodrama barely worth reading at any price.

No, the Maus Ban is Not an Excuse to Weaken Copyright

Naturally, I join the outrage directed at any school board that would presume to ban a book—let alone because they don’t want students to confront the traumas of history—but I am almost as offended by the self-proclaimed defenders of culture in the anti-copyright crowd. How dare the McMinn County Board of Education ban Maus? But at the same time, how dare anyone write this?

Really? The survival of culture depends on libraries owning ebooks?

Yes, the tweet was posted by the same Maria Bustillos who inspired my last post about the library associations’ anti-copyright agenda, and I certainly do not mean to pick on her alone. On the contrary, I wouldn’t bother with that tweet if its fallacies were not endemic among organizations with the power to lobby legislatures. It is a sentiment within a hubristic narrative which asserts that, if not for copyright getting in the way, digital repositories like libraries would keep culture burning like a flame amid the forces of darkness. More specifically, Bustillos et al ask us to believe that libraries owning, rather than licensing, ebooks would serve as a hedge against censorship. But how?

If the Tennessee school board, and other officials who behave likewise, are indeed riding a wave of illiteracy toward authoritarianism, it is certain that those forces will not leave the libraries intact either. Moreover, if that is where we are headed as a nation (and these days we all wonder), forget the ebooks and prepare for civil war. But if that dire outcome is not what we are talking about, and we are instead witnessing just another sad example in a long history of bumbling, mouth-breathing attempts to ban books, then we can temper the “protect culture” language because it looks like the “evil” commercial market has got this one.

I admit it has been satisfying to watch the sentiment in that tweet wither in the sunlight of Maus topping best-seller lists in response to the Tennessee school board ban. Whether this groundswell is born of curiosity to read a banned book or a desire to raise a middle finger at the censors (doubtless it is both), the entire narrative is an endorsement, not an indictment, of Art Spiegelman’s copyright rights. After the assault on the Capitol, I wrote a post reaffirming a claim I had made in 2013 that “A great bulwark against tyranny would be a class of unusually wealthy poets.” In principle, the consumer response to the Maus ban is exactly what I had in mind.

Libraries are wonderful institutions, but enemies of culture have a habit of burning them down. Or in the case of America’s public libraries, they can simply defund them as easily as they remove books from school curricula. Ebook collections in libraries are not a bulwark against that kind of wanton destruction, but empowering authors and artists with certain property rights in their work and, yes, money remains a powerful mechanism for keeping the philistines at bay.