Really, DON’T Believe Anything You See on the Internet

When that cliché first entered our consciousness, it wasn’t really fair. The internet between the mid-90s and the mid-aughts wasn’t what it is today. It actually was just a dumb pipe through which content could be delivered from creator to consumer in a new way. It was silly to imply that one should not believe a news story published by the Washington Post just because it was on a screen instead of paper — and that principle still holds true for most professional journalism.

But now, every legitimate news source swims in the same stream with all the garbage—from raw clickbait to lazy aggregators to hackers purposely trying to exploit underlying divisions in democracies—and the tools of manipulation are so sophisticated that many of the manipulators themselves don’t have to be. With a little practice using software that anybody can steal, a kid can create a video that makes it look like Hillary Clinton said that “all veterans are pussies,” and…well, here we are.

“One of the things I did not understand was that these systems can be used to manipulate public opinion in ways that are quite inconsistent with what we think of as democracy.”

That’s what Alphabet (Google’s parent company) Executive Chairman Eric Schmidt said, recently quoted in an article in Fast Company. And in keeping with the theme of this post, I don’t know what to believe. Were Schmidt and the rest of the leadership at Google honestly so drunk on their own utopian rhetoric about how wonderful their systems are that they failed to imagine—to say nothing of observe—how their products could be toxic for democracy? Or did they recognize it and not care until they were forced to care amid the fallout from the investigations into Russian meddling?

Facebook’s founding president Sean Parker—he was also the co-founder of Napster—told Mike Allen of Axios in a recent interview that Facebook was designed to “exploit a vulnerability in human psychology” in order to keep people on the site as much as possible. Parker told Allen that the creators of Facebook understood what they were doing and did it anyway, though they perhaps did not quite imagine what the results would be when a billion people voluntarily spend hours in Zuckerberg’s ant farm. “…it literally changes your relationship with society, with each other … It probably interferes with productivity in weird ways. God only knows what it’s doing to our children’s brains.”

How much has changed in such a very short time. It seems like only yesterday the cheerleaders of Silicon Valley, with all the confidence of Camel-smoking doctors, kept telling us just how good their products were for democracy and for society overall. All this goodness was packaged into a single generic word: innovation, and anything that stood in the way of innovation—like maybe the rule of law—was bad. Now, all of a sudden, we hear a lot of “Wow, we had no idea our systems could be used to totally fuck up the world! We’ll get some people on that right away!”

Of course, these companies either will not or cannot fully address the underlying reasons why their systems can be so toxic; and Parker put his finger on it when he admitted that Facebook was designed to take advantage of human folly. Facebook may clean up its act in certain regards—I actually believe Zuckerberg wants to—and Congress may enforce some effective regulations upon these platforms; but none of this will address the flaw in human nature that makes us more susceptible to emotional triggers than we are to reason and information. That’s why the underlying promise of the information age—that information can only have a moderating effect on discourse and interaction—is proving to be untrue.

There’s something fundamentally harmful about taking complex topics and issues and transforming it all into advertising, but that’s essentially what a platform like Facebook or Twitter does. “The sad truth is that Facebook and Alphabet have behaved irresponsibly in the pursuit of massive profits,” writes Roger McNamee for The Guardian. “They have consciously combined persuasive techniques developed by propagandists and the gambling industry with technology in ways that threaten public health and democracy. The issue, however, is not social networking or search. It is advertising business models.”

McNamee, who is identified as an early investor in Google and Facebook, describes how the advertising revenue models of these platforms drive Facebook, for instance, to deliver content based on user preferences, creating feedback loops called “filter bubbles.” People have been writing about the filter-bubble problem for several years now, but I suspect the problem is far too subtle to expect that the platforms themselves, with or without legislative mandates, will solve it.
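To see that feedback loop in miniature, here is a toy sketch, purely illustrative and emphatically not Facebook’s (or any platform’s) actual ranking code: a feed weights topics by past engagement, so every click makes similar content more likely to be shown, which invites more clicks on the same topic.

```python
import random
from collections import Counter

# Toy model of a "filter bubble" feedback loop (hypothetical; not any
# platform's real algorithm): topics are ranked by how often the user has
# engaged with them, so each click raises the odds of seeing more of the same.
random.seed(42)

topics = ["politics-left", "politics-right", "sports", "science", "celebrity"]
engagement = Counter({t: 1 for t in topics})  # start with no strong preference

def serve_feed(k=3):
    """Pick k items, weighted by past engagement (more clicks -> more exposure)."""
    weights = [engagement[t] for t in topics]
    return random.choices(topics, weights=weights, k=k)

# Simulate a user who only ever clicks on one kind of story
for _ in range(200):
    for item in serve_feed():
        if item == "politics-left":
            engagement[item] += 1  # the click feeds back into future ranking

share = engagement["politics-left"] / sum(engagement.values())
print(f"Share of the engagement signal now held by a single topic: {share:.0%}")
```

Run long enough, nearly all of the weighting belongs to one topic, which is the bubble expressed as a number.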

Amid the recent flurry of allegations of sexual assault, satirical posts have appeared on Facebook with photos of Tom Hanks and leads saying, “Dozens of women come forward to…” And then, you click on the story, and it completes, “…say that Tom Hanks is a really nice guy.” Variations on this gag appear all the time, like the reports that Keith Richards is still alive. But you can bet the beer money that any number of people just scrolling through a feed on their phone, perhaps waiting in the supermarket line right next to the old-school tabloids, will come away with the impression that indeed Tom Hanks was implicated in some sexual abuse claim. Then, the rumor gets repeated to a friend, and that’s more or less the state of “information” in the digital age. It’s the National Enquirer at “Google scale.”

According to David Roberts, writing for Vox, America is in the middle of an epistemic crisis: many citizens are beyond the problem of separating fact from fiction and are instead living in a world in which facts simply don’t matter. It is a mindset he calls “tribal epistemology—the systemic conflation of what is true with what is good for the tribe.”

For the time being, analysis of the online media universe reveals this problem is more prevalent on the political right (see the continued support for Roy Moore even if he did assault a teenager), but the political left is hardly immune to this kind of tribalism. In fact, this blog was inspired five years ago when I witnessed this exact behavior among left-leaning friends, who were willing to share false information because it supported the outcome they believed to be right. So, although it is somewhat encouraging that this year marks the turning point when internet platforms will no longer be given a free pass — either by lawmakers or the public — to simply do what they want “for the greater good,” that hardly addresses how we individually and collectively will learn to cope with “God knows what’s happening to our brains,” as Parker puts it.

Does the Internet Archive Need the Copyright Rhetoric to Be Useful?

Photo by fotoduki

Recently, a tweet caught my eye on the #copyright thread—something about the late Congressman Sonny Bono and a new collection at the Internet Archive, which is the vast digital library founded by technologist and entrepreneur Brewster Kahle.  The tweet linked to a blog post by Kahle announcing that a collection of copyrighted works published between 1923 and 1941 had been “liberated” and is now available on what the Archive has named The Sonny Bono Memorial Collection.

The eponym is not an honor, of course. It’s a posthumous snipe at Bono, who is credited (or blamed, depending on one’s point of view) for the Copyright Term Extension Act (CTEA) of 1998, which added 20 years to copyright protection, resulting in the current term of life of the author plus 70 years. Kahle’s post is bulked out with a lot of standard rhetoric condemning the duration of copyright, repeating the misleading narrative that Mickey Mouse was a major reason for the CTEA, and implying rather obtusely that the original 14-year, single-renewal term of 1790 ought to still be the law of the land (because of course the world functions much as it did in the late 18th century). All that noise aside, however, the new collection with the sarcastic name is made possible by what Kahle calls, “a little known, and perhaps never used, provision of US copyright law, Section 108h, which allows libraries to scan and make available materials published 1923 to 1941 if they are not being actively sold.”

Let me interject to say that the Internet Archive is impressively handy. I’ve already found a number of intriguing sources for a research project I’m just beginning; so, what follows is not an indictment of this or any other library, whether physical or digital. I love libraries. But the reason for highlighting Kahle’s derisive tone is that a great deal of unnecessary conflict is being sown between contemporary librarians and copyright law. And the crux of Kahle’s own announcement about this new archive underscores just how unnecessary the conflict is. To begin, Section 108(h) of the Copyright Act is not quite so arcane as he implies.

The statute was created as a specific carve-out for libraries, reflecting a compromise to attain passage of the term extension in 1998. The exception allows a library to copy and make a work available during the last 20 years of its term of copyright protection, if copies of the work are not commercially available at a reasonable price. There are more conditions to the statute, but the underlying rationale is commonsensical enough. If these “Last 20” works are no longer available in the market—and high-priced, rare copies don’t count—then libraries are allowed to fulfill their mission by making the works available for the purposes of research and scholarship. Meanwhile, the existence of the 108(h) exception, including proposals to amend it, actually belies the attitude that Kahle and others sometimes adopt which ultimately pits authors against libraries.
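For illustration only, here is a minimal sketch of that eligibility checklist in code. The flat 95-year term, the year arithmetic, and the yes/no availability flags are simplifying assumptions made for the sketch; they are not legal advice and not how any library, or the statute itself, actually performs the analysis.

```python
from dataclasses import dataclass

CURRENT_YEAR = 2017   # year of evaluation (the blog's timeframe)
TERM_YEARS = 95       # simplified: flat 95-year term measured from publication

@dataclass
class Work:
    title: str
    year_published: int
    subject_to_commercial_exploitation: bool  # found via "reasonable investigation"
    copy_available_at_reasonable_price: bool
    owner_filed_notice: bool  # rights holder has given notice the work is in commerce

def eligible_for_108h(work: Work, current_year: int = CURRENT_YEAR) -> bool:
    """Rough checklist for the 'Last 20' exception described above (simplified)."""
    copyright_expires = work.year_published + TERM_YEARS
    in_last_20_years = (copyright_expires - 20) <= current_year < copyright_expires
    return (
        in_last_20_years
        and not work.subject_to_commercial_exploitation
        and not work.copy_available_at_reasonable_price
        and not work.owner_filed_notice
    )

# Example: a 1930 title with no in-print edition and no reasonably priced copies
print(eligible_for_108h(Work("Example Novel", 1930, False, False, False)))  # True
```

Under this simplified term math, a work published in 1930 would expire in 2025, putting it inside the last 20 years of protection in 2017; the remaining flags stand in for the “not being actively sold” conditions Kahle and the statute describe.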

New Scholarship on Section 108(h)

The challenges for a library wishing to apply 108(h) include research capabilities to learn the copyright status of works, and vagueness in the statute that can make proper analysis difficult. Enter Professor Elizabeth Townsend Gard, a copyright and history scholar at Tulane University. In collaboration with colleagues and students, she produced a 103-page paper, released this month, entitled Creating a Last Twenty (L20) Collection:  Implementing Section 108(h) in Libraries, Archives and Museums.

Gard’s paper offers two major contributions:  1) a comprehensive methodology for qualifying organizations to make effective use of the 108 exception in order to build what she calls “Last 20 Collections”; and 2) suggestions for possible revision of the statute in order to address what she sees as unnecessary gaps that leave organizations in limbo with regard to qualification and implementation. The heart of the paper is dedicated to methodology, in which she describes a taxonomic approach to identifying works, combining standard library cataloging systems with copyright data to yield the information required to know if a work is eligible for the 108(h) exception.  Given the amount of complexity involved, and the fact that I am neither a librarian nor an attorney, I cannot fairly comment on the system.

With regard to the statute, the US Copyright Office published its Model Statutory Language for revision of Section 108 in September of this year. Gard commends some of the proposed changes and critiques others, making several recommendations that sound reasonable. For instance, she advocates better clarification of the extent to which the “used” market honestly represents “availability” of a particular work in fulfilling the purpose of 108(h). Some of her proposals may find critics at the USCO or among various stakeholders; but suffice it to say, her work reads like a sensible foundation for compromise, which can be a rare find in contemporary discussions about copyright.

Making Section 108(h) Work Is Not an Anti-Copyright Statement

Gard’s work represents a counterpoint, in my view, to many positions adopted by the ALA and related organizations, which have spent considerable energy aligning their interests with for-profit technology companies in the hope of expanding—through litigation—the fair use exception and/or the first sale principle. This approach seems both ideologically and pragmatically flawed, especially where the for-profit ventures clearly try to strain the underlying legal principles involved.

Libraries, archives, and museums, which exist for the primary purpose of advancing scholarship, deserve special consideration that is not, and should not be, accorded to for-profit ventures—or necessarily to all non-profit ventures. While the internet does create unprecedented opportunities for providing access to works that can lead to new areas of scholarship and new forms of creative expression, it also creates unprecedented incentive (i.e. crazy-big money) for various parties to try to blur the line between public-serving and private-interest ventures. Legitimate institutions of scholarship that ally themselves with this kind of vagueness are, in my view, working at cross-purposes with efforts like those of Professor Gard, whose proposals seek clearer guidelines for the types of institutions that deserve exceptions like Section 108(h).

To put this in context, there is nothing that necessarily bars a public-serving and privately-held platform like the Internet Archive (or Wikipedia) from becoming a monetized business venture, either independently or by selling all or some portion of its enterprise to a larger entity like Google. And if the Internet Archive were to earn revenue by selling its user data, that would arguably run afoul of Section 108’s prohibition against using a “Last 20” collection to attain “indirect commercial advantage” for the archivist. I’m not saying the Internet Archive will do this, but if we keep in mind that indirect commercial advantage is the mechanism by which giant internet businesses make content “freely” available to the public, this awareness should inform any new statutory contours for an exception like 108(h).

Referring back to Brewster Kahle’s post, he quotes Carrie Russell, Director of ALA’s Program of Public Access to Information, thus: “I’ve always said that the silver lining of the unfortunate Eldred v. Ashcroft decision was the response from people to do something, to actively begin to limit the power of the copyright monopoly through action that promoted open access and CC licensing.” Eldred is the Supreme Court decision upholding the constitutionality of the CTEA, and Russell’s statement here is frankly incomprehensible in a blog alluding to Gard’s efforts to make an existing, statutory limit on copyright work better.

Kahle himself seems unclear about the difference between the nuance in Gard’s work and his own desire to evangelize the bad-manners approach to copyright typically employed by Silicon Valley corporations. He writes, “Now it is the chance for libraries and citizens who have been reticent to scan works beyond 1923, to push forward to 1941, and the Internet Archive will host them.” That makes it sound as though Gard’s work just opened the floodgates and that anyone should feel free to upload anything to the Internet Archive as if it were YouTube. Does this mean the Internet Archive will then do the 108 analysis before hosting, or that they’ll just duck behind the safe harbor of the DMCA? Either you’re an entity that responsibly qualifies for the 108 exception, or you’re an ideologue eager to stick it to rights holders. You can’t be both.

Professor Gard’s work strongly highlights the fact that carve-outs for libraries already exist in the copyright law; and where these statutes may not function as intended, they can be amended through good-faith collaboration with the USCO, stakeholders, and Congress.  To achieve this collaboration, however, the librarians and archivists would do well to tone down some of the rhetoric implying that the interests of preservation and research are incompatible with the interests of authors. It is plainly absurd for librarians and authors to be at odds, even in the digital age.

Fake News Tops Results After Las Vegas Shooting

On Monday, I was up early and first heard about the Las Vegas shooting on the radio in the car. It was still dark, and the winding road thick with fog, lending an eerie mood to the sound of Scott Simon’s voice on NPR reporting what little was known about this latest incident in what is now an epidemic of mass-killings. I had yet to look at any social media, to read anyone else’s opinion or to have the raw facts of the tragedy synthesized through the narrative of gun control, mental illness, terrorism, or any other matter of public policy. There was just the horrible truth of what had happened without theory or explanation. This is how we used to digest the news: Here’s what we know so far. Stay tuned.

Social media abhors a vacuum. And in the hazy interval between breaking reports of an event like the Las Vegas spree-shooting and the revelation of salient, credible details, the pranksters, trolls, and professional liars come out to play. Brianna Provenzano, writing for Mic.com, states that for several hours, “Facebook and Google’s algorithms prioritized fake news” about the Las Vegas shooting. As she puts it, “conservative conspiracy sites like the Gateway Pundit lit up with misinformation about the shooter’s identity.” Her article shows one example of a headline naming some poor guy who had nothing to do with the shooting, calling him a “Democrat Who Likes Rachel Maddow, MoveOn.org, and Associated with Anti-Trump Army.”

According to Provenzano, the Gateway Pundit story was among the top results on Facebook before it was removed, and once the innocent man’s name was out there, Google searches for it led readers to a 4Chan thread “labeling him a dangerous leftist.” She also reports that Google eventually made algorithmic adjustments to replace the 4Chan story with relevant results and stated that it will continue to be vigilant in this regard.

It’s right that Google and Facebook took action to quash, or at least mitigate, misleading “news” about such a gravely serious incident, especially bogus reports naming an innocent man as the perpetrator. But for those of us regularly following the policy positions of the internet industry, the hypocrisy here is not missed. For instance, Google can clearly take remediating steps where not to do so would look bad for them; but in other contexts in which search results may facilitate harm, they will expound ad nauseam upon the sanctity of free speech as a universal rationale to leave all data exactly where it is.

For instance, regarding the Equustek case and the Canadian court order to remove links, I fail to see a substantive distinction, in a speech context, between a counterfeiter using search to hijack customers from a legitimate product-maker and a counterfeit news-maker using search to hijack readers from legitimate reporting. In fact, ironically enough, a bogus news story, harmful and revolting as it may be in the wake of a tragedy like Las Vegas, has a better claim to speech rights than a hyperlink which leads consumers to a product or service that is breaking the law.

So, it’s not that I think Google et al shouldn’t make decisions to remove or demote “news” emanating from the adolescent babooneries of places like 4Chan. They absolutely should. Fake news is toxic, and we have enough problems with grim reality without people inventing and believing bogus narratives. But as I’ve argued more times than I can count, speech cannot be the default rationale for a universal laissez-faire policy in cyberspace. And as this story demonstrates, it’s a lie anyway. The major web platforms can and will manipulate, delete, or demote content, or links to content, when they are motivated to do so. Whether these internal decisions are driven by revenue, public relations, or even altruism, speech-maximalism does not seem to factor into their thinking, so there’s no reason why it should necessarily factor into external mandates like a court order.

Meanwhile, we can’t expect Google and Facebook to stop people from being idiots. Readers may remember that after the Boston Marathon bombing in 2013, netizens took it upon themselves to play law enforcement. Not only did they vilify an innocent man whose whereabouts were unknown, but the cyber-mob soon harassed the man’s family, who would then discover that the young man was missing because he had committed suicide.

In the early days of Web 1.0, I rejected the old cliché Don’t believe anything you read on the internet because, of course, the internet really was just a conduit, and a credible source is a credible source. But now there is such a bounty of absolute garbage, which can either be designed to look legit or be algorithmically elevated to undeserved prominence, that I think skepticism should be the default approach to nearly every headline. So far, the “information revolution” is at least half oxymoronic. And part of the problem is that it can be very hard to know which half.