NYT tech editor Jeong sticking copyright criticism where it doesn’t belong.

Holy whiplash segues, Batman.  There I was reading a perfectly interesting article by Sarah Jeong on the potential hazards of selling one’s personal data, when she took an incomprehensible—if mercifully brief—detour into the realm of copyright law.  She presents a reasonable enough case that the companies now offering to help us “broker” our private data (e.g. health information) may be counting on the fact that, “There’s no legal property right to personal data.  Once personal data is gathered, it’s out there for anyone to buy and sell. At the moment, there are no legal grounds to demand compensation for use,” Jeong writes.

Fair enough.  It is certainly true that the whole prospect of selling private data, even if it were a good idea, does implicate a relatively novel legal framework.  And while I am personally inclined to agree with Jeong that the whole notion is fraught with hazards, I am at a loss to understand where she is going with this interjection …

“In any case, we already know what happens when property rights get slapped on information, because we’ve already done it, to some degree, in copyright law. 

Giving people ownership of their creative expressions means they can buy and sell them on the open market. The risk is that an artist will wind up, like Taylor Swift, alienated from her own work because she no longer possesses the masters of some of her earlier recordings.”

Swift in late June stated publicly that she was very disappointed to learn that mega-star manager Scooter Braun will be acquiring Big Machine Label Group, which still owns her master recordings dating back to the start of her career.  Swift calls the prospect of being under contract to Braun her “worst nightmare,” and for the sake of this post, we will take her word that he is an “incessant manipulative bully” because digging into that backstory could not matter less to Jeong’s ham-fisted allusion to the supposed problem with copyright.  

Even more bizarrely, Jeong happened to pick an artist who has adamantly defended both her own rights and those of much smaller artists, and who told Rolling Stone in 2014, “Important, rare things are valuable. Valuable things should be paid for. It’s my opinion that music should not be free, and my prediction is that individual artists and their labels will someday decide what an album’s price point is. I hope they don’t underestimate themselves or undervalue their art.”  So, I’m just spitballing here, but maybe Swift did not recently do an about-face on the purpose of copyright, or even abandon all prospect of working with labels, so much as she was just saying she really does not like Scooter Braun.  

Turning to Jeong’s implications about the nature of copyright, it is clear that she should refrain from the topic altogether. For one thing, copyright does not “slap property rights onto information.” Quite the contrary. There is in fact a long history of statutory development and caselaw that makes it very clear that information is not the subject of copyright. Expression is the subject of copyright, but the way Jeong slaps these two sentences together makes it seem as though information and expression are the same thing, especially in the context of an editorial that is all about data, which has no resemblance to expression.

At that point, I guess what Jeong is trying to say is that if we can own and sell our data, then, like Taylor Swift and her masters, we could wind up very unhappy about the party that buys the data. I think that disappointment is almost a guarantee and that we should be shoring up statutes against privacy-invasion rather than looking for ways to market our DNA profiles and whatnot. But, that said, what in blazes does the unprecedented challenge of mass data collection and its privacy implications have to do with about three centuries (though I would argue more) spent constructing a legal framework for authorial rights? Not a damn thing.

Interestingly enough, the paper written by Samuel Warren and Louis Brandeis in 1890, which is widely considered the seminal American work articulating a right of privacy, actually turns to copyright law as a starting point. Because there is no constitutional declaration of a right to privacy, Warren and Brandeis begin with the already long pedigree of copyright in unpublished works when they write, “From corporeal property arose the incorporeal rights issuing out of it; and then there opened the wide realm of intangible property, in the products and processes of the mind.”

Not only do most people, and certainly most creators, still feel that the products of the mind are a form of personal property, but this was the exact point of reference chosen by a pair of legal lions to make the case that a right of privacy actually exists. Consequently, Jeong might want to consider the possibility that copyright law provides guidance for the protection of our personal data rather than a warning of what can happen if we become the “owners” of that data. Or, if we’re looking for warning signs in historic property rights regimes, my friend Neil Turkewitz observes:

“If property rights are the model, then Silicon Valley’s dismal track record on intellectual property rights is a giant red flag that simply vesting property rights is of little consequence to the extent that such property rights are essentially unenforceable — particularly for individuals. Since the dawn of the internet, notwithstanding their legal rights, creators and innovators have had to endure an avalanche of illegally available copies of their works online.”

So, maybe, as Warren and Brandeis noted, copyright does have something to teach us about privacy that is quite different from Jeong’s misguided assumptions. But what do I know? I’m just spitballing.

Public Knowledge wants to solve the misinformation problem? That’s adorable.

On Tuesday, Meredith Filak Rose of Public Knowledge posted a blog suggesting that a solution to rampant misinformation is to “bring libraries online.” Not surprisingly, she identifies copyright law as the barrier currently preventing access to quality information that could otherwise help solve the problem …

“High-quality, vetted, peer-reviewed secondary sources are, unfortunately, increasingly hard to come by, online or off. Scientific and medical research is frequently locked behind paywalls and in expensive journals; legal documents are stuck in the pay-per-page hell that is the PACER filing system; and digital-only information can be erased, placing it out of public reach for good (absent some industrious archivists).”

Really?  We’re just a few peer-reviewed papers away from addressing the social cancer of misinformation?  Back to that in a minute.  Because first, there’s a spit-take that needs cleaning up after reading that Public Knowledge??? Is weighing in on misinformation???  This is an organization that has recklessly spread nonsense of Augean proportions about copyright law.  See posts here, here, here, here, here, here, and here; or just read my last post citing PK’s Shiva Stella just plain making stuff up about the CASE Act.

The funny thing is that Rose does a pretty decent job of summing up how misinformation can be effectively deployed online, but her description could easily be the Public Knowledge Primer for Writing About Copyright Law:

Misinformation exploits this basic fact of human nature — that no one can be an expert in everything — by meeting people where they naturally are, and filling in the gaps in their knowledge with assertions that seem “plausible enough.” Sometimes, these assertions are misleading, false, or flatly self-serving.  In aggregate, these gap-fillers add up to construct a totally alternate reality whose politics, science, law, and history bear only a passing resemblance to our own.

Right. Kinda like when Stella alluded to the “secret entertainment industry” behind the development of the CASE legislation? Or when the organization claimed in August of 2018 that the “entertainment industry” was trying to sneak a copyright term extension into the NAFTA renegotiations? Those are indeed plausible tweets for anyone who is not expert in copyright law to believe, especially because they feed what Alice Marwick calls a “deep story” that behind every copyright policy initiative is a Hollywood bagman.

Having said all that, Meredith Rose’s article does not say anything categorically false. It is a sincere editorial whose main flaw is that it is sincerely naïve.  “…in the absence of accessible, high-quality, primary source information, it’s next to impossible to convince people that what they’ve been told isn’t true,” she writes.  

Yeah. That psychological human frailty is not going to be cured by putting even more information online, regardless of how “good” it may be, or how copyright figures in the equation.  On the contrary, more information is exactly why we’re wandering in a landscape of free-range ignorance in the first place.  It’s why anti-vaxxers have grown in numbers and brought back the measles; it’s why climate-change deniers get to hold public office and reject scientific data; it’s why even the President of the United States can make public statements that are demonstrably false and tens of millions of citizens don’t give a damn.  There is more than sufficient freely-available, factual information online right now, all produced by professionals and experts on every subject under the sun, and yet this bounty has not mitigated the steady encroachment of flat-earth lunacy into the mainstream conversation.

Speaking as someone schooled in what we might call traditional liberal academia, I believe Rose reiterates a classically liberal, academic fallacy, which assumes that if just enough horses are led to just enough water, then reason based on empirical evidence will prevail over ignorance. That’s not even true among the smartest horses who choose to drink. Humans tend to make decisions based on emotion more than information, and it is axiomatic that truth is in the eye of the beholder.

But if galloping bullshit is the disease, the catalyst causing it to spread is not copyright law keeping content off the internet, but the nature of the internet platforms themselves. Democratizing information with a billion soapboxes was bound to foster bespoke realities occupied by warrens of subcultures that inoculate themselves against counter-narratives (i.e. facts) with an assortment of talismanic phrases used to dismiss the peer-reviewed scientist, journalist, doctor, et al., as part of a conspiracy of people who “don’t want us to know the truth.”

And let us not forget the extent to which the promotion of bullshit is big business. Sure, Cambridge Analytica made headlines. But what about the friendly-looking spin-off from Open Media called New/Mode with its happy icons and upbeat mission statements about community and transparency? The cognitive dissonance needed to square those values with the deployment of “one-click calling” and “tweetstorms” is at the heart of the problem Rose and her friends at Public Knowledge are not just overlooking, but helping to foster.

Social-media activism is designed to trigger the most Pavlovian of emotional responses and overwhelm reasoned debate with numbers. Messages are simple and truth is rare, regardless of source or agenda. Reason cannot defeat such tactics. We could upload all the well-founded science ever written, and it would barely be noticed in the sea of hashtag nonsense many people would prefer to believe. Public Knowledge knows this quite well, having availed itself of these tools and/or celebrated the efficacy of spreading misinformation, at least about copyright law.

If Meredith Rose and her colleagues believe there are unreasonable copyright barriers to certain material, they should make that case on those merits alone and let others respond accordingly. Framing the topic as a broad solution to the effects of toxic and misleading content is too ambitious an overstatement for anybody to make, and Public Knowledge in particular lacks the credibility to assert any authority on the matter.


Photo by cynoclub

Is “Machine Learning” Copying or Reading?

I recently attended a round-table discussion on the subject of artificial intelligence and copyright. The first of several engaging topics I thought warranted a post was the question of “machine learning,” which I put in quotes here with respect to one scholar who admonished against anthropomorphizing AI by using words for human activities to describe the actions of computers. I think that view is fundamentally correct, though there are also grounds for analogy, as will be made clear by the following premise:

When you read a book, even if we might say, by way of analogy, that you are “copying” the content of that book onto your brain, this clearly does not infringe §106(1) of the copyright law proscribing unauthorized copying. Since the author naturally hopes that you will read her book, such a prohibition would be absurd, even if you had an eidetic memory and could, if prompted, recite the entire work verbatim. But if you used that gift to type from memory the entire book and made that document available, you would then violate more than one provision of the copyright law.

So, the question raised in regard to “machine learning” is whether the computer scientist who wishes to feed a corpus of books—say the anthology of American literature—into an AI should be required to obtain licenses for the works still under copyright.  Thus, the first analysis is whether the act of “copying” can be said to occur in this circumstance any more than it would be for the human reader who consumes the same body of literature.

It strikes me that if what the AI does in this case is ingest the corpus of books and almost instantly deconstruct those works by synthesizing them through a neural network, then the computer scientist has a pretty solid argument that no copying has taken place. If the machine does not retain intact copies of works (or even large sections of works) with the purpose of making those intact copies available to the human market, then this “machine reading” process is arguably analogous to the human whose reading does not infringe §106(1) of the copyright law.

That said, the intent of the computer scientist may be a significant factor. For instance, if the training of the AI will have a commercial purpose, this may suggest a requirement to license the works under copyright. But intent can be very tricky on the leading edge of science because it is neither realistic, nor even desirable, to insist that every researcher know exactly where his experiments will lead. This would nullify the process of discovery whence many great achievements have been made; hence, discovery is its own justification, and I suspect the tech companies would appeal to this rationale in regard to “machine learning.”

If the computer scientist’s goal is to see whether he can get his AI to “learn” about the American experience through literature, but he does not have a particular product or service in mind at the outset, it seems that copyright owners would be on fairly shaky ground to enjoin his use of the books.  As long as nothing that comes out the other end looks like any of the products that went in, it strikes me that this experiment exists beyond the statutory framework of copyright law.

Of course this portrait of the individual scientist beavering away in his modest lab to see what he may discover is not what is taking place in reality. We know perfectly well that major AI experimentation occurs in the R&D labs of companies like Google and Facebook, who are well shielded by trade-secret law from divulging what they are working on or for what purpose.  Like any other corporations, they are free to announce a new product or service without telling the public how they arrived at the latest result.

So, even if the use of copyrighted works as source material resulting in a commercial end might recommend some type of licensing regime, it may be very difficult to identify the threshold when the blind process of scientific discovery becomes a clear intent to exploit a commercial opportunity.  And, as mentioned, these companies would be under no obligation to divulge that eureka moment to anyone.  

On the other hand, the moment Google or Facebook did announce that new product, rightsholders could justifiably complain that a massive, highly-profitable corporation has used potentially billions of dollars’ worth of material without paying for any of it. As one scholar at the round-table noted, tech companies may not use raw silicon for free, so why should they get to exploit millions of creative works for free, no matter what they’re turning that data into?

It’s a good question.  One that would seem to suggest a new subsection of the copyright law, and this would certainly be consistent with the fact that new forms of exploitation of works may demand equally new forms of compensation.  If nothing else, that type of statutory response could spare us all the tedious and false harangue that insists “copyright owners just want to stand in the way of innovation.”

That argument prevailed for far too long, and now the so-called innovators have a lot of splainin’ to do about their culture of blind disruption for the sake of disruption. Especially in light of the fact that AI may have some very profound effects on society as we know it, maybe this time around the copyright owners should be treated like experienced voices in the conversation rather than canaries wasting their breath in the proverbial coal mine.