Comparing AI Prompts to Button-Pushing on a Camera

Plenty is being said about AI systems that generate visual works, written works, music, etc. And plenty more will be said, especially now that lawsuits have been filed against some of the AI-generated image companies. In this post, I want to address a misconception about authorship in copyright law that may be warping the AI conversation. As I understand the argument, some AI proponents allege that the act of writing prompts is comparable to the act of pushing the button on a camera and, therefore, vests copyright rights in the proverbial “button pusher.”

Although it is possible to conceive a scenario in which this analogy might apply, it is important to first understand that the underlying premise (i.e., that button pushing establishes authorship in a photograph) is wrong. In fact, when photography emerged as the first machine-made work, it posed a challenge to copyright law that still provides an ideal context for discussing what it means to say that copyright protects creative expression the moment the author causes that expression to be fixed in a tangible medium. Note that the key ingredients are expression, an author, and fixation, and inherent to the process binding all three is an interval of human effort enabling the author’s concept (or vision) of the expression to be manifest as fixation.

With photography, the interval of effort may be stately or a mere fraction of a second, but copyright law does not discriminate between the photographer who carries a vision in her mind for weeks of preparation and arrangement and the photographer who captures a fleeting moment from real life. In both cases, triggering the shutter is the proximate cause of fixation,[1] but vesting copyright rights in the photographer is predicated on an assumption that, even in a fraction of a second, she made creative choices sufficient to find a modicum of original expression in the image.

Various Scenarios in Which It Is Not About the Button

In the case of a studio shoot with a lot of preparation, lighting, props, wardrobe, etc., the photographer may not even touch the camera very often. It may be mounted on a tripod with an assistant triggering the shutter from a computer or remote control while the photographer directs all the creative aspects that comprise the resulting images. Copyright holds unequivocally that this individual is the author of the photographs because it is his expression that is being fixed in each image; the “button-pushing” is irrelevant except as a purely mechanical step in fixation.[2]

For the street photographer or photojournalist, the same principles apply, but copyright allows for the arguably metaphysical assumption that even in the tiny interval between seeing the real-life subject and capturing it, the photographer makes subtle choices that imbue the work with sufficient expression to be protected. Again, the button causes fixation but is not the basis of authorship, and this would be evident in the analysis of the content and qualities of the photograph, if it were to become the subject of a copyright infringement lawsuit.

By contrast, if a truly accidental photograph is captured (e.g., by a camera accidentally dropped from the Eiffel Tower), there is no authorship in that image—not because a human did not push the button, but because there is no colorable nexus between the human’s mental conception and the resulting photograph. On the other hand, if a photographer intentionally drops a camera from the Eiffel Tower and triggers the shutter by remote on its way down, copyright attaches to those images—not because a human pushed the button, but because a human conceived of the series of falling photographs and arranged the circumstances by which they could be made.

Although it is important to note that cameras are not machines trained with a corpus of existing photographs, this last example may be the closest analogy to the prompt directing the AI generator (in its current state) to make an image. If the prompt writer has a general sense of the image she wants to produce, but there is still an element of chance about what the machine will make, the prompt writer may argue that she is no less an author than the photographer who intentionally allows some element of chance into the process of making his images.

While this premise sounds reasonable as a general proposition, what it really implies is a case-by-case consideration as to how much human expression exists in the resulting works. Even in the example of the camera tossed intentionally off the Eiffel Tower, the photographer can control certain qualities in the images and may even have a vision for how they are to be used, displayed, or distributed. He knows the characteristics of the camera and lens and can select settings with the intent to control some of the qualitative results in the final photos.

By contrast, the prompter directing the image-generating AI is arguably not in control of enough of the qualitative elements in the final image to claim authorship—at least not at the current state of the technology. Entering the prompt “A mermaid wrestling a sea lion in outer space in the style of Cartier-Bresson” may produce an image that checks each of those boxes, but the prompt writer is not controlling the qualitative choices that comprise the result. Composition, line weight, shading, lighting, texture, scale, proportion, etc. are all “selected” by the AI based on what it has “learned” from the millions of visual works fed into its code, so there is a critical disconnect between the human’s vision of “A mermaid wrestling a sea lion in outer space in the style of Cartier-Bresson” and the interval of effort that fixes the image in a tangible medium.

At some future state of the technology, the human may prompt a draft image to be made and then prompt changes to the qualitative elements, at which point it may be tough to deny that there is authorship in the resulting work. If these technologies develop in this way—such that the prompter is essentially painting with words instead of a stylus—this anticipates that, for instance, a disabled individual could truly create visual works with her mind akin to the way Stephen Hawking wrote books. But in this paradigm, the AI does not present a unique challenge to the concept of authorship because the human is in control of sufficient expression in the work.

Dynamic Ethical Standards

Of course, this theoretical discussion assumes integrity among individuals who claim authorship in various works. The guy whose camera accidentally snaps a photo does not have to admit he played no role in its making, and AI currently presents a similar challenge. The issue of integrity is a hot conversation we’re having in response to generative AI—especially in academia where ChatGPT is already “writing” papers for students. Notably, few people would question the judgment that the student who turns in a paper “written” by an AI is a cheat deserving the same sanctions as if he were caught plagiarizing. Yet, somehow, when the material is a “creative” work, AI advocates argue that the prompter is an author of a visual work comparable to a photographer using a camera.

This dichotomy can only be reconciled by confronting the fact that certain uses of AIs are not only not authorship but are needlessly destructive to the very purpose of intellectual and cultural endeavor. The student who shirks writing his own paper learns nothing and so potentially graduates from a program unqualified. Likewise, the prompter using an image-generating AI is not an artist and contributes nothing to the purpose of art. Thus, while there may be uses for these systems, their potential cultural value depends on more than technological development for its own sake.

Because these technologies are still new and still primitive relative to their expected capabilities, it is hard to predict where the more serious aspects of the narrative will lead. Some of the generative AIs are barely more than toys at the moment (e.g., turning profile pics into oil paintings), but what they will do a year from now, let alone five years, will inform how we address the issues—cultural, legal, and ethical. For now, though, I insist that no, prompting is not equivalent to button-pushing with a camera, even if button-pushing were as significant as many people think it is.

[1] This is true with digital photography. With film, one could argue that the latent image on the negative is not fixation until it is at least developed because it cannot be perceived by either human or machine reader.

[2] And there are likely to be further steps like retouching or printing, which may fix the final version of the image.

Photo by author.

AI “Art” is Boring

Adam was bored alone; then Adam and Eve were bored together; then Adam and Eve and Cain and Abel were bored en famille; then the population of the world increased, and the peoples were bored en masse. To divert themselves they conceived the idea of constructing a tower high enough to reach the heavens. This idea is itself as boring as the tower was high, and constitutes a terrible proof of how boredom gained the upper hand. – Soren Kierkegaard (1843) –

I had not thought about Kierkegaard writing on the subject of boredom in years. The essay from which the above quote is extracted was a favorite in college for its biting humor, but something about Rogers Brubaker’s excellent article about democratizing culture sent me in search of my 38-year-old (ouch) copy of The Kierkegaard Anthology, and I think it was this paragraph of Brubaker’s which triggered the thought:

But the question is not just how many people engage in cultural production — it’s how people engage. The AI music company Amper promises to help customers “create your own original music in seconds.” The creativity involved is rather attenuated, amounting to editing and tweaking the music generated by the AI, but that didn’t stop Amper co-founder Drew Silverstein from evangelizing in a TED talk about how AI can “democratize music” by enabling “anyone to express their creativity through music.”

That promise to “create your own original music in seconds” was the portkey back to Kierkegaard. “In the case of children, the ruinous character of boredom is universally acknowledged,” he writes, and, indeed, I maintain that boredom is the inevitable outcome of AI toys promising to make music, visual art, poetry, etc. We have all experienced as children and witnessed as adults that transition between playing with a new toy and rapid disenchantment because the toy fails to engage the imagination. I am not the only Gen-X parent, for instance, to notice that when LEGO began selling kits to build branded objects like Star Wars spaceships, my own children would usually complete the assembly once and then be done with the toy forever. By contrast, my contemporaries and I spent hours with sets composed of bricks and no predetermined design.

Kierkegaard proposes that the plebeian bores others and amuses himself while the aristocrat amuses others and bores himself—a dialectic perhaps well suited to describe the inevitable use of AI machines to “make one’s own music or art.” At the current state of the technology, the input of the human user is barely creative—little more than dropping a coin in a jukebox—and thus, all users similarly situated are plebeian bores for the time being. The works resulting from their prompts may amuse them (for a while), but they will mostly bore others who will only be interested in “making their own music” with the same toys. Before long, a million individual users of the music-generating AI will achieve a collective homeostatic boredom—a two-dimensional Babel leading nowhere.

Perhaps one of these accidental works will reach escape velocity, break through the gravitational force of mass boredom, and “go viral” for a fleeting period. Some AI-generated ditty might be next year’s “Baby Shark” or even share the apotheotic luminance of a “Gangnam Style.” Someone will choreograph a short dance to accompany the tune, and TikTokers will fall in line to perform their versions, and Big Tech will look down and see that it is good, and their disciples will proclaim, “Behold the new culture! The human songwriter is an anachronism.” And it will all be as boring as it is ephemeral.

It is possible, of course, that generative AIs will become sophisticated enough to be collaborative tools wielded by the human artists—that the human still selects and arranges the creative elements to achieve her vision while the AI “helps” in some way. If and when we get there, we shall see. But in the meantime, it is clear that AIs do not need to be more sophisticated to replace some creative human work right now. My good friend Marco North writes on Facebook to me, “A full roster of AI voice talent costs less than $100 a month, works 24/7 and [will] do endless revisions….Voice work is perfect gig work for actors, say goodbye to lots of that.”

A gifted polymath in film, photography, music, poetry, and prose, Marco writes a weekly blog called Impressions of an Expat. He initially wrote it from Moscow and now writes from Tbilisi, and in his latest post, he describes a happenstance encounter with the statue of Georgian poet Vazha-Pshavela (Luka Razikashvili) and his feelings about AI “art.” He asks:

Who will be the subject of the next statue? An algorithm? Will there be streets named after TikTok? Will we name a playground after a Spotify playlist curator? These are the people that tell our stories now. Midjourney highway will take you there. Take a left at ChatGPT square, you can’t miss it.

Yes. That is a vision of a possible future. Of course, if the tech giants can make the world just boring enough, then certain humans will do what certain humans do. They will disassemble the unengaging toy and turn it into something else—something called art. And then, the world will start to be interesting again.

Art is Human

A few months ago, I attended a local event, where photographer Doug Menuez spoke about his project “Wild Place: The People of Kingston, NY.” The description on his website begins . . .

Wild Place is the English translation of Wiltwyck, the original name given to Kingston, New York, in 1661 by Peter Stuyvesant and the Dutch who were facing fierce resistance from local Native Americans. My wife Tereza and I recently moved back to Kingston after a decade away and can see lots of changes, with more to come. It seems like an important moment.

Combining portrait and documentary in both photographs and short video interviews, “Wild Place” presents contemporary Kingston through Menuez’s view of its artists, activists, entrepreneurs, community leaders, and—not surprisingly—people who fit all those descriptions. While listening to Doug talk about the project, I was reminded why I care so much about artists and their work: because through art and artists, we renew profound, even cathartic, connections to what it means to be human and, in turn, reinforce the reasons why humans bother to make art. My schedule does not permit frequent attendance at such events, but listening to Doug’s articulate, thoughtful, even spiritual discussion about his work was as close as I come to listening to a sermon.

In my last post commenting on visual works generators like DALL-E, et al., I reiterated the view held by many that the notion of “AI art” is oxymoronic—as devoid of meaning as having a machine perform a religious rite for its human owner. Whatever creative work without humans ought to be called, it is not art. As such, I maintain that nobody will be interested in works made exclusively by machines for very long and that the current buzz about these generative algorithms may ebb quickly into the sea of trends to swirl in gooey eddies of crypto and NFTs.

This is not to suggest that creators and advocates of creators’ rights should ignore current threats to human artists, or that generative AIs do not preface an even darker version of the “information age” than the present state of madness. In a Facebook post that has been widely shared, a philosophy professor describes catching the first student in his class to use a bot called ChatGPT to write an assigned essay about David Hume. “The essay confidently and thoroughly described Hume’s views on the paradox of horror in a way that were [sic] thoroughly wrong,” the professor writes. “It did say some true things about Hume, and it knew what the paradox of horror was, but it was just bullshitting after that. To someone who didn’t know what Hume would say about the paradox, it was perfectly readable—even compelling.”

That last sentence is unsettling in a world buffeted by conspiracy mongers and alternative facts. No Alex Jones or Donald Trump or Stewart Rhodes required. The next cult figure can be an algorithm producing a “readable—even compelling” restatement on any matter from the Enlightenment to the suppression of viral disease. It is intriguing, if depressing, that a college student attempted to cheat by means of an AI to avoid honest engagement with Hume’s essay Of Tragedy, which contains the following observation:

We find that common liars always magnify, in their narrations, all kinds of danger, pain, distress, sickness, deaths, murders, and cruelties; as well as joy, beauty, mirth, and magnificence. It is an absurd secret, which they have for pleasing their company, fixing their attention, and attaching them to such marvellous relations, by the passions and emotions, which they excite.

Hume could be commenting on the recently announced Trump NFT “trading cards,” which appear to comprise stolen images from the internet and badly photoshopped heads in a series of bizarre portraits depicting Trump as soldier, rancher, business leader, and even a costumed and be-muscled superhero with lasers shooting from his eyes. I got nothin’ except to say that there is no paradoxical pleasure in viewing this particular horror.

On a more sophisticated level, generative algorithms like MidJourney, DALL-E, and Stable Diffusion are all “trained” by inputting a corpus of human-made creative works, most of which are scraped from the internet without permission of any living artists who still own the rights to the works. As PetaPixel reports, MidJourney founder David Holz flatly admits feeding his system millions of images without permission, and illustrator Molly Crabapple, in an OpEd for the L.A. Times, writes:

While they destroy illustrators’ careers, AI companies are making fortunes. Stability AI, founded by hedge fund manager Emad Mostaque, is valued at $1 billion, and raised an additional $101 million of venture capital in October. Lensa generated $8 million in December alone. Generative AI is another upward transfer of wealth, from working artists to Silicon Valley billionaires.

That these AI “art” generators represent yet another example of economic destruction without the creative part is a certainty. Less certain are some of the copyright questions, for instance, whether input of protected works for “machine learning” is infringement. This will remain a theoretical/ideological debate for attorneys, academics, and copyright nerds like me until one of two things happens: legislation or litigation, both of which move at a crawl compared to the market for new tech toys. If a lawsuit began tomorrow, for instance, it would be hard to say whether the legal questions presented will still be relevant to the market by the time the case is resolved.

Perhaps the real potential of the generative algorithm lies not with illustration or design or music composition, but with medical diagnostics or some other valuable purpose. If computer science is a true science, then it must allow for unintended discovery, and who’s to say that an experiment in “AI art” cannot be the precursor to an algorithm that helps identify genetic disposition for certain infections?

This does not mean, of course, that we should excuse models in the present that undermine the rights or value of the human artist. On the contrary, I mention this alternate history to emphasize the point that of all the things we can do with computing power, one thing we absolutely do not need is machines that make “art.” Tellingly, Hume’s essay is mostly about art, and to the question whether creative expression about tragedy can provoke a sense of pleasure for the audience, he replies:

This extraordinary effect proceeds from that very eloquence, with which the melancholy scene is represented. The genius required to paint objects in a lively manner, the art employed in collecting all the pathetic circumstances, the judgment displayed in disposing them: the exercise, I say, of these noble talents, together with the force of expression, and beauty of oratorial numbers, diffuse the highest satisfaction on the audience, and excite the most delightful movements.

Maybe the AI cheerleaders will accuse me of anthropic maximalism, but in addition to doubting that an “AI artist” could ever express anything close to the transcendent experience Hume describes, I am certain that we do not want it to even try. Art is human. There are better uses for computers.

Photo by: Abrill

The Illusion of More

Dissecting the digital utopia.

Category: Art
