Some Observations and Predictions on AI-Generated Art (part 1 of 2)
In this first installment, we'll start off with some basic, simple, no-frills discussion. And in listicle form, too!
I’d like to speculate a bit about AI-generated art, since the topic has been disturbing so many people for the past year or two.
There is no question that computer-generated art is going to have a major effect not only on how we engage with idle entertainment in the coming years, but on how we receive information of any kind. Perhaps as soon as a decade from now, it will have forced us to reconsider much that we had previously taken for granted regarding “art,” “entertainment,” “creativity,” and so forth.
And I’ve seen some good guesses floating around social media as to what specifically will happen – ideas that make sense to me. However, some of these ideas aren’t well developed, and the people making plausible predictions don’t always seem sure why they could be right. So I’ll list some of them here, with my own reasoning added, explaining why I think they make sense.
1.) AI-powered editing will change how we interact with previous works of art, causing people to forget what the original versions of various works were like.
Even before various photo and video filters started being labeled “AI,” older productions were being edited for all sorts of reasons. Shakespeare has been Bowdlerized since the early 19th century. Overly long novels and nonfiction works have been abridged. Movies shot in widescreen were often cropped to 4:3 when that was the standard for televisions, and now things shot in 4:3 are increasingly cropped to fit a 16:9 aspect ratio because that’s the current TV standard. Star Wars was edited so that Han didn’t shoot first. Disney’s Fantasia was edited so that a black female character was removed entirely, leading to some nonsensical animation sequences. Recently, a whole bunch of inoffensive words and phrases in various Roald Dahl books were perceived as offensive, and they’ve been replaced with equally inoffensive but more politically fashionable words and phrases.
Artificial intelligence is just one more tool to assist man in his desire to make the past fit his present circumstances. But it will go quite far. To start with some basic observations: “AI upscaling” can now take standard-definition video and make it high definition fairly convincingly. Soon, we’ll likely have the technology for computers to “guess” reliably what a video shot in a 4:3 aspect ratio would look like if it were broadened to widescreen. AI color correction and various other filters also seem fairly promising. But beyond the merely cosmetic, on the censorship front, there will be plenty of opportunities for editors to do whatever they please. Imagine Al Jolson in The Jazz Singer with his blackface makeup removed. Imagine Breakfast at Tiffany’s with another actor besides Mickey Rooney playing the Japanese caricature Mr. Yunioshi. Given the emphasis on “ethics” that I see in so many AI discussions among self-important “thinkfluencers,” there’s no question that AI will be used to cloud and obscure the racial and sexual realities of art’s past. But alternatively, there will be plenty of opportunity for people to go back and alter things to make them more offensive. There will also be attempts at restorative editing: edits made by people who want to restore what they feel was the artist’s original vision against the conflicting visions of the producers.
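To make the upscaling point concrete, here is a minimal sketch of what that kind of pipeline already looks like today, assuming the Hugging Face diffusers library and the public stabilityai/stable-diffusion-x4-upscaler model; the filenames and prompt are invented for illustration, and this is not any studio’s actual workflow:

```python
# A rough sketch of AI-upscaling a single standard-definition frame.
# Assumes: the `diffusers` library, a CUDA GPU, and the public
# stable-diffusion-x4-upscaler checkpoint; filenames are hypothetical.
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

low_res = Image.open("frame_480p.png").convert("RGB")   # one SD video frame
result = pipe(prompt="a film still, fine detail", image=low_res)
result.images[0].save("frame_upscaled.png")             # roughly 4x the resolution
```

Run frame by frame over an old transfer, something like this is already most of the way to a “remastered” release that no studio ever authorized.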
This kind of editing, moreover, will become so easy and commonplace that it won’t require any official studio re-release. And all of this freedom will lead to a lack of certainty as to which version of a film (or other work of art) ought to be considered definitive. In some ways, it will mark a return to the medieval period, in which a single chivalric romance could exist in dozens of different versions depending on how a given monk chose to copy from his manuscript archetype, since he might swap out one phrase for another, or juggle the scenes around. Some medieval chivalric romances have no single definitive edition precisely for this reason: one couldn’t possibly be established, since the earliest manuscript archetypes are irrecoverable. I suspect that quite a few books, films, comics, etc. will wind up with this same uncertainty despite having none before.
2.) Lots of small visual art (illustration, graphic design, etc.), writing, and acting jobs will go away, and artists will need to seek alternative avenues for steady income.
This one is fairly self-evident, and it’s already happening. AI image generation has proven it can provide content for a bunch of minor jobs, like images for fliers, advertisements, web sites, and metal album covers. Commercial writing will also be greatly displaced. Aspiring illustrators and writers will need to find other avenues in which to cut their teeth and gain experience, and it is already turning into a political question: over the past few months, both writers and actors have been protesting for better regulation of AI. Background actors and extras in particular could be replaced entirely as a workforce by CGI models built from body scans of extras taken on previous movies. What a depressing subject! Let’s move on.
3.) The majority of art/entertainment will be more immersive and participatory than ever before.
Popular media has already been going in this direction for a while. The rise of video games is an obvious example that doesn’t require much explanation. You participate in the game by controlling the player character, and you immerse yourself in the action by experiencing things through his perspective. And the logic of video games, in turn, increasingly pervades various media that were hitherto thought to be consumed only passively. Here is probably the clearest example of how video game logic has become intertwined with other media: the first adventure game for computers came out in 1976 on the PDP-10 mainframe. It’s called Colossal Cave Adventure, and it’s entirely text based. You read about a scenario involving a spelunking expedition, you type in commands for what you want your character to do, and it goes on like that. Three years after its release, the first Choose Your Own Adventure novel came out, which has a remarkably similar underlying logic. The electronic game preceded the paperback series. I’m not suggesting that the novels were inspired by the adventure game (the idea for them had been floating around for quite a while), but the reason they both coalesced around the same time is that the intrinsic features of electronic media had been massaging the public into accepting increasingly participatory media, even in older forms.
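For anyone who has never played it, the entire interaction loop of a game like that can be captured in a few lines. This is a toy sketch, not Colossal Cave’s actual code; the rooms and commands are invented for illustration:

```python
# A toy version of the read-scene / type-command / update-state loop that
# text adventures like Colossal Cave Adventure run on. Rooms and commands
# here are invented; the original game is vastly larger.
rooms = {
    "entrance": {"text": "You stand at the mouth of a cave.", "down": "chamber"},
    "chamber": {"text": "A dark chamber. Water drips somewhere below.", "up": "entrance"},
}

location = "entrance"
while True:
    print(rooms[location]["text"])
    command = input("> ").strip().lower()
    if command == "quit":
        break
    if command in rooms[location]:        # a direction like "down" or "up"
        location = rooms[location][command]
    else:
        print("I don't understand that.")
```

The player supplies the verbs, the program supplies the world: that is the participatory kernel the paperbacks later imitated on paper.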
The logic of video games has also become increasingly prevalent in horror films. In the 00s, horror began to morally implicate the viewer in what he was watching, forcing him to consider various ethical quandaries and place himself in the headspace of the victims, much the same way the most critically lauded video games sometimes compel the player to make ethical decisions that later play a role in the game. In the Saw franchise and, more recently, Squid Game, victims are forced into ethical decisions within warped scenarios reminiscent of Cold War-era game theory problems. It isn’t enough to feel bad for the characters as they struggle to survive, or even think to yourself, “Oh, don’t go in there! Don’t go upstairs! You fool!” while observing the soon-to-be victim, as you would with slasher films. Instead, you’re invited to ask yourself, “What would I do in this situation if it were me?” Or, in some more ambitious stories, you’re invited to question your own role as a participant while the work breaks the fourth wall, as seen not only in the movie Funny Games but also in the video game BioShock. “By engaging with this, have I been part of the problem?” you’re prompted to ask.
(I should mention that the successful Resident Evil films, based on actual video games, are also participatory in their own way, but in an older, more time-tested one: inviting people in the theater to yell things at the screen, not taking the movie particularly seriously. In this sense it’s more like the “midnight movie” phenomenon of the 1970s. So this isn’t too remarkable; if anything it’s a regression.)
Anyway. Visual media has also become increasingly immersive due to its shooting style. While the point-of-view camera in the original Halloween film (1978) was bold and innovative for its time, it’s not nearly so impressive now. Cameras are far more “in the action” than ever before, lacking the sort of detachment they traditionally had in the twentieth century. Rapid cutting from shot to shot has the effect of disorienting the viewer, forcing him to emotionally identify with the apparent disorientation of some distraught character. Unusual camera angles often achieve the same effect. This kind of immersiveness is probably influenced by the cost-efficiency of digital video as opposed to film. In pornography, for a similar example, point-of-view sex scenes became far more prominent starting in the 00s, since any doofus could now buy a cheap, lightweight digital camera and shoot one. “Gonzo” features, which exploded at the same time, would place the director more immediately into the sex scene, so that the production and the plot became one and the same. Try this: watch a dozen horror and/or pornography movies from the mid-1970s, observe the filming style, compare them to what’s around today, and it will become rather clear that even the most visceral, exploitative films were more voyeuristic than immersive. As a viewer, you occupied the position of a fly on the wall. Not so these days.
Lastly, even the runtime of various media productions indicates an increasingly participatory quality. Podcasts, pro wrestling pay-per-views, pornography scenes, and webstreams are often extremely long now. The reason is likely that the producers realize people won’t want to watch all of the content, so they’ll skip around, picking and choosing which parts they’re interested in seeing.
All of this has little to do with artificial intelligence directly, but it suggests a slow evolution of public taste that clears the way for a kind of entertainment in which people pick and choose various images and plot situations, often identifying with some character directly. It looks as though artificial intelligence will further the progress of what has been a gradual development.
4.) Most people will regularly interact with at least some AI art that had no human authorship except for the prompts they put into a computer.
Probably the biggest reason I agree with this assessment is how formulaic a great deal of popular entertainment is now, not just at the level of plot development (which has always been formulaic) but right down to the most minute details of how the characters express themselves. Anime, which functions as a sort of emotion-simulating device for autistic people, is a good demonstration of where things are headed. Everything is heavily cued. During the emotional scenes, the same kind of orchestral string-based minor-key music always plays. During the comedy scenes, the characters’ features become distorted and their voices become loud and exaggerated. The theme or end credits music doesn’t need to have any relevance to the actual show. The emotional tone can switch incoherently from one moment to the next, creating no real structural consistency, even though each emotional situation itself is highly formulaic and driven by cliché. To put it bluntly: the majority of anime from around 2001 onward might as well have been AI-generated.
Moreover, since entertainment is so vast and limitless in its output and range of options, a product’s success increasingly relies on search optimization via easily recognizable keywords. Thus, the keywords increasingly determine the content. So, for instance, if I produce a porno scene, I want it to fit as many keywords as possible, so that if someone searches for “redhead,” “stepsister,” “gangbang,” “teen,” “squirting,” or “schoolgirl,” there’s a chance they’ll find my scene. So I make sure my scene features all of those things! Anime works the same way: “harem,” “mecha,” “high school,” “shonen,” “isekai,” “magical girl,” etc. It is quite easy to imagine how the keywordification of entertainment will pave the way for AI narrative generation, since we’re already encouraged to search for what we want in terms of keywords, and AI prompts are essentially the same thing.
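Just to make the overlap concrete: the tag list that makes a product findable and the prompt you would hand to a generator are nearly the same string. A trivial sketch, with an invented tag list:

```python
# The same tags do double duty: search metadata on one side,
# a generation prompt on the other. Tags and template are invented.
tags = ["harem", "mecha", "high school", "isekai", "magical girl"]

search_metadata = ", ".join(tags)                                    # what the storefront indexes
generation_prompt = "an anime series featuring " + ", ".join(tags)   # what you'd feed a model

print(search_metadata)
print(generation_prompt)
```

Once discovery and production both run on the same keywords, the step from “search for it” to “generate it” is very small.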
5.) AI-generated art will continue to tribalize the public in the same way the internet has done.
When people talk about AI and its effects on art, the discussion sometimes emphasizes AI as an atomizing force. A single individual decides to watch a video, so he pushes some buttons, sets his preferences, and watches it: a piece of media entertainment perfectly tailored to his highly idiosyncratic needs and desires. Then all the other individuals in his home do the same thing, each watching their own unique little movie. And then, before you know it, everyone in the world is gazing at his or her own AI-generated thingamajig, perfectly alone, isolated from the rest of society, staring at their own preferences regurgitated back at them forever. Man is transformed into Narcissus: glassy-eyed, a trickle of drool falling off his chin. The Machine secures Its ultimate victory.
In reality, people are not nearly so interesting as to require their own individualized content. Many of the people who make predictions about AI-generated art seem to forget that humans are essentially social beings, and that providing conversational lubricant has always been, and always will be, one of art’s major functions. When people generate a work of art with a computer, they will want to share it with others, and some generated works will seem interesting enough to satisfy more than one person. Online communities have already formed around various interests, and AI-generated artworks will strengthen those communities, giving them the possibility of creating some sort of miniature canon for themselves. While certainly some uniquely idiosyncratic artworks will be made and promptly discarded (likely things that would fall under the label “guilty pleasure”), artistic expression is far too valuable a tool for group cohesion to be discarded altogether in that context.
Additionally, AI-generated content will probably take on a pronounced ethnic character as well. At the beginning of this discussion, I talked about how AI-powered editing will jeopardize the notion of authoritative versions of various pieces of entertainment. But it will also be used as a tribal aid, particularly along racial lines in multicultural societies. I could easily imagine, for instance, a future in which a Latino teenager decides he wants to watch Terminator 2 for the first time, goes online, and finds some “Latino version” in which Schwarzenegger has been replaced with a Latino actor, and he watches that without ever having seen the original. This may sound silly, but it really seems quite possible, even likely. And if things do go this way, it will compound the problems the internet has already created for the preservation of any nationwide shared cultural memory.
The Dumbest Prediction
Now, having said all this, there is one prediction I find uniquely dumb. It is that all art will be replaced by computer-made AI content – that AI will destroy man-made art once and for all. “I bet you fools think we will still have human creativity thirty years from now!” says the AI soothsayer. “Look at you! Just clinging pathetically to the last vestiges of your failed romanticism! Ha, ha, ha!”
Now, it would be fascinating if that were true, but the claim makes far too many assumptions about the uniformity of our species. The AI soothsayers often like to make wild, exaggerated statements to draw attention to themselves. I suspect the reason is partly that they’re not especially bright, and they feel that aligning themselves with The Machine will fortify them with a veneer of mathematical brilliance and steely-eyed realism. But in reality, these people are falling for a myth of historical unidimensionality rooted in the Enlightenment. This myth says that there’s one Humanity that undergoes one History, and that this History subtends everything that has ever happened, ever. In reality, the idea of a single “History” is a fiction, and the advent of electronic media continues to bear this out.
Humanity is not like a train, and its epochs are not like tunnels through which it passes. We don’t simply go through one and then leave it behind altogether, moving into the next, preparing to leave that behind, too. Rather, when our species develops, it keeps within it all that happened before. When writing was invented, it did not put an end to oral storytelling. When electronic media came along, “the age of the book” supposedly ended, yet somehow we still have plenty of books. With a wider variety of media forms, you see a fragmentation of how various populations or groups engage with them, but rarely do you see outright displacement. Even I find it a bit ridiculous that musicians still put out releases on cassette tape, but nonetheless, it happens. Man, in his ingenuity, continues to find novel uses for old forms.
One likely outcome is that purely human-made content will take on a new importance that it didn’t have before. The Frankfurt Schooler Walter Benjamin, back in 1936, pointed out that mechanical reproduction jeopardizes a work’s aura of authenticity, since no copy can claim to be the authentic original despite looking identical to it. Yet at the same time, only when the mass reproduction of symbols becomes possible does authenticity emerge as a condition to be prized. If a work is proven to be the original item, its perceived value rises. The same can be said of other, non-artistic uses of AI. Consider, for instance, sex-bots. Without a doubt, various proles and undesirables will be jerking their seed into these sex-bots and weeding themselves out of the gene pool forever. They’ll also probably be pretty miserable. But let’s just pretend for a moment that these sex-bots give them everything they want from a real woman, and even look convincing enough. This incredible traversal of the uncanny valley will nevertheless only give the possession of a flesh-and-blood woman a greater sense of prestige among men. That may seem counterintuitive, but social hierarchy dictates that even in a condition of abundance, men will reinterpret it as a condition of scarcity and then compete for the most prized item within that new reassessment of value. So it goes for art.
This means, if I’m right, that we’ll start to seek out signs of human authenticity in art that we didn’t care as much about before. Artists will likely ask themselves what computers can’t do very well, and then go do that. Even “mistakes” might be prized if they’re inimitably human mistakes. Back when I studied medieval Latin, I learned that when you study local documents from rustic villages, you can often determine a document’s authenticity by its mistakes. The likelihood that some rustic landed nobleman or friar would know Latin perfectly was quite low, so a perfectly written document was often more of a red flag than one with grammar mistakes and idiosyncratic spellings.
Authenticity, unfortunately, is a complex subject, and I’ve written enough about AI for today. So I’ll return to it next week when I discuss two more predictions I have, both original, on AI-generated art. They will pertain to aesthetic philosophy and the role of the artist in society.