1.
Artificial intelligence is pivoting to video.
OpenAI’s new app, called Sora, turns text prompts into short clips. The videos the company has shown off look impressive at times.
But what’s most astounding about these videos is how bland the text prompts are.
“A litter of golden retriever puppies playing in the snow. Their heads pop out of the snow.”
“A young man in his 20s is sitting on a piece of cloud in the sky, reading a book.”
This is the most advanced tool of its kind. It can purportedly turn any idea into reality. It has the potential to visualize concepts that are beyond our imagination or understanding. And with access to this nearly unlimited power, Sora’s operators decide to give us “A cartoon kangaroo disco dances.”
The most imaginative prompts, like “Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee” are fridge poetry-level word salad. Godard found the universe in a cup of coffee. Sora floats fake boats in it.
2.
One of the most frustrating and least believable claims about artificial intelligence is that it will “unlock creativity.” It does this by effectively doing all the hard work of creating. That is, it does everything except come up with an idea. That, presumably, is the creative part, what an artist provides. But an idea is not art. And the prompts used for Sora so far make this clear.
There have been tools that make the creation of art more affordable and accessible, like digital video or home four-track recorders. They don’t make art any easier, though. They simply solve one problem: the high barrier to entry. This is a shortcut to the myriad problems that follow, which can generally be framed as questions that begin with the word “how”: how to tell a story, how to frame a shot, how to structure a sentence, how to make a rhyme.
To answer these questions, to solve the problems they stem from, the musician with a four-track or the aspiring filmmaker with a DV camera had to develop a vision, a voice, and a technique. They had to turn an idea into an actual work. They had to doubt themselves and do it over and over.
They became artists.
3.
The immediate commercial application of AI video (apart from scams) is to fill in gaps. Can’t afford to fly to Maine for a shot of a lighthouse? Need footage of a smiling elderly couple holding hands for your pharmaceutical ad? AI is here to help. Marques Brownlee makes the point that some of the Sora clips could already replace stock video, which would effectively destroy the stock footage industry. Adobe’s latest video editing software will soon use AI to plug holes in video timelines.
Using AI to replace stock video seems less like unlocking creativity and more like blazing shortcuts through the creative process. If you have gaps in a piece you’re working on, creativity is the way to fill them.
4.
Generative AI video might start with stock footage, but it won’t stop there. This month, the musician Washed Out released a music video for the song “The Hardest Part,” directed by the artist Paul Trillo and made in collaboration with OpenAI. The video is entirely generated by Sora.
Trillo says he had the idea for this video ten years ago, but didn’t have the resources to make it real until Sora came along. It’s odd that this is the result of such a long period of thinking and wishing. If not for the fact that it was made with AI, it would be entirely forgettable. The only elements that seem unique to AI are the mistakes.
Conceptually, it reminded me a bit of the video for Massive Attack’s “Protection,” directed by Michel Gondry.
Music videos are made to be watched over and over. Both “The Hardest Part” and “Protection” tell stories that get clearer over time, but “Protection” rewards rewatching in other ways. Gondry pushes the limits of what a camera can do. You want to watch the video again and again to figure out how it works. “The Hardest Part” falls apart with repeat viewing. Paying attention to the clip just makes the flaws clearer: the odd gravity, the misshapen bodies, the computer hallucinations. “Protection” makes us ask “how did they do that?” The pressing question with “The Hardest Part” is “why is it like that?”
Trillo says of his video:
I wasn’t interested in capturing realism but something that felt hyperreal. The fluid blending and merging of different scenes feels more akin to how we move through dreams and the murkiness of memories. While some people feel this may be supplanting how things are made, I see this as supplementing ideas that could never have been made otherwise. Many artists in this industry are constantly compromising and negotiating their ideas with the reality of what can be made. This offers a glimpse at a future where music artists will be given the opportunity to dream bigger. An overreliance on this technique may become a crutch and it’s important that we don’t use this as the new standard of creation but another technique in the toolbelt.
Trillo has worked with AI before. His video Thank You for Not Answering uses the tech’s glitchy visuals to recreate the feeling of a slipping memory. It’s inspired by the work of Wim Wenders. Wenders is an interesting comparison. His short film Two or Three Things I Know About Edward Hopper deserves several of the adjectives Trillo uses for “The Hardest Part”—it’s hyperreal, dreamlike, and fluid. Wenders brings Hopper paintings to life by making the real world look like a painting. It’s carefully made. There is no compromise in the film, just creative solutions to seemingly impossible challenges.
5.
Trillo’s statement overlooks the clearest and most dire implications of using AI. The technology is replacing people. There is no cast or crew. To treat these artists and craftspeople as line items on a budget—as a hindrance to making art that can only be overcome by technology designed to replace them—is to side with corporations over creatives.
It’s hubris to think writers and directors won’t be next in line for mechanical redundancy.
Not long after the Washed Out video landed online, Apple released an ad for its new AI-ready iPads. In the spot, musical instruments, sculptures, and other implements of creation are crushed in a hydraulic press. (After the public outcry, the company apologized for the ad.)
The same week, news broke that a shortage of work has led the Art Directors Guild to suspend a training program for aspiring art directors.
Before I could watch the Washed Out video on YouTube, I had to sit through an ad for Designrr, a company that promises to create a book from an idea, including all the text, in a few minutes.
6.
The strangest Sora prompt OpenAI has shared is: “Tour of an art gallery with many beautiful works of art in different styles.” There’s no mention of a specific work of art. No style. No idea of what makes any of the works beautiful. The paintings in the video are not beautiful.
Other prompts are very specific about location. They mention Tokyo; Burano, Italy; and the Glenfinnan Viaduct in Scotland.
These are real places. The resulting AI videos are misrepresentations: the hills behind the viaduct aren’t right. The buildings in Tokyo and Burano don’t exist.
Footage of all these places exists.
There are many “beautiful works of art” that exist in reality, too.
There’s no shortage of ideas in the world. There’s no shortage of AI-produced text, photos, and videos, either. Already, the internet—the primary medium for the transmission of culture—is so loaded down with artificial slop that core pieces of infrastructure no longer work. Search is a mess. Information is (even more) dubious. Scams are everywhere. None of this is art. None of it is creative, except in how it tries to separate us from something—from our money, from reality, from each other.
7.
In his video about Sora, Brownlee makes the point that AI technology has come a long way in the last few years. Even Trillo’s AI short film from last year looks dated. AI-generated videos are more realistic. The technology is constantly getting better at looking real.
“This is the worst this technology is going to be from here on out,” Brownlee says.
Brownlee is right. This is the worst the technology is going to be.
But the effects aren’t the worst we’ll ever see.