I have been doing AI art for almost three years. Over the last few months I have made a number of short AI videos. This has given me some perspective on what is and is not possible with AI video. My conclusion thus far is that AI video is not competing with conventional movies. Rather, it should be seen as its own art form. I will explain why here.
Every artistic medium has its own limitations, and the most enduring works often emerge not by overcoming these limitations, but by working creatively within them. Comics, poetry, novels, movies, and songs all have unique constraints, and each demands a different approach to storytelling. AI video is no exception. It is not a replacement for traditional film or animation, but a new form of expression—one that invites us to rethink how we construct and communicate narratives.
Take comics, for instance. Donald Duck’s nephews all look the same—identical ducklings distinguished only by the color of their hats. This wasn’t an oversight; it was a deliberate design solution to a drawing problem. Similarly, in Teenage Mutant Ninja Turtles, the characters are drawn with identical bodies and faces, but they are made distinguishable through their colored bandanas and weapons. These choices are not failures of detail, but practical solutions to the medium’s limits.
In manga and other serialized comics, artists frequently reuse poses, backgrounds, and even facial expressions to save time and maintain consistency. In Peanuts, Charles Schulz used minimal backgrounds so the reader would focus on dialogue and emotional nuance. In Asterix, each character is exaggerated into an easily recognizable archetype. These are not bugs in the system—they are features that emerge from understanding what the medium can and cannot do.
We see similar adaptation in film. Movies cannot directly show us the inner thoughts of characters the way novels can, and they usually have no narrator to fill in details. So they use symbolic visuals or dialogue as workarounds. A book might say:
Todd waved to his friend Josh and said, “How are you doing, man?”
“Awesome!” Josh exclaimed.
But in a movie this might become:
Todd: “How are you doing, Josh?”
Josh: “Awesome, Todd!”
Why would you need to change the dialogue? Movies usually do not have narrators, so everything has to reach the viewer through visuals or dialogue. That means things like character names might have to be woven into what people say. Of course, once the viewer knows the names, these artificial changes to the dialogue are no longer needed.
And dialogue is not the only way. One could introduce a character in an office by zooming in on their name tag. But whatever method a director chooses, it is a way of deliberately working around the limitations of the medium in which they tell their story.
Music videos often rely on montage and repetition because they only have a few minutes to tell a story or evoke a mood. Lyrics may be vague or metaphorical because the combination of imagery and music can do more together than text alone ever could.
AI video, too, has its own set of limitations—and strengths. One of the most obvious challenges is inconsistency: character faces change between shots, body shapes shift, backgrounds transform as scenes evolve. A character may be wearing a red jacket in one scene and a slightly different version in the next, even within the same setting.
Trying to force AI to replicate traditional film standards, with perfect visual continuity, realistic motion, and flawless scene composition, is a losing battle. Doing so defeats the entire point of using the medium. If you insist on perfection, you might as well go back to green screens and 3D rendering.
Instead, the key is to design stories that embrace the logic of AI video. Just as comics use consistent hairstyles and clothes to ensure character recognition, AI video creators should lock in distinctive visual markers. Hair color, hairstyle, signature clothing, and recognizable props can go a long way toward maintaining continuity where faces cannot.
For example, if I were creating a scene with three fantasy characters appearing together, I would give each one a different hair color but keep their body types and clothing styles relatively similar. This isn’t a compromise in imagination, but a practical way to avoid AI rendering errors and mismatches. Matching unique outfits and varied body types to specific characters can quickly become unmanageable with current tools. Uniformity in shape and wardrobe, paired with distinct colors or accessories, makes consistency more achievable.
The fantastical worlds AI can create are unlike anything achievable through conventional filmmaking on a modest budget. You can conjure vast alien cities, surreal landscapes, impossible architecture—but what you gain in flexibility, you sacrifice in control. That is the trade-off. The worlds feel like dreams: vivid, expressive, and mutable. But like dreams, they are also inconsistent, unstable, and often disjointed.
This is why I suggest thinking of AI video as a kind of dream-logic medium. In dreams, details shift, faces blur, rooms change shape. But the emotional thread remains. We accept the strangeness because we recognize the deeper logic underneath. AI video works best when it is structured like a dream: symbolic, visually poetic, emotionally charged, and not overly burdened by realism.
If you’re struggling to make AI video “perfect,” you might be misunderstanding what it excels at. It is not a precision tool. It is a painter’s brush dipped in chaos. It is the imagination running slightly off-track. Accepting this frees you to tell stories that would be impossible through other means.
Let AI video be what it is: a new medium with new rules. Not a replacement for cinema, but something like comics, theater, or experimental poetry—a form with its own grammar, strengths, and limitations. The question is not whether AI can make a perfect movie, but whether you can make something people care about, remember, and feel.
And if you can, then you are already working within the frame.