This AI Video Is So Realistic It Will Make Your Jaw Drop Instantly
By 813 Staff
Engineers and executives are reacting to a strikingly realistic AI-generated video, according to a post this morning by Elias Al (@iam_elias1).
Source: https://x.com/iam_elias1/status/2056987739728625697
We expected polished, cinematic AI video, with flawless physics, perfect lighting, and seamless motion. What we got instead, courtesy of a clip flagged by Elias Al (@iam_elias1) on May 20, is something far more unsettling: a short sequence that looks almost too real, with just enough glitches to remind you it wasn’t shot on any camera. The video, which circulated quietly on X before gaining traction among AI researchers, shows a person walking through a rain-soaked street at dusk. The reflections on the pavement shift unnaturally, and the subject’s gait has a subtle, mechanical repetition. Engineers close to the project say the clip was generated using an unreleased version of a diffusion-transformer hybrid model, one that appears to have cracked the long-standing problem of temporal coherence in AI-generated footage.
The source of the clip remains unconfirmed. Internal documents from a prominent AI video startup, which this newsletter has reviewed in part, describe a model capable of rendering 60-second sequences at 4K resolution with near-zero frame-to-frame flicker. The startup, which has not publicly announced this capability, is said to be weeks away from a beta launch. But the rollout has been anything but smooth. Leaked Slack conversations suggest the team is grappling with “uncanny valley interference” in facial expressions and an inability to handle complex group dynamics—multiple characters interacting still triggers visual artifacts. The clip from @iam_elias1, notably, features only a single subject in a controlled lighting environment, sidestepping those known weak points.
Why this matters: if verified, the clip would mark a genuine leap over the current state of the art. Competitors such as OpenAI’s Sora and Pika Labs have yet to publicly demonstrate output at this quality level without heavy post-processing. The industry has been plagued by hyped demos that don’t hold up under scrutiny. This clip, by contrast, has drawn cautious praise from several academic computer vision labs that received private access to evaluate the output.
What happens next is uncertain. The startup is expected to publish a technical paper within two weeks, but insiders warn that the training compute cost remains prohibitively high—reportedly $10 million per model iteration. Until that cost drops, or until the model can handle multi-subject scenes reliably, this breakthrough may remain a tantalizing glimpse rather than a product. Investors are watching closely.
