New open-source AI tool enables much longer, more consistent video generation

If you have used video generation models, you will have noticed one consistent limitation: they produce only short clips, usually between 5 and 20 seconds. The cause is a phenomenon called "drift." Frame by frame, scenes and characters gradually lose their defining features, so the output becomes increasingly incoherent over time.
To tackle this issue, researchers at EPFL's Visual Intelligence for Transportation (VITA) laboratory have developed a novel training method called "retraining by error recycling." Instead of discarding the glitches and deformities that naturally occur during generation, this approach intentionally feeds them back into the model.
Prof. Alexandre Alahi compares the process to "training a pilot in turbulent weather rather than in a clear blue sky." By learning from its own mistakes, the AI becomes robust enough to stabilize itself when errors inevitably appear, rather than spiraling into randomness.
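The article does not spell out the exact training recipe, but the core idea, conditioning the model on its own flawed outputs during training rather than only on clean ground truth, can be sketched roughly as follows. The function name, the recycle_prob parameter, and the simple reconstruction loss are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch of an "error recycling" style training step, assuming a
# PyTorch video model that maps past frames to the next frames. All names
# here are hypothetical; the real SVI procedure may differ in detail.
import torch
import torch.nn.functional as F

def error_recycling_step(model, clip, optimizer, recycle_prob=0.5):
    """Sometimes condition the model on its own (imperfect) generated frames
    instead of the ground-truth frames, so it learns to recover from the
    kinds of errors it actually makes at inference time."""
    context, target = clip[:, :-1], clip[:, 1:]   # past frames -> next frames

    if torch.rand(1).item() < recycle_prob:
        with torch.no_grad():
            # Generate frames with the current model; these carry the very
            # glitches and deformities that would normally cause drift.
            generated = model(context)
        context = generated                        # recycle the errors as input

    pred = model(context)
    loss = F.mse_loss(pred, target)                # learn to steer back toward the target

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The design choice this illustrates is exposure: by training on its own degraded outputs part of the time, the model sees during training the same imperfect inputs it will encounter when generating long videos, rather than only pristine data.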
This method powers the new Stable Video Infinity (SVI) system. Unlike current models that often crumble after 30 seconds, SVI can generate coherent, high-quality videos lasting several minutes or longer. The system is already making waves in the tech community; its open-source code on GitHub has garnered over 2,000 stars, and the research has been accepted for presentation at the 2026 International Conference on Learning Representations (ICLR).
The team is also debuting LayerSync, a companion method that allows the AI to correct its internal logic across video, image, and sound generation. Together, these tools promise to help engineer better autonomous systems and unlock the potential for truly long-form generative media.
Source(s)
SVI via Tech Xplore