
Researchers propose AI that improves the quality of any video

Increasingly, researchers are using AI to transform historical footage, like the Apollo 16 moon landing and the Lumière brothers' 1895 film “Arrival of a Train at La Ciotat station,” into high-resolution, high-frame-rate videos that look as though they were shot with modern equipment. It's a boon for preservationists, and as an added bonus, the same techniques can be applied to footage for security screening, television production, filmmaking, and similar scenarios.

In an effort to simplify the process, researchers at the University of Rochester, Northeastern University, and Purdue University recently proposed a framework that generates high-resolution, slow-motion video from low-frame-rate, low-resolution video. They say their approach, Space-Time Video Super-Resolution (STVSR), not only generates quantitatively and qualitatively better videos than existing methods, but also runs three times faster than previous state-of-the-art AI models.

In some ways, the work advances an AI model Nvidia detailed in 2018, which could apply slow motion to any video regardless of the video's frame rate. Similar upscaling techniques have also been applied in the video game domain: last year, Final Fantasy fans used a $100 piece of software called A.I. Gigapixel to improve the resolution of Final Fantasy VII's backdrops.

STVSR simultaneously learns temporal interpolation (i.e., how to synthesize intermediate video frames that don't exist between the original frames) and spatial super-resolution (how to reconstruct a high-resolution frame from the corresponding reference frame and its neighboring supporting frames). Moreover, thanks to a companion convolutional long short-term memory (ConvLSTM) model, it can align and aggregate a video's temporal context, so that high-resolution frames are reconstructed from the aggregated features.
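To make that one-stage design concrete, here is a minimal sketch of how feature-level temporal interpolation and recurrent aggregation could fit together in PyTorch. Everything below is an illustrative assumption rather than the authors' method: the module names (such as OneStageSTVSR), the plain convolutional blend standing in for the paper's deformable feature interpolation, and the GRU-style convolutional update standing in for the ConvLSTM are all hypothetical, since the source code had not been released at the time of writing.

```python
# A minimal, illustrative sketch of one-stage space-time video
# super-resolution. NOT the authors' STVSR code; all names, shapes,
# and submodules here are simplifying assumptions.
import torch
import torch.nn as nn

class TemporalFeatureInterpolation(nn.Module):
    """Synthesizes the feature map of a missing intermediate frame
    from its two neighbors (features, not pixels)."""
    def __init__(self, channels):
        super().__init__()
        # Hypothetical stand-in for the paper's deformable feature
        # interpolation: a plain learned blend of the two neighbors.
        self.blend = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feat_prev, feat_next):
        return self.blend(torch.cat([feat_prev, feat_next], dim=1))

class OneStageSTVSR(nn.Module):
    def __init__(self, channels=64, scale=4):
        super().__init__()
        self.extract = nn.Conv2d(3, channels, 3, padding=1)
        self.interp = TemporalFeatureInterpolation(channels)
        # ConvLSTM-style recurrence, reduced here to a single
        # convolutional state update for brevity.
        self.update = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.upsample = nn.Sequential(
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),  # rearranges channels into space
        )

    def forward(self, frames):
        # frames: (batch, time, 3, H, W) low-res, low-frame-rate clip
        feats = [self.extract(frames[:, t]) for t in range(frames.size(1))]
        # Insert an interpolated feature between each neighboring pair,
        # roughly doubling the temporal resolution at the feature level.
        dense = []
        for a, b in zip(feats[:-1], feats[1:]):
            dense += [a, self.interp(a, b)]
        dense.append(feats[-1])
        # Recurrently aggregate temporal context across the sequence,
        # then reconstruct a high-resolution frame from each state.
        state = torch.zeros_like(dense[0])
        outs = []
        for f in dense:
            state = torch.tanh(self.update(torch.cat([f, state], dim=1)))
            outs.append(self.upsample(state))
        return torch.stack(outs, dim=1)  # (batch, 2T-1, 3, sH, sW)

model = OneStageSTVSR()
clip = torch.rand(1, 4, 3, 32, 32)  # 4 low-res frames
print(model(clip).shape)            # torch.Size([1, 7, 3, 128, 128])
```

The property the sketch preserves is the one the researchers emphasize: missing frames are synthesized as features rather than pixels, so a single network can interpolate in time and upscale in space in one pass instead of chaining two separate models.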


The researchers trained STVSR on a data set of over 60,000 seven-frame clips from Vimeo, with a separate evaluation corpus split into fast-motion, medium-motion, and slow-motion sets to measure performance under various conditions. In experiments, they found that STVSR obtained “significant” improvements on videos with fast motion, including “challenging” motions like basketball players quickly moving up a court. Moreover, it demonstrated an aptitude for reconstructing “visually appealing” frames with more accurate image structures and fewer blurring artifacts, all while remaining up to four times smaller and at least two times faster than the baseline models.

“With such a one-stage design, our network can well explore intra-relatedness between temporal interpolation and spatial super-resolution in the task,” wrote the coauthors of a preprint paper describing the work. “It enforces our model to adaptively learn to leverage useful local and global temporal contexts for alleviating large motion issues. Extensive experiments show that our … framework is more effective yet efficient than existing … networks, and the proposed feature temporal interpolation network and deformable [model] are capable of handling very challenging fast motion videos.”

The researchers intend to release the source code this summer.


Author: Kyle Wiggers.
Source: VentureBeat
