Google AI Introduces Frame Interpolation for Large Motion (FILM): A New Neural Network Architecture for Creating High-Quality Slow-Motion Videos from Near-Duplicate Photos

Frame interpolation, which synthesizes intermediate frames between a pair of input frames, is an increasingly active research topic. This kind of temporal upsampling can be used to raise a video's refresh rate or to create slow-motion footage.
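As a rough illustration of temporal upsampling, the sketch below repeatedly inserts a midpoint frame between neighbouring frames, so that `depth` rounds produce 2**depth - 1 in-between frames. The names `upsample_pair` and `blend_midpoint` are hypothetical, and the midpoint predictor here is a plain cross-fade standing in for a learned interpolator; it does not model motion and is not the method described in this article.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # assumption: a frame flattened to a list of pixel intensities


def blend_midpoint(a: Sequence[float], b: Sequence[float]) -> Frame:
    """Placeholder midpoint predictor: a plain average of the two frames.

    A learned interpolator would replace this; a cross-fade ignores motion.
    """
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def upsample_pair(
    first: Sequence[float],
    last: Sequence[float],
    depth: int,
    midpoint: Callable[[Sequence[float], Sequence[float]], Frame] = blend_midpoint,
) -> List[Frame]:
    """Return first, last, and the 2**depth - 1 frames between them, in order."""
    if depth == 0:
        return [list(first), list(last)]
    middle = midpoint(first, last)
    left = upsample_pair(first, middle, depth - 1, midpoint)
    right = upsample_pair(middle, last, depth - 1, midpoint)
    return left + right[1:]  # drop the duplicated middle frame


frames = upsample_pair([0.0, 0.0], [8.0, 4.0], depth=3)
```

With `depth=3`, two input frames become nine frames in total, which would play back as roughly 8x slow motion at the original frame rate.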

A new application has emerged recently. Because digital photography makes it so easy to take pictures, people often capture several shots in quick succession, within seconds, to find the best one. Interpolating between these near-duplicate photos reveals the motion of the scene (and sometimes of the camera), often conveying a more appealing sense of the moment than any one of the original photos. However, conventional interpolation approaches face a significant hurdle with still photos, because the time interval between near-duplicates can be a second or more, with commensurately large scene motion.

Recent approaches have shown promising results on the difficult problem of frame interpolation between consecutive video frames, which usually exhibit only small motion. However, interpolation for large scene motion, which is common in near-duplicate photos, has received little attention. One earlier study tried to address the large-motion problem by training on an extreme-motion dataset, but its performance on small-motion tests was disappointing.

A recent study by Google and the University of Washington proposes FILM (Frame Interpolation for Large Motion), an algorithm for large-motion frame interpolation that focuses on near-duplicate photos. FILM is a simple, unified, single-stage model that can be trained on ordinary frames alone and does not require additional optical-flow or depth prior networks or their scarce pre-training data. It includes a “scale-agnostic” bidirectional motion estimator that can learn from frames with normal motion yet still generalizes well to frames with large motion, and a “feature pyramid” that shares weights across scales. The authors adapt a weight-shared multi-scale feature extractor and present a scale-agnostic bidirectional motion estimator that can efficiently handle both small and large motion using only standard training frames.
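To make the idea of a weight-shared multi-scale feature extractor more concrete, here is a minimal sketch: it builds an image pyramid by 2x2 average pooling and applies one and the same extractor to every level. Everything in it is an assumption for illustration. The function names are hypothetical, and the "extractor" is a fixed horizontal difference rather than the learned convolutional network a real model would use.

```python
from typing import Callable, List

Image = List[List[float]]  # assumption: grayscale image as rows of pixel values


def downsample_2x(image: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block (even sizes assumed)."""
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, len(image[0]), 2)
        ]
        for r in range(0, len(image), 2)
    ]


def build_pyramid(image: Image, levels: int) -> List[Image]:
    """Level 0 is the input; each further level has half the resolution."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


def horizontal_gradient(image: Image) -> Image:
    """Toy feature extractor: difference between horizontally adjacent pixels."""
    return [[row[c + 1] - row[c] for c in range(len(row) - 1)] for row in image]


def shared_features(
    image: Image,
    levels: int,
    extractor: Callable[[Image], Image] = horizontal_gradient,
) -> List[Image]:
    """Apply the same extractor (same parameters) to every pyramid level."""
    return [extractor(level) for level in build_pyramid(image, levels)]


ramp = [[float(c) for c in range(4)] for _ in range(4)]
features = shared_features(ramp, levels=2)
```

Because the extractor is identical at every level, whatever it learns at one scale is automatically available at all the others, which is the point of sharing weights.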

Based on the intuition that large motion at a fine scale should resemble small motion at a coarse scale, the method increases the number of pixels available for large-motion supervision (since a finer scale corresponds to a higher resolution).
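The arithmetic behind this scale argument can be sketched as follows, assuming a pyramid in which each level halves the resolution of the previous one (a common convention, not a detail stated here). The function names and the 256x256 frame size are hypothetical.

```python
def displacement_at_level(pixels_at_full_res: float, level: int) -> float:
    """Apparent motion in pixels after `level` rounds of 2x downsampling."""
    return pixels_at_full_res / (2 ** level)


def pixels_at_level(height: int, width: int, level: int) -> int:
    """Number of pixels at a pyramid level (level 0 is full resolution)."""
    return (height // 2 ** level) * (width // 2 ** level)


# A 64-pixel shift at full resolution looks like an 8-pixel shift at level 3.
coarse_shift = displacement_at_level(64, level=3)

# A hypothetical 256x256 frame: far more pixels at level 0 than at level 3.
fine_pixel_count = pixels_at_level(256, 256, level=0)
coarse_pixel_count = pixels_at_level(256, 256, level=3)
```

In this example the fine level holds 64 times as many pixels as level 3, so a motion estimator whose weights are shared across levels sees far more training signal at fine scales than one trained at the coarse level alone.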

The researchers noticed that, although state-of-the-art algorithms perform well on benchmarks, their interpolated frames often look blurry, especially in large disoccluded regions that result from large camera movements. To address this issue, they optimize their models with a Gram matrix loss, which matches the autocorrelation of high-level VGG features and produces striking improvements in image sharpness and realism.
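A Gram matrix loss of this kind can be sketched in a few lines. The code below assumes the feature maps are already available as a channels-by-positions table; in the setting described above they would come from a VGG network, which is not included here. The normalization by the number of matrix entries is an assumption, and the function names are hypothetical.

```python
from typing import List, Sequence

FeatureMap = Sequence[Sequence[float]]  # channels x flattened spatial positions


def gram_matrix(features: FeatureMap) -> List[List[float]]:
    """Channel-by-channel autocorrelation: G[i][j] = sum_k F[i][k] * F[j][k]."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in features]
        for row_i in features
    ]


def gram_loss(predicted: FeatureMap, target: FeatureMap) -> float:
    """Mean squared difference between the two Gram matrices."""
    g_pred, g_true = gram_matrix(predicted), gram_matrix(target)
    channels = len(g_pred)
    total = sum(
        (g_pred[i][j] - g_true[i][j]) ** 2
        for i in range(channels)
        for j in range(channels)
    )
    return total / (channels * channels)


predicted_features = [[1.0, 2.0], [3.0, 4.0]]
target_features = [[1.0, 0.0], [0.0, 1.0]]
loss = gram_loss(predicted_features, target_features)
```

Because the Gram matrix sums over spatial positions, the loss compares the statistics of the features rather than their exact locations. This is why such losses are often associated with texture and sharpness rather than pixel-exact alignment.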

Besides depending on scarce data to pre-train additional optical-flow, depth, or other prior networks, modern interpolation techniques are significantly limited by their training complexity. The scarcity of data is particularly problematic for large motion. This study also contributes a unified architecture for frame interpolation that can be trained using only standard frame triplets, which greatly simplifies the training procedure.
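For readers unfamiliar with the term, a frame triplet is usually three consecutive frames in which the two outer frames are the inputs and the middle frame is the ground truth to predict. The sketch below shows one way such triplets might be cut from a frame sequence. The helper name and the sliding-window stride of one are assumptions, not details from the paper.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def make_triplets(frames: Sequence[T]) -> List[Tuple[T, T, T]]:
    """Slide a window of three frames: (input_a, target_middle, input_b)."""
    return [
        (frames[i], frames[i + 1], frames[i + 2])
        for i in range(len(frames) - 2)
    ]


triplets = make_triplets(["f0", "f1", "f2", "f3"])
```

No optical-flow or depth annotations appear anywhere in this data, which is what makes triplet-only training simpler to set up.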

Extensive experiments show that FILM produces high-quality, temporally smooth videos and outperforms competing approaches on both large and small motion.

This article is a research summary written by Marktechpost staff based on the research paper 'FILM: Frame Interpolation for Large Motion'. All credit for this research goes to the researchers on this project. Check out the paper, project, GitHub link, and reference article.

Asif Razzaq is an AI journalist and co-founder of Marktechpost, LLC. He is a visionary, entrepreneur and engineer who aspires to use the power of artificial intelligence for good.
