Google has announced a new AI video model called Lumiere.
Many existing AI video models struggle with consistency of movement: even when they capture a natural gait, other elements become choppy or blend into the landscape.
Lumiere takes a different approach to video generation. Instead of combining individual frames, it creates the entire video in one process, simultaneously handling the placement of objects and their movements.
While the preview clips look impressive, this is just a research project, so you can't try it yourself. However, the underlying technology and approach to AI video could be integrated into future Google products, making the company a major player in this field.
Lumiere works extensively with text-to-video and image-to-video, offering stylized generation from reference images to fine-tune exactly how elements in a video look. Some of this has already been achieved in models from Runway and Pika Labs.
The AI model is built on a spatio-temporal architecture, which sounds like something out of a science fiction movie, but in reality means that it considers all aspects of motion and position
In the generation process, the model considers the "spatial" aspect of the clip (where objects are placed) and the "temporal" component (when and how they move). To create consistent motion, both are handled simultaneously in a single pass.
The researchers write in their preprint paper on this model: "Our model learns to directly generate low-resolution video at full frame rate by processing at multiple spatiotemporal scales."
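To get an intuition for what "multiple spatiotemporal scales" means, here is a toy NumPy sketch (not Google's actual model): a video tensor is pooled jointly over time and space, so each coarser level sees motion and layout together rather than frame by frame. The tensor shapes and the `downsample` helper are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Toy video: (time, height, width) — 16 frames of 8x8 "pixels".
video = np.arange(16 * 8 * 8, dtype=float).reshape(16, 8, 8)

def downsample(v, t=2, s=2):
    """Average-pool a video jointly over time (factor t) and space
    (factor s), so motion and layout are coarsened in one step."""
    T, H, W = v.shape
    return v.reshape(T // t, t, H // s, s, W // s, s).mean(axis=(1, 3, 5))

# Build a multi-scale pyramid: each level halves both the spatial
# resolution and the frame count at the same time.
pyramid = [video]
while pyramid[-1].shape[0] >= 2 and pyramid[-1].shape[1] >= 2:
    pyramid.append(downsample(pyramid[-1]))

for level in pyramid:
    print(level.shape)  # (16, 8, 8) → (8, 4, 4) → (4, 2, 2) → (2, 1, 1)
```

The point of the joint pooling is that a model operating on the coarser levels reasons about where things are and how they move in one representation, rather than treating each frame independently.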
When generative AI video first emerged, the main focus was on creating short clips, but as the technology matured, other features appeared. Runway, for example, offers the ability to highlight different regions of an image and animate them independently.
The Google research team states that Lumiere achieves "state-of-the-art text-to-video generation results" and "facilitates a wide range of content creation tasks and video editing applications".
The team also noted that Lumiere's text-to-video generation is "a very powerful tool". Not only can smoother motion be expected, the team says, but it can also animate selected areas of an image with relative ease and provide inpainting capabilities, such as changing the style of clothing or the type of animal in a frame.
Many research projects from companies such as Google, Microsoft, and Meta never make it past the preview stage. However, their underlying technology often finds its way into branded products.
This is not even Google's first AI video tool: there is a video version of the Imagen model that powers Google Cloud's AI image generation, and VideoPoet, a large language model for zero-shot video generation.
VideoPoet also generates audio from video clips without requiring text as a guide. According to Google, the VideoPoet model can generate videos of arbitrary length with strong object identity by repeatedly generating one-second extensions. It, too, is not currently publicly available.
Whether Lumiere will ever be seen in the real world depends on how useful researchers find it and whether Google decides the project is worth productizing. Like Imagen, it may end up largely reserved for third-party developers via Google Cloud.