Google DeepMind releases Omni, a video generation model that maintains consistent characters across scenes and supports iterative edits with physics simulation and Gemini knowledge
Omni generates reusable avatars from short clips retaining face and voice.
what they feed us in the Gemini microkitchens!
By default Omni has a bit of a professional "look", but can also create "normie" videos with the right prompting.
@venturetwins @GoogleDeepMind @shlomifruchter @nbrichtova Longer videos are coming! ;) thanks for playing with Omni, keep the feedback coming
@GoogleDeepMind That's it for now! I'm sure I'll be back with more generations later 😅 Huge congrats to @shlomifruchter, @doomie, @nbrichtova + the team on the launch. Really excited to see what folks do with "Nano Banana for video."
Stop and eat the flowers
@venturetwins @GoogleDeepMind Can’t wait for more historical day in life content from this
@GoogleDeepMind 1) Avatars Record a clip of yourself, and your face + voice will be saved as a character that you can add to any video. This makes it so easy to put yourself in any clip, while changing your style or outfit as needed to adapt to the scene. These clips are all my avatar 🤯
Amazing work by Google once again 😍
So good @OfficialLoganK 😂

i don’t think you understand how insane omni is
@GoogleDeepMind 1) Avatars
Record a clip of yourself, and your face + voice will be saved as a character that you can add to any video.
This makes it so easy to put yourself in any clip, while changing your style or outfit as needed to adapt to the scene.
These clips are all my avatar 🤯
Omni from @GoogleDeepMind just dropped 👀 It's a big step forward in video generation when it comes to character consistency, world knowledge, and editing. I've been testing it for the last few days - and I'm excited to walk through some of the key features + my clips 👇
2) World knowledge
Omni is grounded in Gemini's world knowledge - which means that it just knows a LOT of things without you having to include it in the prompt.
For example, upload an image of where you're standing and ask for a history or to explain a topic (like healing an ACL tear).
@bilawalsidhu brother... where art thou, me and @mreflow are missing a friend 😂
Nano Banana for video is here! Google has long touted that Gemini is natively multi-modal in & out -- but Omni is the first glimpse into the power of that paradigm applied to creation. Toss in a video and do multi-turn edits. Toss in audio and get reactive visuals. It's kinda like talking to a smart VFX artist who can pull on its world knowledge to inform your edits. Pretty impressive results! Could see Omni do well wrapped in a bigger authoring tool.
6/ Omni is rolling out today to Google AI Plus, Pro, Ultra subscribers globally through Gemini app + Google Flow.
Also coming at no cost to YouTube Shorts + YouTube Create this week.
API access in the coming weeks.
5/ Think Nano Banana, but for video. Omni can edit conversationally:
• change backgrounds
• add cinematic zooms
• modify action
• add characters/objects
• preserve character consistency
• refine across turns without losing scene context
@shlomifruchter yeah it's all about the prompting
Okay I take it back on Omni, editing real videos with it is crazy
By default Omni has a bit of a professional "look", but can also create "normie" videos with the right prompting.
Prompt:
In a kitchen featuring light wood cabinets and a microwave displaying ‘12:39’, a bald man in his 60s wearing a green t-shirt slowly zooms into frame to hold up a green bottle labeled ‘AGI Pills’, pushing it toward the camera while saying, “Hi, do you feel sometimes tired and down? So in this case, I recommend you this product; this is AGI Pills”.
Describe the visuals in high detail, how people look like, what they wear, the background, objects in the background. Describe background noises in detail. Do not make it visually appealing, make it a low quality YouTube video. Take care to add details that make it look like an amateur raw video, with low-fi sound, no depth of field, handheld camera, no zoom in or out, average looking people. Keep it objective and literal, focusing entirely on spatial positioning, posture, and specific actions.
> Redo this video so that there is a new person every 1s (or 24 frames at 24fps). Rapid fire.
Here's an example of 3 edits of a video with Omni:
1. original
2. make her invisible, put gloves on her
3. while she's talking, two men come and take away the framed picture
4. change her outfit
There's a new Release Notes podcast dropping soon, it'll be a really nice deep dive into Omni with the folks that helped make it.
But also, I had far too much fun using Omni to make this preview.