Gemini Omni Flash generates a 10-second video from a London Eye prompt that accurately depicts the capsule interior, River Thames, Big Ben, and gentle wheel sway.
Output closely matches actual footage from the same location.
Gemini Omni is so fun - insanely great at editing videos!
@fofrAI This looks exactly like my video from there!
Gemini Omni Flash:
> a recording from a capsule on the london eye, a jerky zoom into something in the distance and then refocusing (with a bit of back and forth) (no timestamp or dialog)
Note the world knowledge of London’s landscape, and the way the video is gently moving like the capsules do.
@fofrAI yeah it's crazy
Editing videos is where Gemini Omni Flash really shines. It is so incredibly capable.
> Make it New Year's Eve with fireworks. Update the clock
London launched the fireworks early.
@demishassabis bro is building a simulation inside the simulation
Gemini Omni is a major leap in world understanding & multimodal editing! It can take photos, video & audio and build entirely new scenes. Over time it’ll be able to handle any input & any output - starting w/ video. You can even give it your own videos & iterate on your ideas:
> Remove all the buildings