Google's Gemini Omni model generates first-person video of a taxi following a route from a marked Google Maps screenshot with consistent viewpoint and visual details
The model synthesized a dynamic driving simulation from static map imagery.
Can't believe we're getting this before GTA 6
I uploaded a screenshot of Google Maps to Gemini Omni with a route drawn on it. Then I prompted it to create a first person view of someone driving a taxi cab along the route in the reference image. Pretty close to the real thing.
Relatedly, please someone at Google work on a game based on this. If nobody does, it might be Google's second biggest missed opportunity of the decade.
World Models ftw :)
remarkable result
this is cool
ok, I'm trying this tomorrow
YouTube + Maps is one helluva data moat for Google. Case in point:
I uploaded a screenshot of Google Maps to Gemini Omni with a route drawn on it. Then I prompted it to create a first person view of someone driving a taxi cab along the route in the reference image. Pretty close to the real thing.