Google has released an enhanced version of Veo 3.1, its AI video generation model, introducing native vertical video support and 4K upscaling capabilities designed for mobile-first content creation. The update enables creators to generate videos in 9:16 aspect ratio directly from reference images without quality loss from cropping.
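To put the cropping point in numbers, the short sketch below works out how much of a landscape frame survives a centre crop to 9:16. It is illustrative arithmetic only, not part of any Google tooling; the function name `center_crop_retained` and the 1920x1080 source frame are assumptions chosen for the example.

```python
from fractions import Fraction


def center_crop_retained(src_w: int, src_h: int, target_w: int, target_h: int) -> Fraction:
    """Return the fraction of source pixels kept by a centre crop to the target aspect ratio."""
    src = Fraction(src_w, src_h)
    target = Fraction(target_w, target_h)
    if target < src:
        # Target is narrower than the source: keep full height, trim the width.
        return target / src
    # Target is wider than (or equal to) the source: keep full width, trim the height.
    return src / target


# Hypothetical example: a 16:9 landscape frame cropped to a 9:16 vertical frame.
retained = center_crop_retained(1920, 1080, 9, 16)
print(f"{float(retained):.1%} of the landscape frame survives a 9:16 crop")
```

Under these assumptions roughly a third of the original pixels remain after cropping, which is the loss that generating natively in 9:16 avoids.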
The “Ingredients to Video” feature now produces more expressive and dynamic video content from reference images, even with short prompts. Improvements include enhanced character consistency across scenes, allowing the same character to appear throughout multiple clips whilst maintaining visual identity as settings change.
Ricky Wong, Lead Product Manager at Google DeepMind, stated the update generates lively, dynamic clips that feel natural and engaging while supporting vertical video generation optimised for platforms including YouTube Shorts, Instagram and TikTok.
The update introduces state-of-the-art upscaling to 1080p and 4K resolution, with the sharper 1080p output designed for editing workflows and 4K targeting high-end productions requiring rich textures and clarity for large screens. Background and object consistency has also improved, enabling creators to reuse settings, objects and textures across multiple scenes.
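For a sense of scale between the two upscaled tiers, the sketch below compares pixel counts. The article does not state Veo's exact output dimensions, so the frame sizes here are an assumption: the conventional 1080p and 4K UHD dimensions, rotated for vertical video.

```python
# Assumed frame sizes (width, height): conventional 1080p and 4K UHD, rotated to 9:16.
# These are not confirmed Veo output dimensions.
VERTICAL_FRAMES = {
    "1080p": (1080, 1920),
    "4K": (2160, 3840),
}


def pixel_count(label: str) -> int:
    """Return the number of pixels per frame for the given resolution label."""
    width, height = VERTICAL_FRAMES[label]
    return width * height


ratio = pixel_count("4K") / pixel_count("1080p")
print(f"1080p: {pixel_count('1080p'):,} pixels per frame")
print(f"4K:    {pixel_count('4K'):,} pixels per frame ({ratio:.0f}x the 1080p count)")
```

On these assumed dimensions a 4K frame carries four times the pixels of a 1080p frame, which is consistent with positioning 1080p for editing workflows and 4K for large-screen output.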
Google has integrated the updates across multiple platforms. Consumer creators can access Veo 3.1 through YouTube Shorts, the YouTube Create app and the Gemini app. Professional users can use the Flow video editor, the Gemini API, Vertex AI and Google Vids, with 1080p and 4K resolution options available in Flow, the Gemini API and Vertex AI.
All videos generated by Veo include SynthID, Google’s imperceptible digital watermark embedded in content to indicate AI generation. The Gemini app now includes video verification capabilities, allowing users to upload videos and determine whether they were created using Google AI tools.
Creators have generated more than 275 million videos in Flow since the tool launched in May 2025. The update builds on Veo 3.1’s October 2025 release, which introduced improved audio output and granular editing controls to the video generation system.