Veo 2 is a new AI video creation tool to rival OpenAI's Sora

17 December 2024

On Monday, Google's DeepMind division unveiled its second-generation Veo video generation model, which can generate clips up to two minutes long at resolutions up to 4K. That is six times the length and four times the pixel count of Sora's 20-second, 1080p clips.

Of course, those are Veo 2's theoretical maximums. The model is currently available only on VideoFX, Google's experimental video creation platform, where clips are limited to eight seconds and 720p resolution. VideoFX is also waitlisted, so not everyone can try Veo 2 yet, though the company has indicated that access will be expanded in the coming weeks.

Veo 2 is said to have a number of advantages over its predecessor, including a better understanding of physics (think improved fluid dynamics and lighting and shadow effects) and the ability to generate "clearer" video clips, meaning that generated textures and images are sharper and less likely to blur in motion. The updated model also offers better camera controls, allowing the user to position the virtual camera more precisely than before.

Veo 2 has not perfected video generation, but it appears to hallucinate significantly less than competitors such as Sora, Kling, Movie Gen, and Gen 3 Alpha. “Coherence and consistency are areas for growth,” Collins said. “Veo can consistently adhere to a prompt for a couple minutes, but [it can’t] adhere to complex prompts over long horizons. Similarly, character consistency can be a challenge. There’s also room to improve in generating intricate details, fast and complex motions, and continuing to push the boundaries of realism.”

Google also announced changes to Imagen 3 on Monday that allow the commercial image generation model to produce "brighter, better-composed" results. The model, which is available on ImageFX, will also suggest additional descriptive terms based on keywords in the user's prompt, with each keyword opening a drop-down list of related terms.