ChatGPT and other text-based AI tools have transformed the way we interact with technology, making it easier than ever to generate ideas, automate tasks, and enhance productivity. But AI is no longer just about text. The evolution of multimodal AI—capable of understanding and generating across different formats—has opened new frontiers.

Not long ago, creating a realistic-looking video required millions of dollars (or crores of rupees). All it takes now are text commands (prompts) on a text-to-video generator to create videos that would make a casual observer (or even a serious one) doubt whether what they are looking at is real. What once required expensive equipment and a team of professionals can now be achieved with a few lines of text. Do not be surprised if, in the next few years, creativity becomes the only skill that remains valuable. For everything else, there will always be an AI editor.
There is a whole host of AI-driven video editors out there now, but the latest and arguably the most advanced are OpenAI’s Sora Turbo and Google DeepMind’s Veo 2. Since their debut in December 2024, there has been intense online discussion about various aspects of both, and about which is better. This article compares the two.
Quality:
The first thing that stands out is that Veo 2 can natively generate videos at 4K resolution, while Sora Turbo tops out at 1080p; to go beyond that, you need to run the output through a separate video enhancer.
Video output:
Veo 2’s videos look more realistic and seem to emphasize accuracy. Google also seems to have given importance to prompt adherence, i.e. how accurately the AI interprets the prompt and executes its instructions.
Sora Turbo allows the creation of videos up to twenty seconds long (in widescreen, vertical or square aspect ratios), while Veo 2 can generate videos up to a few minutes long.
OpenAI has also unveiled a new dashboard and interface that make it easier to prompt Sora with text, images, and videos. Sora also has a storyboard tool with templates that lets users experiment and define and edit a video frame by frame.
According to Google, Veo 2 can understand both simple and complex instructions, has a superior understanding of real-world physics, and offers enhanced realism and fidelity. Veo 2 can even understand cinematic jargon such as depth of field, tracking shot and close-up shot. It is also able to simulate the look of specific lenses.

Google says Veo 2 “hallucinates” less often than competing video generation models, and that this makes its outputs more realistic.
According to Google, human raters preferred Veo 2 over other text-to-video generators such as Meta Movie Gen, Kling v1.5, Minimax and Sora Turbo.
Considering that Alphabet is the parent company of both YouTube and Veo 2, the superior output quality of Veo 2 should not really surprise anyone.
Availability & cost:
Sora is available to anyone who has a ChatGPT Plus account at no additional cost. Plus users are limited to fifty videos at 480p resolution, or fewer videos at 720p, each month. A Pro plan is available for $200/month. Sora is available in India.
Veo 2 is available in preview in select markets through Google Labs’ VideoFX (VideoFX is not currently available in India). New users can sign up for the waitlist on Google Labs.
Target audience:
It looks like Google is targeting filmmakers, enterprises and the like by offering professional-quality videos, while OpenAI appears to be catering to the general public and individual creators.
Safety concerns:
There are serious concerns about misinformation, misattribution, deepfake videos and the like. Hence both Google and OpenAI are being measured in their roll-outs, and Veo 2 and Sora are currently accessible only to some sections of the population. Sora is not available to anyone under eighteen. Both companies have also made the generated outputs traceable to their source. Sora-generated videos come with C2PA metadata and carry visible watermarks by default. Veo 2’s videos are watermarked using SynthID, which helps identify AI-generated videos. Google also has built-in filters to check for privacy, copyright and bias risks.
The Future:
Google has announced plans to expand Veo 2 to YouTube Shorts in 2025, along with improvements in handling complex scenes and motion for greater consistency. With ongoing developments from both OpenAI and Google, the future of AI-driven video editing appears increasingly promising.
As AI expands its creative capabilities, the question is no longer whether AI can assist in content creation—it’s how far it will go. The rise of AI-powered video editing signals a future where the only limit might be human imagination.
Authored by: Varun Krishnan