New Google AI Creates Audio From Video & Prompts


Google DeepMind has showcased the latest results of its generative AI video-to-audio research. The system combines what is seen on screen with the user’s written prompt to create synchronized audio.

Called V2A (short for video-to-audio), it can be paired with video-generation models such as Veo to create soundtracks, sound effects, and dialogue that match the on-screen action.

DeepMind also claims it can generate “an unlimited number of soundtracks for any video input,” with positive and negative prompts used to steer the output toward or away from particular sounds.

The system first encodes the video input into a compressed representation. It then iteratively refines the audio from random noise, guided by both the visual input and the user’s text prompt.

The audio output is then decoded and exported as a waveform, which can be recombined with the video input.
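The three stages described above (encode the video, refine audio from noise under guidance, decode to a waveform) can be sketched in a few lines of Python. This is a toy illustration only, not DeepMind's implementation: the function names, the per-frame brightness "encoder", the `prompt_gain` stand-in for a text encoder, and the fixed update rule that replaces a trained denoising network are all assumptions made for the example.

```python
import math
import random


def encode_video(frames):
    """Toy encoder: compress each frame (pixel brightness values in [0, 1]) to its mean."""
    return [sum(frame) / len(frame) for frame in frames]


def prompt_gain(prompt):
    """Hypothetical stand-in for a text encoder: map the prompt to a gain in [0.5, 1.0]."""
    return 0.5 + (sum(ord(c) for c in prompt) % 51) / 100


def refine_audio_latent(video_latent, prompt, steps=50, seed=0):
    """Start from random noise and refine it step by step toward a conditioned target."""
    rng = random.Random(seed)
    gain = prompt_gain(prompt)
    target = [gain * v for v in video_latent]
    latent = [rng.gauss(0.0, 1.0) for _ in video_latent]
    for _ in range(steps):
        # A real model would predict the noise with a trained network;
        # here a fixed fraction of the remaining error is removed each step.
        latent = [x + 0.2 * (t - x) for x, t in zip(latent, target)]
    return latent


def decode_to_waveform(audio_latent, samples_per_frame=100,
                       tone_hz=440.0, sample_rate=8000):
    """Toy decoder: each latent value sets the amplitude of a sine tone for one frame."""
    waveform = []
    for i, amp in enumerate(audio_latent):
        for n in range(samples_per_frame):
            t = (i * samples_per_frame + n) / sample_rate
            waveform.append(amp * math.sin(2 * math.pi * tone_hz * t))
    return waveform


# Three tiny "frames": dark, bright, dim.
frames = [[0.1, 0.2, 0.1], [0.8, 0.9, 1.0], [0.3, 0.2, 0.4]]
prompt = "drum hit, cinematic"
video_latent = encode_video(frames)
audio_latent = refine_audio_latent(video_latent, prompt)
waveform = decode_to_waveform(audio_latent)
print(len(waveform), "audio samples for", len(frames), "frames")
```

Because the sketch produces one audio latent per video frame, the decoded samples line up with the frames by construction, which mirrors in a simplified way why no manual syncing step is needed.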


The user isn’t required to go in and manually sync the audio and video tracks, because the system does it automatically.

The DeepMind team said, “By training on video, audio and the additional annotations, our technology learns to associate specific audio events with various visual scenes while responding to the information provided in the annotations or transcripts.”

The system isn’t flawless yet. First, the quality of the audio output depends on the quality of the video input. Second, the audio can degrade noticeably when the video contains artifacts or distortions.

DeepMind said that syncing generated dialogue with characters’ on-screen lip movements is still a challenge as well.

“V2A attempts to generate speech from the input transcripts and synchronize it with characters’ lip movements. But the paired video-generation model may not be conditioned on transcripts. This creates a mismatch, often resulting in uncanny lip-syncing, as the video model doesn’t generate mouth movements that match the transcript.”

The team also revealed the system still has to undergo “rigorous safety assessments and testing” before it’s released to the public.

Stability AI released a similar product last week, and ElevenLabs released its sound effects tool last month.
