DeepMind, a Google AI research laboratory based in the UK, has shared progress on its video-to-audio (V2A) technology, which makes synchronized audiovisual generation possible.
DeepMind’s V2A pairs a video with a text description of its soundtrack to generate an unlimited array of music, sound effects and dialogue that matches the characters and tone of the footage.
Users are also given finer control over V2A’s audio output: a ‘positive prompt’ guides the generated output toward desired sounds, while a ‘negative prompt’ steers it away from undesired ones.
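As a rough illustration, many diffusion systems implement this kind of steering with classifier-free guidance, extrapolating the model’s prediction away from the negative prompt and toward the positive one. DeepMind has not published V2A’s exact mechanism, so the sketch below is an assumption; the `model` object, its `predict_noise` method, and the guidance formula are illustrative only.

```python
# Hypothetical sketch only: DeepMind has not published how V2A combines
# positive and negative prompts. This shows one common diffusion-model
# approach (classifier-free-guidance-style extrapolation).

def guided_noise_estimate(model, noisy_audio, video_embedding,
                          positive_prompt, negative_prompt,
                          guidance_scale=3.0):
    """Steer generation toward the positive prompt and away from the negative one."""
    eps_pos = model.predict_noise(noisy_audio, video_embedding, positive_prompt)
    eps_neg = model.predict_noise(noisy_audio, video_embedding, negative_prompt)
    # Extrapolate from the undesired prediction toward the desired one.
    return eps_neg + guidance_scale * (eps_pos - eps_neg)
```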
However, DeepMind does not plan to release V2A until it has undergone rigorous safety assessments and testing. The lab says it is committed to developing and deploying AI technologies responsibly and is focused on ensuring that V2A has a positive impact on the creative community. It is also gathering feedback from a range of creatives to inform its ongoing research and development.
How Does DeepMind’s V2A Work?
DeepMind’s V2A system first encodes the video input into a compressed representation. A diffusion model then iteratively refines audio from random noise, guided by the visual input and natural-language prompts, toward the desired sound. Finally, the audio output is decoded into a waveform and combined with the video data.
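A minimal sketch of that pipeline is shown below, assuming hypothetical `encoder`, `diffusion_model`, and `audio_decoder` components; DeepMind has not released V2A code, so every name and method here is illustrative rather than the actual implementation.

```python
# Illustrative pipeline only; class and method names are assumptions.

def generate_audio_for_video(video_frames, prompt, encoder,
                             diffusion_model, audio_decoder, num_steps=50):
    """Encode video, iteratively denoise a random latent conditioned on the
    video and the text prompt, then decode the result into a waveform."""
    # 1. Encode the video input into a compressed representation.
    video_embedding = encoder.encode(video_frames)

    # 2. Start from random noise and let the diffusion model refine it,
    #    step by step, guided by the visuals and the prompt.
    audio_latent = diffusion_model.sample_noise()
    for step in reversed(range(num_steps)):
        audio_latent = diffusion_model.denoise_step(
            audio_latent, step, video_embedding, prompt)

    # 3. Decode the refined latent into an audio waveform; the caller can
    #    then combine it with the original video data.
    return audio_decoder.decode(audio_latent)
```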
To produce higher-quality audio and steer the model toward more accurate, specific sounds, extra information is added to the training process: V2A is trained on video, audio, and additional annotations, so the technology learns to associate specific audio events with particular visual scenes.
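As a hedged illustration of how such annotations could enter training, the sketch below conditions a denoising loss on both the video and its annotation text, so the model learns to tie audible events to the scenes they appear in. The training loop, field names, and methods are assumptions; DeepMind has not published its training code.

```python
# Illustrative training step; all objects and method names are assumptions.

def training_step(model, encoder, example, optimizer):
    """One gradient step on a (video, audio, annotations) training example."""
    video_embedding = encoder.encode(example["video_frames"])
    target_audio_latent = encoder.encode_audio(example["audio_waveform"])

    # Conditioning on the annotation text nudges the model toward the
    # specific sounds that occur in this visual scene.
    loss = model.denoising_loss(
        target_audio_latent, video_embedding, example["annotations"])

    loss.backward()        # assumes a PyTorch-style autograd interface
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```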