Microsoft Research Asia has published a white paper on a generative AI application it has in development. The program, called VASA-1, can create highly realistic videos from just a single image of a person’s face and a speech audio track. What’s even more impressive is that the software can generate those videos in real time.
Visual Affective Skills Animator (VASA) is a machine learning framework that takes a photo of a face, animates it into a talking head, and synchronizes lip and mouth movements to the speech. It also simulates facial expressions, head movements, and even subtle non-speech motion.
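To make that pipeline concrete, here is a minimal, purely hypothetical sketch in Python. VASA-1 has no public API, so every name below (Frame, animate, the stubbed encoders) is invented for illustration; in a real system, learned models would replace the toy arithmetic.

```python
# Purely hypothetical sketch of a VASA-style inference loop. VASA-1 has no
# public API, so Frame, animate, and the stubbed "encoders" below are all
# invented names; real models would replace the toy arithmetic.
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass
class Frame:
    """One generated video frame plus the head pose used to render it."""
    pixels: bytes                          # placeholder for an HxWx3 image buffer
    head_pose: Tuple[float, float, float]  # yaw, pitch, roll (degrees)


def animate(face_image: bytes, audio_chunks: Iterator[bytes]) -> Iterator[Frame]:
    """Turn one face photo plus a speech stream into lip-synced frames.

    A real system would (1) encode the photo into an identity/appearance
    latent once, (2) map each audio window to facial dynamics (lip shape,
    expression, head motion), and (3) decode a frame. Every step here is a
    stub so the control flow is runnable.
    """
    identity = hash(face_image) % 256  # stand-in for an appearance encoder
    yaw = 0.0
    for chunk in audio_chunks:
        loudness = sum(chunk) / max(len(chunk), 1)    # crude audio feature
        yaw = 0.9 * yaw + 0.1 * (loudness % 5 - 2.5)  # smoothed head motion
        yield Frame(pixels=bytes([identity]), head_pose=(yaw, 0.0, 0.0))


# Usage: feed fake 20 ms audio windows and consume frames as they stream out.
chunks = iter([b"\x10\x40" * 160 for _ in range(3)])
for frame in animate(b"<photo bytes>", chunks):
    print(frame.head_pose)
```

The shape of the loop is the point: the reference photo is encoded once, then each incoming audio window drives a small update to the avatar’s motion, which is what makes frame-by-frame generation in real time plausible.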
Like all generative AI, it’s not perfect. Machines still struggle with small details like fingers and, in VASA’s case, teeth. If you pay close attention to an avatar’s teeth, you’ll notice they change in size and shape, giving them accordion-like properties. The effect is relatively subtle and appears to fluctuate with the amount of movement in the animation.
There are also some mannerisms that don’t seem quite right. They’re difficult to put into words; it’s more that your brain registers something slightly off about the speaker. It’s only noticeable if you look closely, though. To a casual observer, the face could pass for a recording of a real person speaking.
The faces used in the researchers’ demos were themselves AI-generated, using StyleGAN2 or DALL-E 3, but the system works with any image, real or generated. You can even animate painted or drawn faces. Watching the Mona Lisa perform Anne Hathaway’s rendition of “Paparazzi” from Conan O’Brien is hilarious.
All kidding aside, there are legitimate concerns that malicious actors could use this technology to spread propaganda or to trick people by impersonating their family members. Given how many social media users post family photos on their accounts, it would be easy for someone to scrape those images and imitate a family member. Combine that with voice-cloning technology and the result becomes even more convincing.
Microsoft’s research team acknowledges the potential for misuse, but doesn’t offer much in the way of countermeasures beyond careful video analysis. It points to the artifacts mentioned above while glossing over the fact that ongoing research will keep improving the system. The team’s only concrete step to prevent abuse is to keep the technology out of public hands.
The researchers say they have “no plans to release an online demo, API, product, additional implementation details, or any related offerings until we are certain that the technology will be used responsibly and in accordance with proper regulations.”
However, the technology does have some intriguing practical applications. One is using VASA to create realistic video avatars that render locally and in real time, eliminating the need for bandwidth-hungry video feeds. Apple already does something similar with the spatial Personas available on the Vision Pro.
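The bandwidth argument is easy to sanity-check with rough numbers. The bitrates below are assumptions for illustration, not measurements of any real product: roughly 1.5 Mbps for a 720p video call stream, about 32 kbps for Opus-compressed speech, and a one-time reference photo of around 200 kB.

```python
# Back-of-the-envelope comparison: streaming a video feed vs. sending
# audio only and rendering the avatar locally. All bitrates are rough
# assumptions for illustration, not measurements of any real product.

CALL_SECONDS = 30 * 60    # a 30-minute call
VIDEO_BPS = 1_500_000     # assumed 720p video-call bitrate, bits/s
AUDIO_BPS = 32_000        # assumed Opus speech bitrate, bits/s
PHOTO_BYTES = 200_000     # one-time upload of the reference face photo

video_total = VIDEO_BPS * CALL_SECONDS / 8                 # bytes
avatar_total = AUDIO_BPS * CALL_SECONDS / 8 + PHOTO_BYTES  # bytes

print(f"video feed:   {video_total / 1e6:7.1f} MB")
print(f"local avatar: {avatar_total / 1e6:7.1f} MB")
print(f"savings:      {1 - avatar_total / video_total:.0%}")
```

Under those assumptions, a 30-minute call drops from roughly 340 MB of streamed video to under 8 MB of audio, a savings of about 98 percent, which is the appeal of rendering the avatar on the receiving device.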
Check out the technical details in the paper posted to the arXiv repository. There are also more demos on Microsoft’s website.