Microsoft has announced VASA-1, a new lip-syncing AI tool that turns a static image of a person’s face into an animated clip of that person speaking or singing.
Not only can VASA-1 produce lip movements that are “exquisitely synchronized” with the audio, it can also capture a “wide range” of facial nuances and natural head movements that contribute to the perception of authenticity and liveliness.
Microsoft has developed a “holistic facial dynamics” and head movement generation model that works in a latent space of the face. The company says that, overall, the approach significantly outperforms previous methods.
VASA is currently just a research demonstration, and Microsoft has no plans to release a product or an API that others can use. Essentially, Microsoft just wants to show off its lip-syncing model.
According to the company, VASA accepts optional controls such as where the character should look, how closely the subject’s head is cropped, and the emotion conveyed while speaking (neutral, happy, angry, surprised, and so on).
Microsoft demonstrated VASA with AI-generated images of people made using DALL-E 3 or StyleGAN2, but real photos can also be used. For example, the president of the United States could be made to appear to say things they never said, which raises ethical questions about deepfakes and misinformation.
“Our research focuses on generating visual emotional skills for virtual AI avatars with the aim of positive applications,” Microsoft says on its VASA-1 research page.
“It is not intended to produce content that will be used to mislead or deceive. However, like other related content generation technologies, it could still be misused to impersonate humans.
“We oppose any activity that creates misleading or harmful content based on real people, and we are interested in applying our technology to advance forgery detection.
“Currently, videos produced in this way still contain discernible artifacts, and numerical analysis shows that there is still a gap to achieving the authenticity of real videos.”
This is true, and the examples Microsoft posted still have an uncanny valley feel about them. However, not everyone is so media savvy, and some people may well believe that a VASA-1 video is real.