
A Real-time Speech-Driven Facial Animation Method

A technology for speech-driven facial animation, applied in the fields of image processing, speech visualization, speech processing, and facial animation. It addresses problems such as the inability of existing methods to meet real-time requirements and their dependence on speech recognition for facial animation, and achieves an accurate speech-to-visual relationship and accurate parameter conversion.

Publication Date: 2016-12-28 (Inactive)
Owner: UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Most existing speech-driven facial animation methods use a Hidden Markov Model (HMM) to convert speech parameters into visual parameters. This process requires speech recognition technology to obtain the phoneme sequence corresponding to the speech signal, so the synthesized facial animation depends heavily on the speech recognition results, and the approach cannot meet real-time requirements.




Detailed Description of the Embodiments

[0033] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0034] The invention is a method for synthesizing real-time speech-driven facial animation. The main steps are: obtaining speech parameters and their corresponding visual parameters and constructing a training data set; modeling and training the conversion from speech parameters to visual parameters; constructing a set of blendshapes corresponding to the face model; and converting visual parameters into facial animation parameters, as shown in Figure 1.
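The patent does not spell out the blendshape math at this point. As a hedged illustration of the final step, facial animation parameters are commonly applied through a linear blendshape model, where each parameter weights the offset of one blendshape from the neutral face. A minimal Python sketch (all names and shapes here are hypothetical, not taken from the patent):

    import numpy as np

    def blend_face(neutral, blendshapes, weights):
        # Linear blendshape model: animated mesh = neutral mesh plus a
        # weighted sum of per-shape vertex offsets.
        offsets = blendshapes - neutral            # (K, V, 3) offsets per shape
        return neutral + np.tensordot(weights, offsets, axes=1)

    # Toy example: 2 blendshapes (e.g. "jaw open", "lip pucker") on a
    # 4-vertex mesh; the weights are the per-frame animation parameters.
    neutral = np.zeros((4, 3))
    shapes = 0.1 * np.random.rand(2, 4, 3)
    frame_mesh = blend_face(neutral, shapes, np.array([0.7, 0.2]))

Because the model is linear in the weights, driving the face reduces to producing one small weight vector per frame, which is what makes real-time playback feasible.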

[0035] 1. Obtain speech parameters and visual parameters, and construct a training data set

[0036] A performer reads aloud a set of sentences chosen for good phoneme coverage. While reading, the performer keeps the head pose fixed, and audio and video are recorded from directly in front of the performer's face. After recording is complete, the sound ...
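To make step 1 concrete, here is a hedged Python sketch of pairing per-frame acoustic features with tracked visual features. The patent does not name its exact acoustic features or data layout, so the MFCC choice, file names, and frame rate below are assumptions:

    import librosa
    import numpy as np

    # Hypothetical recording of the performer reading phoneme-rich
    # sentences with a fixed head pose, filmed from directly in front.
    audio, sr = librosa.load("performer_session.wav", sr=16000)

    # One acoustic frame per 25 fps video frame so speech and visual
    # parameters align one-to-one (MFCCs are an assumed feature choice).
    hop = sr // 25
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13, hop_length=hop).T

    # Visual parameters per video frame, e.g. tracked lip landmarks
    # (stubbed here; the face tracker itself is outside this sketch).
    visual = np.load("tracked_visual_params.npy")    # (n_frames, n_dims)

    n = min(len(mfcc), len(visual))
    training_set = list(zip(mfcc[:n], visual[:n]))   # (speech, visual) pairs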



Abstract

The invention provides a real-time speech-driven facial animation method characterized in that: (1) based on captured real speech parameters and visual parameters, a combination of a Gaussian mixture model and a Markov model is used to convert speech parameters into visual parameters; (2) the conversion from speech parameters to visual parameters is direct, takes into account the influence of previous visual features on the current visual features, and does not rely on a phoneme sequence from a speech recognition system as a prerequisite; (3) both real-time and non-real-time requirements can be met; (4) highly realistic facial animation can be generated, as can facial animation with a cartoon effect; and (5) facial expression can be controlled. Objective evaluation and subjective interaction tests demonstrate the method's effectiveness for applications such as Internet face-to-face communication, virtual presenters, and computer games.
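The abstract's core conversion step, a GMM combined with a Markov-style dependence on the previous visual frame, can be sketched as a joint-GMM conditional mean. This is a generic reconstruction under stated assumptions, not the patent's exact formulation: the previous visual frame is folded into the GMM input to stand in for the Markov model.

    import numpy as np
    from scipy.stats import multivariate_normal
    from sklearn.mixture import GaussianMixture

    def fit_joint_gmm(speech, visual, n_components=8):
        # Fit a GMM on joint vectors [speech_t, visual_(t-1), visual_t].
        joint = np.hstack([speech[1:], visual[:-1], visual[1:]])
        gmm = GaussianMixture(n_components, covariance_type="full",
                              random_state=0).fit(joint)
        split = speech.shape[1] + visual.shape[1]  # input dims vs. output dims
        return gmm, split

    def convert_frame(gmm, split, x):
        # E[visual_t | x] with x = [speech_t, visual_(t-1)]: weight each
        # component's linear regression by its posterior p(k | x).
        post = np.array([w * multivariate_normal.pdf(x, m[:split], c[:split, :split])
                         for w, m, c in zip(gmm.weights_, gmm.means_,
                                            gmm.covariances_)])
        post /= post.sum()
        y = np.zeros(gmm.means_.shape[1] - split)
        for p, m, c in zip(post, gmm.means_, gmm.covariances_):
            reg = c[split:, :split] @ np.linalg.inv(c[:split, :split])
            y += p * (m[split:] + reg @ (x - m[:split]))
        return y

At synthesis time the previously converted visual frame is fed back as part of x, which gives the recursion its Markov character and lets each visual frame be produced as soon as its speech frame arrives, consistent with the real-time claim.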

Description

Technical Field

[0001] The invention relates to the technical fields of speech processing, image processing, speech visualization, and facial animation, and in particular to a method for synthesizing real-time speech-driven facial animation.

Background

[0002] Facial animation is used more and more in multimodal human-computer interaction, film production, computer games, video conferencing, virtual hosts, and similar applications. Video-driven facial animation methods achieve a good synthesis effect, but they require specialized equipment to capture facial motion in a controlled setting, which is time-consuming, expensive, and unavailable to ordinary users. Text-driven facial animation methods require a speech synthesis system, and current synthesized speech still lacks the prosody and emotion of natural speech. Therefore, using real speech to drive facial animation is one of the cu...


Application Information

Patent Type & Authority Patents(China)
IPC IPC(8): G06T13/40
Inventor 汪增福罗常伟於俊
Owner UNIV OF SCI & TECH OF CHINA