
Mouth action driving model training method and assembly based on ASR acoustic model

A mouth action driving model training method based on an ASR acoustic model, applied in the field of computer technology. It addresses the problems that training directly on audio is costly, involves a large training task, and yields a model that cannot cover various timbres and scenes, and achieves the effect of improving training-data quality while reducing its complexity and training cost.

Pending Publication Date: 2021-07-13
SHENZHEN ZHUIYI TECH CO LTD

AI Technical Summary

Problems solved by technology

Existing deep learning methods train the model directly on audio. Because audio carries noise and varies in timbre from recording to recording, such a model cannot cover all timbres and scenes. Training the model on many timbres and scenes would make the training task large and the cost high.




Embodiment Construction

[0072] The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are only some, rather than all, of the embodiments of the application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.

[0073] At present, deep learning methods train the model directly on audio. Because audio carries noise and varies in timbre from recording to recording, such a model cannot cover all timbres and scenes, and training it on many timbres and scenes would make the training task large and costly. To this end, this application provides a mouth action driving model training scheme based on an ASR acoustic model, which can reduce the complexity and cost of the training data without affecting the generality of the mouth action driving model.
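The scheme above can be sketched in two steps: first convert audio into timbre-invariant phoneme features via the ASR acoustic model, then train the mouth action driving model on those features. The sketch below is a minimal, hypothetical illustration using NumPy; the feature dimensions, the stand-in `asr_phoneme_features` function, and the least-squares regressor are all assumptions, not details from the patent.

```python
# Hypothetical sketch of the training pipeline, assuming the ASR acoustic
# model outputs per-frame phoneme posteriors that mask timbre/noise.
import numpy as np

rng = np.random.default_rng(0)
N_PHONEMES = 40      # size of the phoneme inventory (assumed)
N_MOUTH_PARAMS = 8   # mouth-shape parameters per frame (assumed)

def asr_phoneme_features(audio_frames):
    """Stand-in for the ASR acoustic model: map raw audio frames to
    per-frame phoneme posteriors (a softmax over the phoneme set)."""
    logits = audio_frames @ rng.standard_normal((audio_frames.shape[1], N_PHONEMES))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Simulated data: 500 frames of 64-dim audio features and target mouth params.
audio = rng.standard_normal((500, 64))
mouth_targets = rng.standard_normal((500, N_MOUTH_PARAMS))

# Step 1: convert audio into timbre-invariant phoneme features.
phoneme_feats = asr_phoneme_features(audio)

# Step 2: fit the mouth action driving model on the phoneme features
# (least squares here is a minimal stand-in for the neural model).
W, *_ = np.linalg.lstsq(phoneme_feats, mouth_targets, rcond=None)
pred = phoneme_feats @ W
print("train MSE:", float(np.mean((pred - mouth_targets) ** 2)))
```

Because the phoneme posteriors discard speaker timbre and recording noise, the downstream model sees a much smaller, cleaner input space than raw audio, which is the cost reduction the patent claims.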



Abstract

The invention discloses a mouth action driving model training method and assembly based on an ASR acoustic model. The ASR acoustic model converts complex and varied audio data into phoneme features that mask timbre and noise differences; these phoneme features then serve as the training data from which the mouth action driving model is trained. This improves the quality of the training data and reduces its complexity and training cost, without affecting the generality of the mouth action driving model. Correspondingly, the invention provides a mouth action driving model training assembly based on the ASR acoustic model, which achieves the same technical effects.

Description

Technical Field

[0001] The present application relates to the field of computer technology, and in particular to a mouth action driving model training method and assembly based on an ASR acoustic model.

Background Technique

[0002] In fields such as character image generation and human-like character animation, matching mouth movement to voice is essential for making the characters in the image look real and natural. How to complete the mapping from sound to mouth movement is the key to this problem.

[0003] Existing technologies can be broadly divided into rule-based methods and deep-learning-based methods.

[0004] The rule-based method uses a dictionary-like structure to record the correspondence between phonemes and mouth movements provided by linguists, and completes the mapping from sound to mouth movement by table lookup. This method involves many human factors, and the cost of the expert database...
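The rule-based baseline described in [0004] amounts to a static lookup table from phonemes to mouth shapes (visemes). The sketch below is purely illustrative; the phoneme labels, viseme names, and fallback behavior are assumptions, not the expert table referenced in the patent.

```python
# Hypothetical illustration of the rule-based baseline: a hand-built table
# mapping phonemes to mouth shapes (visemes), applied by table lookup.
# Phoneme and viseme names here are illustrative, not from the patent.
PHONEME_TO_VISEME = {
    "AA": "open",        # as in "father": jaw open
    "IY": "spread",      # as in "see": lips spread
    "UW": "rounded",     # as in "blue": lips rounded
    "M":  "closed",      # bilabial: lips closed
    "B":  "closed",
    "F":  "lip_teeth",   # labiodental: lower lip to upper teeth
    "sil": "neutral",    # silence
}

def phonemes_to_mouth_actions(phonemes):
    """Map a phoneme sequence to mouth actions by table lookup,
    falling back to a neutral pose for phonemes missing from the table."""
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

actions = phonemes_to_mouth_actions(["M", "UW", "IY", "sil", "ZH"])
print(actions)  # ['closed', 'rounded', 'spread', 'neutral', 'neutral']
```

The fallback case shows the weakness the patent attributes to this approach: every phoneme and coarticulation effect must be covered by hand-written expert rules, which is expensive to build and maintain.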


Application Information

IPC(8): G06K9/00, G06K9/46, G06K9/62, G10L13/02, G10L13/04
CPC: G10L13/02, G10L13/04, G06V40/165, G06V40/171, G06V20/40, G06V10/44, G06F18/253, G06F18/214
Inventor: 陈泷翔, 刘炫鹏, 王鑫宇, 刘云峰
Owner SHENZHEN ZHUIYI TECH CO LTD