
Cross-language voice conversion method and system based on disentanglement and explanatory representation

A cross-language speech conversion technology, applied in speech analysis, speech recognition, and speech synthesis, that improves the similarity of the converted voice to the target speaker's personality and improves the accuracy, versatility, and practicability of the method.

Active Publication Date: 2020-10-16
NANJING UNIV OF POSTS & TELECOMM
5 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

[0006] Purpose of the invention: To overcome the deficiencies of the prior art, the present invention provides a cross-lingual speech conversion method based on disentanglement and explanatory representation, which solves the problem that existing speech conversion technology can only convert within a single language. The present invention also provides a cross-lingual speech conversion system based on disentanglement and explanatory representation.



Examples


Embodiment Construction

[0064] The following will clearly and completely describe the technical solutions in the embodiments of the present invention in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0065] The present invention proposes a cross-lingual speech conversion method based on disentanglement and explanatory representation, comprising a training phase and a conversion phase. The training phase obtains the conversion network and the parameters required for voice conversion, while the conversion phase converts the source speaker's voice into the voice of the target speaker.
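The idea of separating content from speaker identity and then recombining them can be illustrated with a toy sketch. It is not the network described in the patent: the three functions below are hypothetical stand-ins in which per-utterance feature statistics play the role of "speaker" information and the normalised frames play the role of "content". All names and the sample values are assumptions made for illustration.

```python
from statistics import mean, pstdev

EPS = 1e-8  # guards against division by zero for constant features


def speaker_stats(frames):
    """Toy 'speaker encoder': per-dimension mean and std over time."""
    dims = list(zip(*frames))  # transpose: one tuple per feature dimension
    return [mean(d) for d in dims], [pstdev(d) for d in dims]


def content_code(frames):
    """Toy 'content encoder': normalise away per-utterance statistics."""
    mu, sigma = speaker_stats(frames)
    return [[(x - m) / (s + EPS) for x, m, s in zip(f, mu, sigma)]
            for f in frames]


def decode(content, stats):
    """Toy 'decoder': re-apply a speaker's statistics to a content code."""
    mu, sigma = stats
    return [[c * s + m for c, m, s in zip(f, mu, sigma)] for f in content]


def convert(source_frames, target_frames):
    """Conversion phase: source content combined with target statistics."""
    return decode(content_code(source_frames), speaker_stats(target_frames))


# Made-up 2-dimensional feature frames (time x feature), for illustration only.
source = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
target = [[5.0, 0.0], [9.0, 4.0], [7.0, 2.0], [7.0, 2.0]]
converted = convert(source, target)
```

In this toy version nothing is learned, so there is no training phase; in the method described above, the encoders and decoder would be networks whose parameters are obtained during training. The sketch only shows the data flow of the conversion phase: the output keeps the number of frames and the normalised shape of the source while taking on the statistics of the target.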

[0066] As shown in Figure 1 ...



Abstract

The invention discloses a cross-language voice conversion method and system based on disentanglement and explanatory representation. The method comprises a training stage and a conversion stage. The training stage comprises the following steps: obtaining a training corpus composed of utterances from a plurality of speakers of two languages, the speakers including a source speaker and a target speaker; extracting Mel spectrum features from the training corpus to obtain acoustic feature vectors; and inputting the acoustic feature vectors into a conversion network for training, wherein the conversion network comprises a content encoder, a speaker encoder, and a decoder. Through disentanglement and explanatory representation, the content information in a speaker's utterance is decoupled from the speaker's individual information, and the source speaker's content information is then recombined with the target speaker's personality information to reconstruct the speech, achieving high-quality cross-language voice conversion. Voices of speakers who are not in the training set can also be converted, which addresses the difficulty of obtaining training corpora for the target speaker and broadens the application range of the method.
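As a small aid to the Mel spectrum step mentioned in the abstract, the sketch below shows one widely used Hz-to-mel mapping and how band edges for a mel filterbank can be spaced evenly on that scale. The abstract does not state which mel formula, number of bands, or frequency range is used, so the formula variant and the values `n_mels=80`, `f_min=0.0`, and `f_max=8000.0` are assumptions chosen only for illustration.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mels (one common convention)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_mels, f_min, f_max):
    """Return n_mels + 2 edge frequencies (Hz), equally spaced in mels.

    Triangular filter k would span edges[k] .. edges[k + 2].
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_mels + 2)]


# Assumed settings, not taken from the patent.
edges = mel_band_edges(n_mels=80, f_min=0.0, f_max=8000.0)
```

The edges are dense at low frequencies and sparse at high frequencies, which is the property that makes mel features roughly follow perceived pitch. A full Mel spectrum extraction would additionally apply a short-time Fourier transform and the filterbank built from these edges.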

Description

Technical field

[0001] The invention relates to the technical field of speech conversion, in particular to a method and system for cross-lingual speech conversion based on disentanglement and explanatory representation.

Background art

[0002] Speech conversion is an important research branch in the field of speech signal processing, developed and extended on the basis of speech synthesis and speaker recognition. The task of voice conversion is to change the speech personality of the source speaker so that it has the personality of the target speaker, while keeping the semantic information of the source speaker unchanged. In short, after the source speaker's voice is transformed, it retains the original semantics and sounds like the target speaker's voice.

[0003] After years of research on speech conversion technology, many classic conversion methods have emerged. According to the classification of training corpus, speech conversion can be divided into con...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06, G10L15/07, G10L15/08, G10L15/02, G10L13/02, G10L25/18
CPC: G10L15/063, G10L15/07, G10L15/08, G10L15/02, G10L13/02, G10L25/18
Inventor: 李燕萍, 徐玲俐
Owner: NANJING UNIV OF POSTS & TELECOMM