High-quality voice conversion method

A high-quality voice conversion technology, applied in the field of voice conversion systems. It addresses the limited room for improving the voice conversion effect and achieves high-quality voice conversion.

Active Publication Date: 2017-08-29
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Daniel Erro combined GMM and frequency warping (FW), and the converted voice achieved a good balance between voice similarity and sound quality. However, Erro used a GMM with a fixed mixing degree (number of mixture components) to perform soft classification of the voice feature parameters during training. This limits the room for improving the voice conversion effect, because it does not take into account the statistical distribution of different speakers' voice feature parameters, even though the GMM mixing degree is closely related to that distribution.
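To make "soft classification with a fixed mixing degree" concrete, here is a minimal sketch of how a GMM assigns each feature frame a posterior probability for every component. It assumes diagonal covariances and uses hand-picked toy parameters; the function name `gmm_posteriors` and all values are illustrative and do not come from the patent.

```python
import numpy as np

def gmm_posteriors(x, weights, means, variances):
    """Posterior probability of each diagonal-covariance Gaussian for each frame.

    x: (T, D) feature frames; weights: (K,); means, variances: (K, D).
    Returns a (T, K) array whose rows sum to 1.
    """
    x = np.asarray(x, dtype=float)
    diff = x[:, None, :] - means[None, :, :]             # (T, K, D)
    # Log density of every frame under every component.
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_gauss               # (T, K)
    log_joint -= log_joint.max(axis=1, keepdims=True)     # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

# Toy example (assumed values): K is fixed at 2 no matter how the data are distributed.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [4.0, 4.0]])
variances = np.ones((2, 2))
frames = np.array([[0.1, -0.2], [2.0, 2.0], [3.9, 4.2]])
posteriors = gmm_posteriors(frames, weights, means, variances)
```

Each frame belongs to every component with some probability rather than to exactly one cluster. The number of components K is chosen in advance here, which is the restriction the text above describes.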

Examples

Embodiment Construction

[0029] The present invention will be further described in detail below in conjunction with the accompanying drawings.

[0030] The high-quality voice conversion method of the present invention has two parts: the training part obtains the parameters and conversion functions required for voice conversion, and the conversion part converts the source speaker's voice into the target speaker's voice.

[0031] 1) As shown in Figure 1, the implementation steps of the training part are:

[0032] 1-1) Obtain a parallel corpus of the source speaker's and the target speaker's speech. The open-source CMU ARCTIC corpus from Carnegie Mellon University can be used for this purpose;

[0033] 1-2) The present invention uses the AHOcoder speech analysis model to extract the Mel cepstral coefficients (MFCC, Mel-Frequency Cepstral Coefficient) and the logarithmic pitch frequency parameter logf of the source speaker and th...

Abstract

The invention discloses a high-quality voice conversion method. The system first trains and classifies the speaker feature parameters (MFCC) by replacing the K-Means algorithm of the traditional GMM with a self-organizing clustering algorithm that is iterated together with the EM algorithm. Bilinear frequency warping and amplitude scaling are then trained to obtain the conversion function required for voice conversion, and that conversion function is used to perform high-quality voice conversion. The method exploits the relationship between the spatial distribution of the voice feature parameters and the Gaussian mixture model: the iterative self-organizing clustering algorithm determines the mixing degree, which addresses the low accuracy of Gaussian-mixture classification of voice feature parameters. The improved Gaussian mixture model is further combined with bilinear frequency warping and amplitude scaling to build a high-quality voice conversion system. The method has high practical value in the field of voice conversion.
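The abstract does not spell out the clustering procedure, so the following is only a simplified sketch of the general idea: a split/merge clustering lets the data decide how many clusters there are, and that count can serve as the mixing degree. The function names, the thresholds (`split_std`, `merge_dist`, `min_size`) and the toy data are all assumptions for illustration, not the patented algorithm.

```python
import numpy as np

def nearest_center(x, centers):
    """Index of the closest center (Euclidean distance) for every frame."""
    dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def merge_close(centers, merge_dist):
    """Repeatedly replace any pair of centers closer than merge_dist by its midpoint."""
    centers = list(centers)
    merged = True
    while merged:
        merged = False
        for i in range(len(centers)):
            for j in range(i + 1, len(centers)):
                if np.linalg.norm(centers[i] - centers[j]) < merge_dist:
                    centers[i] = 0.5 * (centers[i] + centers[j])
                    del centers[j]
                    merged = True
                    break
            if merged:
                break
    return np.array(centers)

def self_organizing_clusters(x, k_init=1, split_std=1.0, merge_dist=1.0,
                             min_size=5, max_iter=50):
    """Simplified split/merge clustering; returns the final cluster centers.

    Keep merge_dist below 2 * split_std so a freshly split pair is not merged back.
    """
    x = np.asarray(x, dtype=float)
    centers = x[:k_init].copy()            # deterministic start from the first frames
    for _ in range(max_iter):
        labels = nearest_center(x, centers)
        groups = [x[labels == k] for k in range(len(centers))]
        kept = [g for g in groups if len(g) >= min_size]   # discard tiny clusters
        if not kept:
            kept = [max(groups, key=len)]
        new_centers = []
        for g in kept:
            mean, std = g.mean(axis=0), g.std(axis=0)
            d = int(std.argmax())
            if std[d] > split_std and len(g) >= 2 * min_size:
                # Split a widely spread cluster along its most variable dimension.
                offset = np.zeros_like(mean)
                offset[d] = std[d]
                new_centers += [mean + offset, mean - offset]
            else:
                new_centers.append(mean)
        new_centers = merge_close(new_centers, merge_dist)
        if new_centers.shape == centers.shape and np.allclose(new_centers, centers):
            break                           # neither the count nor the positions changed
        centers = new_centers
    return centers

# Toy data (assumed): two well-separated groups of 2-D "feature frames".
rng = np.random.default_rng(0)
frames = np.vstack([rng.normal(0.0, 0.3, size=(100, 2)),
                    rng.normal(5.0, 0.3, size=(100, 2))])
centers = self_organizing_clusters(frames)
mixing_degree = len(centers)
```

Starting from a single cluster, the sketch splits until no cluster is spread more widely than `split_std`, so the toy data should settle on two centers. The resulting count and centers could then seed EM training of the GMM, for example through the `n_components` and `means_init` arguments of scikit-learn's `GaussianMixture`.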

Description

Technical field

[0001] The invention relates to the field of voice conversion, and in particular to a high-quality voice conversion system and a method for implementing it.

Background

[0002] Voice conversion changes the individual voice characteristics of a source speaker so that they match those of a target speaker: after conversion, speech uttered by one person sounds as if it were spoken by another, while the semantic content is preserved. Two indicators are usually used to measure the effect of voice conversion: similarity (how closely the converted voice matches the target speaker's voice characteristics) and clarity (the sound quality of the converted voice).

[0003] Typical voice conversion methods include a statistical mapping method represented by the Gaussian Mixture Model (GMM), which uses the minimum mean squared error (MMSE) crite...
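For orientation, the GMM-based statistical mapping is commonly written in the following form in the voice conversion literature. The notation is an assumption introduced here, not taken from the patent text: $x$ is a source feature vector, $M$ the mixing degree, $\alpha_i$ the component weights, and $\mu_i$, $\Sigma_i$ the component means and (cross-)covariances of the source ($x$) and target ($y$) features.

```latex
F(x) = \sum_{i=1}^{M} p_i(x)\left[\mu_i^{y} + \Sigma_i^{yx}\left(\Sigma_i^{xx}\right)^{-1}\left(x - \mu_i^{x}\right)\right],
\qquad
p_i(x) = \frac{\alpha_i\,\mathcal{N}\!\left(x;\,\mu_i^{x},\,\Sigma_i^{xx}\right)}
              {\sum_{j=1}^{M}\alpha_j\,\mathcal{N}\!\left(x;\,\mu_j^{x},\,\Sigma_j^{xx}\right)}
```

Here $p_i(x)$ is the posterior probability that $x$ belongs to component $i$, so the converted vector is a posterior-weighted sum of per-component linear regressions.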

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/24, G10L25/18, G10L25/27, G10L25/48, G10L15/06, G10L15/14, G10L13/02
CPC: G10L13/02, G10L15/063, G10L25/18, G10L25/24, G10L25/27, G10L25/48
Inventors: 李燕萍, 崔立梅, 吕中良
Owner: NANJING UNIV OF POSTS & TELECOMM