High-quality voice conversion method

A high-quality voice conversion technique, applied in the field of voice conversion systems. It addresses the limited room for improving voice conversion quality and achieves high-quality voice conversion.

Active Publication Date: 2017-08-29

NANJING UNIV OF POSTS & TELECOMM

4 Cites, 7 Cited by

Problems solved by technology

Daniel Erro combined GMM with frequency warping (FW), and the resulting converted speech struck a good balance between speaker similarity and sound quality. However, Erro's approach uses a GMM with a fixed number of mixture components to perform soft classification of the speech feature parameters. This limits how far the conversion quality can be improved, because it ignores the fact that the statistical distribution of speech feature parameters differs from speaker to speaker, and the appropriate number of GMM mixture components is closely tied to that distribution.
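To make "soft classification" concrete, the following is a minimal sketch, assuming a one-dimensional feature and a hypothetical two-component mixture. All parameter values and function names are illustrative and are not taken from the patent. It computes the posterior probability of each component for one feature value; with a fixed number of components, the length of this vector is chosen in advance regardless of how the speaker's features are actually distributed.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def soft_classify(x, weights, means, variances):
    """Posterior probability of each mixture component given the feature x."""
    joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)  # may underflow to 0 for x far from every component
    return [j / total for j in joint]


# Hypothetical mixture with a fixed number of components (two here).
weights = [0.5, 0.5]
means = [-1.0, 1.0]
variances = [1.0, 1.0]

# x = 0.0 lies midway between the two means, so neither component dominates.
posteriors = soft_classify(0.0, weights, means, variances)
```

Each frame is shared among all components in proportion to these posteriors rather than being assigned to a single class, which is what distinguishes soft from hard classification.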

Embodiment Construction

[0029] The present invention will be further described in detail below in conjunction with the accompanying drawings.

[0030] The high-quality speech conversion method of the present invention is divided into two parts: the training part is used to obtain the parameters and conversion functions required for speech conversion, and the conversion part is used to realize the conversion of the source speaker's voice into the target speaker's voice.

[0031] 1) As shown in Figure 1, the training part is implemented in the following steps:

[0032] 1-1) Obtain a parallel corpus of the source speaker's and the target speaker's speech; the open-source CMU ARCTIC corpus from Carnegie Mellon University can be used for this purpose;

[0033] 1-2) The present invention uses the AHOcoder speech analysis model to extract the Mel-frequency cepstral coefficients (MFCC) and the logarithmic pitch-frequency parameter logf of the source speaker and th...
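Step 1-1 relies on a parallel corpus, meaning both speakers read the same sentences. A minimal sketch of pairing recordings by utterance ID follows. The directory layout, the file names, and the helper `pair_parallel_utterances` are assumptions made for illustration; the patent does not describe how the files are organized.

```python
from pathlib import PurePosixPath


def pair_parallel_utterances(source_files, target_files):
    """Pair files from two speakers that share an utterance ID (the file stem)."""
    target_by_id = {PurePosixPath(p).stem: p for p in target_files}
    pairs = []
    for src in sorted(source_files):
        utt_id = PurePosixPath(src).stem
        if utt_id in target_by_id:
            pairs.append((utt_id, src, target_by_id[utt_id]))
    return pairs


# Hypothetical file lists; only utterances recorded by both speakers are kept.
source = ["src_speaker/wav/arctic_a0001.wav", "src_speaker/wav/arctic_a0002.wav"]
target = ["tgt_speaker/wav/arctic_a0002.wav", "tgt_speaker/wav/arctic_a0003.wav"]
pairs = pair_parallel_utterances(source, target)
```

Pairing at the utterance level is only the first step. Frame-level alignment of the extracted features is a separate problem that this sketch does not address.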

Abstract

The invention discloses a high-quality voice conversion method. The system first trains and classifies the speaker's speech feature parameters (MFCC) by replacing the K-means algorithm of the traditional GMM with a self-organizing clustering algorithm that is iterated in a loop with the EM algorithm. Bilinear frequency warping and amplitude scaling are then trained to obtain the conversion function required for voice conversion, and this function is used to perform high-quality voice conversion. The method exploits the relationship between the spatial distribution of the speech feature parameters and the Gaussian mixture model: the iterative self-organizing clustering algorithm determines the number of mixture components, which addresses the low accuracy of the Gaussian mixture model when classifying speech feature parameters. The improved Gaussian mixture model is further combined with bilinear frequency warping and amplitude scaling to build a high-quality voice conversion system. The method has high practical value in the field of voice conversion.
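The idea of letting the data decide the number of mixture components can be illustrated with a simplified split-and-merge clustering loop. This is a sketch under stated assumptions, not the patent's algorithm: it works on one-dimensional toy data, the thresholds `split_std` and `merge_dist` are hypothetical, and the EM refinement that would follow is omitted.

```python
import statistics


def assign(points, centers):
    """Group points by nearest center; clusters that end up empty are dropped."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
        clusters[nearest].append(p)
    return [c for c in clusters if c]


def self_organizing_clusters(points, split_std=1.0, merge_dist=1.0, max_iter=20):
    """Return cluster centers; their count is determined by the data."""
    centers = [statistics.fmean(points)]  # start from a single cluster
    for _ in range(max_iter):
        new_centers = []
        for cluster in assign(points, centers):
            mean = statistics.fmean(cluster)
            std = statistics.pstdev(cluster)
            if std > split_std and len(cluster) >= 4:
                # Split a widely spread cluster into two centers.
                new_centers += [mean - std, mean + std]
            else:
                new_centers.append(mean)
        # Merge neighbouring centers that lie too close together.
        new_centers.sort()
        merged = [new_centers[0]]
        for c in new_centers[1:]:
            if c - merged[-1] < merge_dist:
                merged[-1] = (merged[-1] + c) / 2.0
            else:
                merged.append(c)
        if merged == centers:  # no change since the last pass: converged
            break
        centers = merged
    return centers


# Toy data with two well-separated groups around 0 and 10.
points = [0.0, 0.2, -0.2, 0.1, -0.1, 10.0, 10.2, 9.8, 10.1, 9.9]
centers = self_organizing_clusters(points)
num_components = len(centers)
```

On this toy data the loop settles on two centers without being told the number in advance. In a GMM setting, the resulting count and centers could serve as the initialization that K-means would otherwise provide with a fixed K.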

Description

Technical field

[0001] The invention relates to the field of voice conversion, in particular to a high-quality voice conversion system and its implementation method.

Background

[0002] Voice conversion refers to changing the voice characteristics of a source speaker so that they match those of a target speaker, that is, making speech spoken by one person sound, after conversion, as if it were spoken by another person while preserving the semantic content. Two indicators are usually used to measure the effect of voice conversion: similarity (how closely the converted speech matches the voice characteristics of the target speaker) and clarity (the sound quality of the converted speech).

[0003] Typical voice conversion methods include: a statistical mapping method represented by the Gaussian mixture model (GMM), which uses the minimum mean squared error (MMSE) crite...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L25/24, G10L25/18, G10L25/27, G10L25/48, G10L15/06, G10L15/14, G10L13/02

CPC: G10L13/02, G10L15/063, G10L25/18, G10L25/24, G10L25/27, G10L25/48

Inventors: 李燕萍, 崔立梅, 吕中良

Owner: NANJING UNIV OF POSTS & TELECOMM
