Voice conversion method of united frequency-spectrum modeling based on restricted Boltzmann machine

A technique combining restricted Boltzmann machines with joint spectrum modeling. It is applicable to speech synthesis, speech analysis, and speech recognition, and addresses problems such as over-smoothing.

Active Publication Date: 2013-11-27

UNIV OF SCI & TECH OF CHINA


Problems solved by technology

[0008] Technical solution of the present invention: to mitigate the over-smoothing problem in existing voice conversion methods, a voice conversion method based on joint spectrum modeling with restricted Boltzmann machines is provided. It improves the accuracy of spectral modeling and the sound quality and naturalness of the converted speech.


Embodiment Construction

[0112] In the present invention, following the idea of joint spectrum modeling based on restricted Boltzmann machines, the specific process for realizing voice conversion is shown in Figure 1. Unlike the GMM-based conversion method, which describes each acoustic subspace with a single Gaussian, the present invention describes each acoustic subspace with a restricted Boltzmann machine (RBM) model. In the RBM training module, RBMs with different structures, such as a Gaussian-Bernoulli RBM or a Gaussian-Gaussian RBM, can be used depending on the specific form of the RBM model.

[0113] A restricted Boltzmann machine (see R. Salakhutdinov, "Learning deep generative models," Ph.D. dissertation, University of Toronto, 2009) is an undirected graphical model with a two-layer structure. It consists of a set of visible random variable nodes $v = [v_1, v_2, \ldots, v_V]^T$ and a set of hidden random variable nodes $h = [h_1, h_2, \ldots, h_H]^T$; V and H are the ...
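To make the two-layer structure concrete, the following is a minimal sketch of a Gaussian-Bernoulli RBM with real-valued visible nodes and binary hidden nodes. The energy parameterization used here (visible values scaled by a per-dimension standard deviation `sigma`) is one common textbook convention and is an assumption; the patent text shown here is truncated before it gives its own formula. All function and parameter names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gb_rbm_energy(v, h, W, a, b, sigma):
    """Energy E(v, h) of a Gaussian-Bernoulli RBM (assumed parameterization).

    v: visible values (length V), h: binary hidden values (length H),
    W: V x H weight matrix as nested lists, a / b: visible / hidden biases,
    sigma: per-dimension standard deviations of the visible nodes.
    """
    quadratic = sum((v[i] - a[i]) ** 2 / (2.0 * sigma[i] ** 2) for i in range(len(v)))
    hidden_bias = sum(b[j] * h[j] for j in range(len(h)))
    interaction = sum(
        (v[i] / sigma[i]) * W[i][j] * h[j]
        for i in range(len(v))
        for j in range(len(h))
    )
    return quadratic - hidden_bias - interaction


def hidden_activation_probs(v, W, b, sigma):
    """p(h_j = 1 | v) for each hidden node j under the energy above."""
    return [
        sigmoid(b[j] + sum(W[i][j] * v[i] / sigma[i] for i in range(len(v))))
        for j in range(len(b))
    ]


def visible_conditional_means(h, W, a, sigma):
    """Mean of the Gaussian p(v_i | h) for each visible node i."""
    return [
        a[i] + sigma[i] * sum(W[i][j] * h[j] for j in range(len(h)))
        for i in range(len(a))
    ]
```

Because there are no connections within a layer, each conditional factorizes over nodes, which is why both helper functions can be written as independent per-node computations.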


Abstract

Disclosed is a voice conversion method of united frequency-spectrum modeling based on a restricted Boltzmann machine. The method comprises the following steps: extracting speech spectral envelope features, extracting high-level speech spectral features, performing dynamic time warping, training a GMM, dividing the joint spectral envelope features into acoustic subspaces, training a Gaussian-Bernoulli RBM or a Gaussian-Gaussian RBM, converting the spectrum, and synthesizing the converted speech. The method improves the precision of spectral modeling and the sound quality and naturalness of the converted speech.
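The dynamic time warping step aligns source and target utterances frame by frame so that joint features can be formed. The following is a minimal sketch of classic DTW, assuming Euclidean frame distance and the basic three-way step pattern; the patent text shown here does not specify its distance measure or path constraints, and the function names are hypothetical.

```python
import math


def frame_distance(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def dtw_align(source, target):
    """Align two sequences of feature frames with dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    matching source[i] to target[j].
    """
    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(source[i - 1], target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path
```

Each pair `(i, j)` on the returned path identifies a source frame and a target frame that could be concatenated into one joint feature vector for the later training steps.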

Description

Technical field

[0001] The invention relates to a voice conversion method in speech synthesis, in particular to a voice conversion method based on joint spectrum modeling with a restricted Boltzmann machine (RBM).

Background art

[0002] The purpose of voice conversion (also known as speaker conversion) is to transform the speech of one speaker (the source speaker) so that it sounds like another speaker (the target speaker) while keeping the semantic content of the speech unchanged. Currently, joint spectrum modeling based on the Gaussian mixture model (GMM) (see Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998) is the mainstream method for voice conversion. The main principle of this method is to use multiple Gaussian distributions to fit the joint probability distribution of the source and target spectral features according to the maximum l...
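As an illustration of how a joint GMM can be used for conversion once it has been fitted, the sketch below maps a scalar source feature to a target feature by taking the posterior-weighted sum of each component's conditional mean. This is the standard conditional-expectation mapping for a joint Gaussian mixture, reduced to one dimension for readability; it is an assumption that it matches the baseline described here, since the passage above is truncated, and the dictionary keys are hypothetical names.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gmm_convert(x, components):
    """Map a scalar source feature x to a target feature with a joint GMM.

    Each component is a dict with the (hypothetical) keys
    weight, mu_x, mu_y, var_x, cov_xy describing a joint Gaussian over (x, y).
    """
    # Unnormalized posterior of each component given the source feature.
    likelihoods = [c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components]
    total = sum(likelihoods)
    converted = 0.0
    for c, lik in zip(components, likelihoods):
        posterior = lik / total
        # Conditional mean of y given x within this component.
        conditional_mean = c["mu_y"] + c["cov_xy"] / c["var_x"] * (x - c["mu_x"])
        converted += posterior * conditional_mean
    return converted
```

The output is an average of per-component means, which is one commonly cited reason that mappings of this kind tend to produce over-smoothed spectra.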


Application Information


IPC(8): G10L13/033, G10L15/06

Inventor: 刘利娟, 陈凌辉, 凌震华, 戴礼荣

Owner: UNIV OF SCI & TECH OF CHINA
