Non-periodic component syllable model building and speech synthesizing method and device

A non-periodic component and model building technology, applied in speech synthesis, speech analysis, instruments, etc., can solve the problems of poor spectral coherence of aperiodic components, large amount of data, and low quality of synthesized audio.

Status: Inactive. Publication date: 2015-01-14

CHINA MOBILE COMM GRP CO LTD


Problems solved by technology

[0017] The embodiments of the present invention provide a method and device for building an aperiodic component syllable model and for speech synthesis. They address two problems in the prior art: the HMM speech model holds a large amount of data, and the aperiodic component spectrum of the synthesized speech has poor coherence, which lowers the sound quality of the synthesized speech.

Embodiment 1

[0069] Figure 1 is a schematic flow chart of a method for building an aperiodic component syllable model in Embodiment 1 of the present invention. The method includes:

[0070] Step 101: Obtain the original speech waveform files from the speech database.

[0071] Specifically, in step 101, the speech database includes a large number of original speech waveform files and the annotation files corresponding to them, for example files in WAV format and the corresponding file identifiers (i.e., labels).

[0072] The annotation files and the original speech waveform files are in one-to-one correspondence; that is, each original speech waveform file corresponds to a unique annotation file.
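The one-to-one correspondence described in [0072] can be checked before any model building starts. The following is a minimal sketch, not the patent's procedure: it assumes that a waveform file and its annotation file share a base name, and the `.wav`/`.lab` extensions and the function name `pair_waveforms_with_labels` are hypothetical choices made for illustration.

```python
from pathlib import Path


def pair_waveforms_with_labels(wav_names, label_names):
    """Pair each waveform file with its annotation file by base name.

    Assumption (not from the source): matching files share a base name,
    e.g. "utt001.wav" and "utt001.lab".
    Raises ValueError if the correspondence is not one-to-one.
    """
    wavs = {Path(name).stem: name for name in wav_names}
    labels = {Path(name).stem: name for name in label_names}

    # Duplicate base names would make the mapping ambiguous.
    if len(wavs) != len(wav_names) or len(labels) != len(label_names):
        raise ValueError("duplicate base names found")

    # Every waveform needs exactly one annotation file, and vice versa.
    unmatched = sorted(set(wavs) ^ set(labels))
    if unmatched:
        raise ValueError(f"files without a counterpart: {unmatched}")

    return {stem: (wavs[stem], labels[stem]) for stem in sorted(wavs)}
```

Calling the function with the file listings of the speech database returns a dictionary keyed by base name, and fails early if a waveform has no annotation file.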

[0073] Before preparing to build the aperiodic component syllable model, a large number of original speech waveform files are obtained from the speech database, and after analysis and processing, the required language parameter model, th...

Embodiment 2

[0123] Figure 2 is a schematic flow chart of a speech synthesis method based on an aperiodic component syllable model in Embodiment 2 of the present invention. Embodiment 2 is implemented on the basis of Embodiment 1. The method includes:

[0124] Step 201: Use a text analysis device to convert the acquired text information to be speech-synthesized into an original speech waveform file, and obtain an annotation file of the original speech waveform file according to the converted original speech waveform file.

[0125] Specifically, in step 201, after the text information to be synthesized into speech is acquired, a text analysis device is used to convert it into an original speech waveform file, and the annotation file of the original speech waveform file is obtained according to the converted original speech waveform file.

[0126] Step 202: According to the correspondi...

Embodiment 3

[0136] Figure 3 is a schematic structural diagram of an aperiodic component syllable model building device in Embodiment 3 of the present invention. Embodiment 3 belongs to the same inventive concept as Embodiment 1 and Embodiment 2. The device includes an aperiodic component representative value determination module 11, an aperiodic component spectrum fitting curve generation module 12, and an aperiodic component syllable model building module 13, wherein:

[0137] The aperiodic component representative value determination module 11 is used to decompose the original speech waveform file in the speech database and obtain the aperiodic component spectrum information, fundamental frequency information, and vocal tract spectrum information of each syllable in the original speech waveform file; and, according to the preset at least one frequency band information divided for each frame of the syllable and t...
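The excerpt is cut off before it says how module 11 derives a representative value for each frequency band. As an illustration only, the sketch below takes the arithmetic mean of the aperiodic component values inside each band of one frame. The mean is an assumption standing in for the undisclosed rule, and the function name and the bin-index representation of bands are hypothetical.

```python
def band_representative_values(frame_spectrum, band_edges):
    """Reduce one frame's aperiodic component spectrum to one value per band.

    frame_spectrum: aperiodic component values, one per frequency bin.
    band_edges: list of (start_bin, end_bin) pairs, end exclusive.
    Assumption (not from the source): the representative value of a band
    is the arithmetic mean of its bins.
    """
    representatives = []
    for start, end in band_edges:
        if not 0 <= start < end <= len(frame_spectrum):
            raise ValueError(f"invalid band ({start}, {end})")
        segment = frame_spectrum[start:end]
        representatives.append(sum(segment) / len(segment))
    return representatives
```

Applying the function to every frame of a syllable gives one trajectory of representative values per band.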

Abstract

The invention discloses a method and device for building a non-periodic component syllable model and for speech synthesis. In the method, each frame of each syllable in an original speech waveform file is divided into frequency bands, and a non-periodic component representative value is determined on each band. From these representative values, a discrete cosine transform is used to obtain, for each syllable, a non-periodic component spectrum fitting curve on each selected frequency band. A non-periodic component syllable model is then generated that contains the fitting curves of all syllables of the original speech waveform file on the different frequency bands. The data in the syllable model is thereby reduced from (number of frequency bands × number of syllable frames) values to one fitting curve per frequency band, which shrinks the speech model and saves system resources. Because a fitting curve is built for each syllable, the continuity between the frames of a syllable is fully considered, the fitting curves preserve the original sound quality of the syllable, and the quality of the synthesized speech improves.
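The curve fitting summarized in the abstract can be sketched as follows. This is one plausible reading, not the patented implementation: each band's per-frame representative values are treated as a trajectory, a DCT-II is applied, and only the first few coefficients are kept. The truncation order `order` and the function names are assumptions; the source names the discrete cosine transform but not these details.

```python
import math


def fit_curve(track, order):
    """Return the first `order` DCT-II coefficients of one band's trajectory.

    track[t] is the representative value of the band at frame t.
    The coefficients are scaled so that the curve can be written as
    x(u) = sum_k a_k * cos(pi * k * u), with u = (t + 0.5) / n in (0, 1).
    """
    n = len(track)
    coeffs = []
    for k in range(order):
        weight = 1.0 if k == 0 else 2.0
        total = sum(v * math.cos(math.pi * k * (t + 0.5) / n)
                    for t, v in enumerate(track))
        coeffs.append(weight * total / n)
    return coeffs


def evaluate_curve(coeffs, n_frames):
    """Sample the fitted curve at `n_frames` evenly spaced frame centres."""
    return [sum(c * math.cos(math.pi * k * (t + 0.5) / n_frames)
                for k, c in enumerate(coeffs))
            for t in range(n_frames)]


def fit_syllable(band_tracks, order):
    """Fit one curve per band; band_tracks[b][t] is band b at frame t."""
    return [fit_curve(track, order) for track in band_tracks]
```

With `order` equal to the number of frames the curve reproduces the trajectory exactly; a smaller `order` stores fewer numbers per band and yields a smoother curve across frames. Because the curve is defined over normalized time, `evaluate_curve` can also sample it at a different number of frames than the original syllable had.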

Description

Technical field

[0001] The invention relates to the technical field of speech processing, in particular to a method and device for building a non-periodic component syllable model and for speech synthesis.

Background

[0002] Speech synthesis technology generates artificial speech by mechanical and electronic means. For example, TTS (Text To Speech) technology converts text information into speech information and plays the converted speech through a playback device.

[0003] The premise of speech synthesis is the analysis of speech information, for example speech parametric analysis. Speech analysis methods include direct waveform analysis and speech parametric analysis. At present, the more common of the two is speech parametric analysis, which refers to a method for analyzing the extracted voice parameters, w...

Application Information

IPC(8): G10L13/02, G10L13/04

Inventor: 王朝民, 刘琨, 焦伟

Owner: CHINA MOBILE COMM GRP CO LTD
