Incremental-type speech online synthesis method based on statistic parameter model

A statistical-parameter speech synthesis technology, applied in the field of speech. It addresses the inability of conventional synthesis to meet the real-time requirements of online applications and the degradation of synthesized speech quality under segmentation, with the effects of controlling the loss of dynamic information, improving real-time performance, and ensuring quality.

Status: Inactive
Publication date: 2012-07-18
AISPEECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] However, in traditional HMM-based speech synthesis, the acoustic parameters used in model training must incorporate dynamic correlation information between the parameters of preceding and following frames. An HMM system therefore generally models the entire paragraph or sentence to be synthesized, so in practical applications the whole sentence must usually be generated before playback or transmission can begin. If the text to be synthesized is segmented arbitrarily and only a short stretch of speech is generated each time, the quality of the synthesized speech drops sharply. As a result, traditional HMM-based speech synthesis cannot be used in online applications with strict real-time requirements.
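
To make the frame coupling concrete, here is a minimal sketch (not taken from the patent) of a first-order dynamic feature computed over a toy one-dimensional parameter track. The window 0.5 * (c[t+1] - c[t-1]) is a common choice in HMM-based synthesis but is an assumption here, as are the function name `delta` and the sample values. Cutting the track at an arbitrary frame changes the dynamic features next to the cut, which is the kind of information loss described above.

```python
from typing import List


def delta(static: List[float]) -> List[float]:
    """First-order dynamic feature 0.5 * (c[t+1] - c[t-1]).

    At the edges the missing neighbour is replaced by the edge frame itself
    (an illustrative convention, not specified by the source).
    """
    n = len(static)
    out = []
    for t in range(n):
        prev = static[max(t - 1, 0)]
        nxt = static[min(t + 1, n - 1)]
        out.append(0.5 * (nxt - prev))
    return out


# Hypothetical static parameter track, one value per frame.
track = [1.0, 2.0, 4.0, 7.0, 11.0]

# Dynamic features computed over the whole utterance.
whole = delta(track)

# Same track cut arbitrarily after frame 2, each piece processed on its own.
pieces = delta(track[:3]) + delta(track[3:])

# Frames whose dynamic feature changed because of the cut.
changed = [t for t, (a, b) in enumerate(zip(whole, pieces)) if a != b]

print(whole)    # [0.5, 1.5, 2.5, 3.5, 2.0]
print(pieces)   # [0.5, 1.5, 1.0, 2.0, 2.0]
print(changed)  # [2, 3]
```

Only the frames adjacent to the cut differ in this toy case, which suggests why the choice of cut position matters more than the number of cuts.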
[0004] Few technical solutions to this problem have been studied, domestically or abroad. The main approach forcibly bundles several phoneme models into a group and synthesizes speech segment by segment [T. Dutoit, A Streaming Architecture for Statistical Parametric Speech Synthesis, 2011]. The number of bundled models is set manually, which is inflexible and strongly affects synthesis quality. No relevant patents, domestic or foreign, address this issue.


Examples


Embodiment Construction

[0027] The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that those skilled in the art can more easily understand its advantages and features, and so that the scope of protection of the present invention is defined more clearly.

[0028] Refer to Figures 1, 2, and 3. Figure 1 is the workflow chart of the incremental online speech synthesis system based on a statistical parameter model of the present invention; Figure 2 is the workflow chart of the model sequence segmentation method of the present invention; Figure 3 is the workflow chart of the parameter generation, speech synthesis, and audio playback/transmission pipeline of the present invention.
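
The three-stage pipeline named for Figure 3 (parameter generation, speech synthesis, audio playback/transmission) can be sketched roughly as follows. This is an illustrative arrangement under stated assumptions, not the patent's implementation: the helper names `stage` and `run_pipeline`, the use of threads with FIFO queues, and the placeholder stage functions are all hypothetical. The point is only that a later segment can be in parameter generation while an earlier one is already being played.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the segment stream


def stage(work, inbox, outbox):
    """Apply `work` to each item from `inbox` and forward the result."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate end-of-stream downstream
            return
        outbox.put(work(item))


def run_pipeline(segments, generate, synthesize, play):
    """Push segments through generate -> synthesize -> play, in text order."""
    q_gen, q_syn, q_play = queue.Queue(), queue.Queue(), queue.Queue()
    played = []

    def sink():
        # Final stage: play (or transmit) each waveform as it arrives.
        while True:
            item = q_play.get()
            if item is _DONE:
                return
            played.append(play(item))

    threads = [
        threading.Thread(target=stage, args=(generate, q_gen, q_syn)),
        threading.Thread(target=stage, args=(synthesize, q_syn, q_play)),
        threading.Thread(target=sink),
    ]
    for t in threads:
        t.start()
    for seg in segments:  # feed segments in text order
        q_gen.put(seg)
    q_gen.put(_DONE)
    for t in threads:
        t.join()
    return played


# Placeholder stages standing in for real parameter generation and vocoding.
segments = [["s1", "s2"], ["s3"], ["s4", "s5", "s6"]]
out = run_pipeline(
    segments,
    generate=lambda seg: [f"params({s})" for s in seg],
    synthesize=lambda params: f"wave[{len(params)} states]",
    play=lambda wave: wave,
)
print(out)  # ['wave[2 states]', 'wave[1 states]', 'wave[3 states]']
```

Because each stage runs in a single thread and the queues are first-in, first-out, the output order matches the text order of the segments.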

[0029] An incremental online speech synthesis method based on a statistical parameter model, which includes: text analysis to obtain the entire model sequence parameters corresponding...


Abstract

The invention discloses an incremental online speech synthesis method based on a statistical parameter model. The method comprises the following steps: analyzing the text to obtain the model state-level parameter sequence corresponding to the text to be synthesized that the user has input; segmenting the state sequence by searching for the optimal segmentation positions in the acoustic model state sequence and splitting the state-level parameter sequence at those positions; and performing parameter generation, speech synthesis, and audio playback/transmission on the segmented state-level parameter sequences in text order and in a pipelined manner, outputting continuous synthesized speech online. The method shortens the delay between synthesizing a passage of text and playing or transmitting it, allows the synthesis speed to be adjusted flexibly according to actual requirements, limits as far as possible the loss of dynamic information caused by segmentation, and ensures the quality of the synthesized speech.
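
The segmentation step can be illustrated with a small sketch. The text available here does not state how the optimal position is scored, so everything specific below is an assumption: each boundary between adjacent states is given a hypothetical cost (standing in for the dynamic information lost by cutting there), and a greedy search picks the cheapest boundary that keeps each segment between `min_len` and `max_len` states. The names `choose_cuts` and `split` are likewise hypothetical.

```python
from typing import List, Sequence


def choose_cuts(boundary_cost: Sequence[float], min_len: int, max_len: int) -> List[int]:
    """Greedily pick cut positions in a state sequence.

    boundary_cost[i] is the (hypothetical) cost of cutting between state i
    and state i + 1. A cut position p means the next segment starts at state p.
    The final segment may be shorter than min_len in this simple sketch.
    """
    n = len(boundary_cost) + 1  # number of states
    cuts, start = [], 0
    while n - start > max_len:
        # Candidate positions keep the current segment within [min_len, max_len].
        candidates = range(start + min_len, start + max_len + 1)
        best = min(candidates, key=lambda p: boundary_cost[p - 1])
        cuts.append(best)
        start = best
    return cuts


def split(states: Sequence, cuts: List[int]) -> List[list]:
    """Split the state sequence at the chosen cut positions."""
    bounds = [0, *cuts, len(states)]
    return [list(states[a:b]) for a, b in zip(bounds, bounds[1:])]


# Toy input: 8 states and 7 made-up boundary costs.
states = list("abcdefgh")
boundary_cost = [0.9, 0.2, 0.8, 0.7, 0.1, 0.6, 0.5]

cuts = choose_cuts(boundary_cost, min_len=2, max_len=4)
segs = split(states, cuts)
print(cuts)  # [2, 5]
print(segs)  # [['a', 'b'], ['c', 'd', 'e'], ['f', 'g', 'h']]
```

Raising `max_len` in this sketch trades latency for fewer cuts, which is one way the flexibility mentioned in the abstract might be exposed.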

Description

Technical field

[0001] The invention relates to the technical field of speech, in particular to an incremental online speech synthesis method based on a statistical parameter model.

Background

[0002] Speech synthesis based on statistical parameter models is one of the current mainstream speech synthesis technologies. In this approach [A. Black, Statistical parametric speech synthesis, 2007], the speech signal is first analyzed parametrically, generally yielding the fundamental frequency (pitch) and aperiodic component parameters that characterize the excitation, and the spectral parameters that characterize the spectral properties of the vocal tract filter. The analyzed parameters are then modeled statistically, generally with a hidden Markov model (HMM). During synthesis, the trained model is used to predict the relevant acoustic parameters, and finally the speech signal...


Application Information

Patent type & authority: Applications (China)
IPC(8): G10L15/14, G10L15/26, G10L13/08, G10L13/027
Inventors: 俞凯, 王欢良, 钱诗君
Owner: AISPEECH CO LTD