Training method and device for prosody model used for speech synthesis

A speech synthesis and model training technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses the problem that incorrect prosodic pause prediction makes the synthesized speech neither smooth nor natural, resulting in a poor user experience. The effect is a more complete prosodic model, more fluent and natural prosodic pauses, and higher prediction accuracy.

Active Publication Date: 2015-08-26
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
4 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

Incorrect prediction of prosodic pauses makes the final synthesized sentence neither smooth nor natural, resulting in a poor user experience.

Embodiment Construction

[0020] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0021] The prosodic model training method and device for speech synthesis and the speech synthesis method and device according to the embodiments of the present invention will be described below with reference to the accompanying drawings.

[0022] Figure 1 is a flowchart of a prosodic model training method for speech synthesis according to an embodiment of the present invention.

[0023] As shown in Figure 1, the prosodic model training method for speech synthesis may include:

[0024] S1. Extract text features and tag f...

Abstract

The invention discloses a training method and device for a prosody model used for speech synthesis. The training method comprises the following steps: S1, extracting text features and marker features corresponding to segmented words from a training corpus text; S2, generalizing the segmented words in the training corpus text on the basis of a Chinese thesaurus; S3, training the prosody model according to the text features, the marker features and the generalized segmented words. By training on these features together with the generalized segmented words, the prosody model becomes more complete, which further improves prosody prediction accuracy.
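
The three steps S1 to S3 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it is hypothetical: the toy English thesaurus stands in for the Chinese thesaurus, word length and position stand in for the unspecified text features, a 0/1 pause label stands in for the marker features, and a per-class majority vote stands in for the prosody model, whose actual type the abstract does not state.

```python
from collections import Counter, defaultdict

# Hypothetical toy thesaurus: maps a word to a coarse semantic class.
TOY_THESAURUS = {
    "apple": "FRUIT",
    "pear": "FRUIT",
    "dog": "ANIMAL",
    "cat": "ANIMAL",
}


def extract_features(words):
    """S1 (assumed features): length and sentence position of each word."""
    return [{"length": len(w), "position": i} for i, w in enumerate(words)]


def generalize(words, thesaurus):
    """S2: replace each word with its thesaurus class when it has one."""
    return [thesaurus.get(w, w) for w in words]


def build_training_rows(words, pause_labels, thesaurus):
    """Join S1 features, pause labels, and S2 classes into training rows."""
    if len(words) != len(pause_labels):
        raise ValueError("each word needs exactly one pause label")
    features = extract_features(words)
    classes = generalize(words, thesaurus)
    return [
        dict(f, word_class=c, label=y)
        for f, c, y in zip(features, classes, pause_labels)
    ]


def train_majority_model(rows):
    """S3 stand-in: per word class, keep the most frequent pause label."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["word_class"]][row["label"]] += 1
    return {cls: c.most_common(1)[0][0] for cls, c in counts.items()}


# Usage: label 1 means "pause after this word" (an assumed encoding).
rows = build_training_rows(
    ["the", "dog", "eats", "apple"], [0, 1, 0, 1], TOY_THESAURUS
)
model = train_majority_model(rows)

# "cat" never appeared in training, but it shares the ANIMAL class with "dog".
unseen_class = generalize(["cat"], TOY_THESAURUS)[0]
print(unseen_class, model.get(unseen_class))
```

The point of the generalization step in this sketch is that a word absent from the training corpus can still be matched through its thesaurus class, so sparse words borrow evidence from their synonyms.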

Description

Technical field

[0001] The invention relates to the technical field of text-to-speech conversion, in particular to a prosody model training method and device for speech synthesis.

Background

[0002] Speech synthesis, also known as text-to-speech technology, converts text information into speech and reads it aloud. As science and technology advance, speech synthesis is applied more and more widely, for example in news and information broadcasting and audio novels. In daily life, text messages, emails and other information can also be converted into voice through speech synthesis, giving users an additional way to obtain information.

[0003] In a speech synthesis system, prosody prediction is the basis of the whole system; if the prosodic pause prediction is wrong, it directly affects the quality of the synthesized speech. For example: the synthesized text is "if a passer-by handed it an emp...

Application Information

IPC(8): G10L13/10
Inventor: 徐扬凯, 李秀林
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD