Voice synthesis method and device

A speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc., that addresses problems such as the difficulty of learning higher-level fundamental frequency trends and the resulting lack of prosody and expressiveness in synthesized speech.

Active Publication Date: 2016-04-27
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
8 Cites, 26 Cited by

AI Technical Summary

Problems solved by technology

[0003] In traditional speech synthesis systems, fundamental frequency modeling uses the multi-space probability distribution hidden Markov model (MSD-HMM) method. This method models the fundamental frequency contour (or trend) well at the state level and the phone level, but it has difficulty learning higher-level fundamental frequency trends, such as those of words, phrases, or sentences. As a result, the prosody and expressiveness of the synthesized speech are insufficient.



Examples


Embodiment Construction

[0025] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals denote the same or similar modules or modules having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.

[0026] Figure 1 is a schematic flowchart of a speech synthesis method proposed by an embodiment of the present invention. This embodiment takes the synthesis process as an example. Referring to Figure 1, the method includes:

[0027] S11: Perform text feature extraction on the text to be synthesized to obtain contextual feature information.

[0028] The process o...


Abstract

The invention discloses a speech synthesis method and device. The method comprises: performing text feature extraction on the text to be synthesized to obtain contextual feature information; obtaining a pre-generated model, which has been trained on the contextual feature information of training samples and on transformed acoustic parameters, where the transformed acoustic parameters include fundamental frequency parameters at multiple prosodic levels; determining, according to the model, the model output parameters corresponding to the contextual feature information, where the model output parameters include fundamental frequency parameters at multiple prosodic levels; performing fundamental frequency reconstruction on the multiple prosodic-level fundamental frequency parameters; and synthesizing speech from the reconstructed fundamental frequency parameters and the other parameters in the model output parameters. The method can improve the expressiveness of the synthesized speech.
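
The abstract does not state how the prosodic-level fundamental frequency parameters are derived or how the reconstruction is carried out. The following is a minimal sketch of the general idea only, assuming a simple multi-scale moving-average decomposition as a stand-in (it is not taken from the patent). The function names `decompose_f0` and `reconstruct_f0`, the `widths` parameter, and the sample values are all hypothetical. The contour is assumed to be a per-frame log-F0 sequence that has already been made continuous across unvoiced frames.

```python
from typing import List


def moving_average(x: List[float], width: int) -> List[float]:
    """Centered moving average; the window shrinks at the sequence edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def decompose_f0(f0: List[float], widths: List[int]) -> List[List[float]]:
    """Split a contour into len(widths) + 1 components, finest detail first.

    Assumption for illustration: each wider window stands in for a higher
    prosodic level (for example phone, word, phrase). The last component is
    the smoothest remaining trend.
    """
    components = []
    current = list(f0)
    for w in widths:
        smooth = moving_average(current, w)
        # Detail removed by this smoothing step is one level's component.
        components.append([c - s for c, s in zip(current, smooth)])
        current = smooth
    components.append(current)
    return components


def reconstruct_f0(components: List[List[float]]) -> List[float]:
    """Rebuild the contour by summing all level components frame by frame."""
    return [sum(frame_values) for frame_values in zip(*components)]


# Hypothetical log-F0 values for ten frames.
f0 = [5.0, 5.1, 5.3, 5.2, 5.0, 4.9, 5.1, 5.4, 5.3, 5.0]
levels = decompose_f0(f0, widths=[3, 5, 9])
rebuilt = reconstruct_f0(levels)
```

In this sketch the components are computed directly from a known contour, so summing them recovers the input up to floating-point error. In the method described above, the per-level parameters would instead come from the trained model's output for the given contextual features, and the reconstruction step would combine those predicted values.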

Description

Technical field

[0001] The invention relates to the technical field of speech synthesis, in particular to a speech synthesis method and device.

Background

[0002] People are no longer satisfied with synthesized speech that is merely clear and intelligible; they also require it to sound natural and expressive. In natural speech, fundamental frequency is the main factor affecting naturalness and expressiveness, so the accuracy of fundamental frequency modeling directly affects the naturalness and expressiveness of synthesized speech.

[0003] In traditional speech synthesis systems, fundamental frequency modeling uses the multi-space probability distribution hidden Markov model (MSD-HMM) method, which models the fundamental frequency contour well at the state level and the phone level. However, it is difficult to learn higher-level fundamental frequency trends such as those of words, phrases or sentences, which make...


Application Information

IPC(8): G10L13/02, G10L13/033, G10L13/047, G10L13/10
Inventor: 盖于涛, 康永国, 张少飞
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD