Speech synthesis method and device, electronic equipment and readable storage medium

A speech synthesis technology, applied to speech synthesis methods and devices, electronic equipment and readable storage media. It addresses problems such as poor synthesis quality on long texts by reducing the number of elements involved in computation while preserving the processing effect.

Pending Publication Date: 2022-07-19
BEIJING SINOVOICE TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The present invention provides a speech synthesis method, device, electronic equipment and readable storage medium, so as to solve the problem of poor synthesis quality when synthesizing long texts.

Examples

Detailed description of embodiments

[0025] Figure 1 is a flow chart of the steps of a speech synthesis method provided by an embodiment of the present invention. The method includes:

[0026] Step 101: Acquire phoneme features of the text to be synthesized.

[0027] The text to be synthesized is the text that is to be converted into speech. Its phoneme features are obtained by applying a series of processing steps to the text. These steps may be carried out by a front end and an encoder preprocessing structure (Encoder pre-net), which is not limited in this embodiment of the present invention. Specifically, front-end processing may convert the text to be synthesized into a phoneme sequence, which then enters the preprocessing structure of the encoder and is preprocessed to obtain the phoneme features of the text. The preprocessing operation may be to shape the input phoneme sequence, and the phoneme feature m...
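The flow in Step 101 (text, then phoneme sequence, then phoneme features) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy lexicon, the `front_end` and `encoder_pre_net` functions, and the use of a fixed random embedding table as the pre-net are hypothetical and are not taken from the patent, which leaves the preprocessing unspecified.

```python
import random

# Hypothetical toy lexicon. A real front end would perform text
# normalization and grapheme-to-phoneme conversion instead.
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def front_end(text):
    """Convert text into a phoneme sequence using the toy lexicon."""
    phonemes = []
    for word in text.lower().split():
        # Words missing from the lexicon map to a single unknown symbol.
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes


def build_embedding_table(symbols, dim, seed=0):
    """Assign each phoneme symbol a fixed pseudo-random vector of size `dim`."""
    rng = random.Random(seed)
    return {s: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for s in symbols}


def encoder_pre_net(phonemes, table):
    """Assumed pre-net: look up one feature vector per phoneme."""
    return [table[p] for p in phonemes]


symbols = sorted({p for ps in TOY_LEXICON.values() for p in ps} | {"<unk>"})
table = build_embedding_table(symbols, dim=4)

phonemes = front_end("hello world")
features = encoder_pre_net(phonemes, table)
print(phonemes)
print(len(features), len(features[0]))
```

The result is one feature vector per phoneme, which is the shape of input a downstream acoustic model would consume. In a trained system the table would be learned rather than random.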

Abstract

The embodiment of the invention provides a speech synthesis method and device, electronic equipment and a readable storage medium. The method comprises: obtaining phoneme features of a text to be synthesized; taking the phoneme features as input to a trained acoustic model, and generating acoustic information corresponding to the phoneme features based on a processing layer in the acoustic model; and acquiring speech corresponding to the text to be synthesized based on the acoustic information. The processing layer comprises at least one pair of a first sampling module and a second sampling module: the first sampling module reduces the feature dimension, and the second sampling module recovers it. The trained acoustic model is obtained by training on sample texts whose length is not smaller than a preset length threshold. The first and second sampling modules thus first reduce the feature dimension of the phoneme sequence participating in calculation and then recover it after processing, so that, to a certain extent, speech can be synthesized directly from a long text without forced sentence segmentation.
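The paired sampling modules described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: it interprets "reducing the feature dimension" as shortening the sequence, uses average pooling for the first module and frame repetition for the second, and stands in a trivial `process` function for whatever computation the real processing layer performs. All function names and the `factor` parameter are hypothetical.

```python
def downsample(features, factor=2):
    """Assumed first sampling module: average-pool groups of `factor` frames.

    A final partial group is averaged over the frames it contains.
    """
    pooled = []
    for start in range(0, len(features), factor):
        group = features[start:start + factor]
        dim = len(group[0])
        pooled.append([sum(f[d] for f in group) / len(group) for d in range(dim)])
    return pooled


def upsample(features, target_len, factor=2):
    """Assumed second sampling module: repeat frames, then trim to `target_len`."""
    restored = []
    for frame in features:
        restored.extend([list(frame) for _ in range(factor)])
    return restored[:target_len]


def process(features):
    """Placeholder for the layer's real computation (here: scale by 2)."""
    return [[2.0 * v for v in frame] for frame in features]


def processing_layer(features, factor=2):
    """Reduce the sequence, process the shorter sequence, then restore length."""
    reduced = downsample(features, factor)
    processed = process(reduced)
    return upsample(processed, len(features), factor)


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
y = processing_layer(x)
print(len(downsample(x)), len(y))
```

The intermediate computation runs on roughly `len(x) / factor` frames instead of `len(x)`, which is the sense in which fewer elements participate in calculation, while the output keeps the input length.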

Description

Technical field

[0001] The present invention belongs to the technical field of speech, and in particular relates to a speech synthesis method, apparatus, electronic device and readable storage medium.

Background technique

[0002] With the continuous development of computer technology, human-computer voice interaction technology has also made great progress, and in the field of human-computer voice interaction, there are two key technologies: speech recognition and speech synthesis. Speech synthesis converts input text information into corresponding speech. In the application scenarios of speech synthesis, long text synthesis is often encountered, such as in news scenarios.

[0003] In the prior art, speech synthesis is generally performed after the text is forcibly segmented. This synthesis method results in a poor final synthesis effect.

Summary of the invention

[0004] The present invention provides a speech synthesis method, device, electronic device and readab...

Application Information

IPC(8): G10L13/02, G10L13/08, G10L13/04
CPC: G10L13/02, G10L13/08, G10L13/04
Inventor: 李婉, 李健, 武卫东, 陈明
Owner: BEIJING SINOVOICE TECH CO LTD