
Controllable emotion speech synthesis method and system based on emotion category labels

The technology combines emotion category labels with speech synthesis and is applied in speech synthesis, speech analysis, instruments, and related fields. It addresses defects in the utilization of style information and the limited flexibility and style controllability of synthesis systems, with the effect of improving controllability and flexibility, improving naturalness and fidelity, and improving decoupling.

Pending Publication Date: 2021-08-31
SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0006] However, the above methods have the following technical problems: (1) the flexibility and style controllability of the synthesis system are limited; (2) the synthesis system has defects in its use of the style information in the corpus.




Embodiment Construction

[0028] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0029] As mentioned in the background, the speech synthesis methods described in the related art have the following technical problems: speech style features and speech text content features are not decoupled, which limits the flexibility and style controllability of the synthesis system; and there is no speech emotional style learning method designed for emotional corpora, so the synthesis system makes defective use of the style information in the corpus.

[0030] In view of the above technical problems, this embodime...


Abstract

The invention discloses a controllable emotional speech synthesis system and method based on emotion category labels. The method comprises: a text feature extraction step, which extracts speech text features from an input phoneme sequence; a speech style feature extraction step, which receives the acoustic features of the target speech corresponding to the phoneme sequence and extracts speech style features from them; a speech style feature memorizing step, which obtains the emotional style features of the target speech from the speech style features; and an acoustic feature prediction step, which predicts the acoustic features of the synthesized emotional speech from the speech text features and the emotional style features. The method improves the degree of decoupling between speech style features and speech text features, so that the style control of the synthesized speech is not limited by the text content, which improves the controllability and flexibility of the synthesized speech. It also makes effective use of the emotion labels and the emotional data distribution of the speech in the corpus, so that the speech style features of each emotion can be extracted more efficiently.
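The four steps named in the abstract can be pictured as a small data-flow sketch. The version below is only an illustration under stated assumptions, not the patented implementation: every function name, the toy phoneme embedding, the time-averaged style encoder, the softmax attention over a per-emotion memory table, and the concatenation-based predictor are hypothetical stand-ins for the trained networks a real system would use.

```python
import math
from typing import Dict, List

Vector = List[float]


def extract_text_features(phonemes: List[str], dim: int = 4) -> List[Vector]:
    """Toy text encoder (assumption): one deterministic vector per phoneme."""
    feats = []
    for p in phonemes:
        code = sum(ord(c) for c in p)
        feats.append([((code * (i + 1)) % 17) / 17.0 for i in range(dim)])
    return feats


def extract_style_features(acoustic_frames: List[Vector]) -> Vector:
    """Toy style encoder (assumption): average the acoustic frames over time."""
    n = len(acoustic_frames)
    dim = len(acoustic_frames[0])
    return [sum(frame[i] for frame in acoustic_frames) / n for i in range(dim)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memorize_style(style: Vector, memory: Dict[str, Vector]) -> Vector:
    """Toy style memory (assumption): attention over one vector per emotion label."""
    labels = sorted(memory)
    scores = [sum(a * b for a, b in zip(style, memory[label])) for label in labels]
    weights = softmax(scores)
    dim = len(style)
    return [sum(w * memory[label][i] for w, label in zip(weights, labels))
            for i in range(dim)]


def predict_acoustic_features(text_feats: List[Vector],
                              emotion_style: Vector) -> List[Vector]:
    """Toy predictor (assumption): append the style vector to every text vector."""
    return [t + emotion_style for t in text_feats]


# Hypothetical two-emotion memory and a two-frame reference utterance.
memory = {"happy": [1.0, 0.0], "sad": [0.0, 1.0]}
phonemes = ["n", "i", "h", "ao"]
reference = [[0.9, 0.1], [0.7, 0.3]]

text_feats = extract_text_features(phonemes)
style = extract_style_features(reference)
emotion_style = memorize_style(style, memory)
frames = predict_acoustic_features(text_feats, emotion_style)

# Control by label: swap in the "sad" memory vector without touching the text.
controlled = predict_acoustic_features(text_feats, memory["sad"])
```

In this sketch the text features are computed without any reference to the style vector, so changing the emotion only changes the style part of each output frame. That mirrors, in a very simplified way, the decoupling of style and text content that the abstract describes.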

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence, in particular to a controllable emotion speech synthesis system and method based on emotion category labels.

Background

[0002] Emotion is an important kind of paralinguistic information in human speech that, in addition to the text content, reflects semantic information and speaker status. Emotional speech synthesis focuses on improving the expressive richness and perceptual fidelity of the speech output by a speech synthesis system, thereby improving the naturalness of the synthesized speech, which is an important technical basis for improving the speech interaction experience. It has application prospects in a variety of interactive scenarios such as novel generation.

[0003] Among the traditional speech synthesis methods, the waveform splicing speech synthesis has high requirements on the corpus, and the sound quality and naturalness o...


Application Information

IPC(8): G10L13/027, G10L13/06, G10L13/08, G10L25/63
CPC: G10L13/027, G10L13/06, G10L13/08, G10L25/63, Y02D10/00
Inventor: 吴志勇, 李翔
Owner: SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV