Voice synthesis model training method and device, storage medium and electronic equipment

A speech synthesis model technology, applied in speech synthesis, speech analysis, speech recognition, etc., that addresses problems such as stiff-sounding synthesized speech.

Pending Publication Date: 2021-01-29
BEIJING DA MI TECH CO LTD
0 Cites, 9 Cited by
AI Technical Summary

Problems solved by technology

However, the inventor found that the voice produced by a speech synthesis model is relatively stiff and has an obvious "robot voice" quality, so making the synthesized voice sound more like a human voice is an urgent problem to be solved.

Examples

Embodiment Construction

[0047] To make the purpose, features, and advantages of the embodiments of the present application clearer and easier to understand, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those skilled in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.

[0048] When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are m...

Abstract

The embodiment of the invention discloses a voice synthesis model training method. The method comprises the steps of: carrying out voice synthesis on text data based on an initial voice synthesis model to obtain a synthesized voice; carrying out emotion recognition on the synthesized voice based on a speaker classification network to obtain a first feature vector; performing emotion recognition on the real person voice corresponding to the text data based on the speaker classification network to obtain a second feature vector; comparing the first feature vector with the second feature vector; and updating the network parameters of the initial voice synthesis model based on the comparison result to obtain a target voice synthesis model. Because emotion recognition is carried out on both the synthesized voice data and the real person voice data through an emotion recognition network, and the network parameters of the initial voice synthesis model are updated according to the feedback result, training of the initial voice synthesis model is completed and an accurately trained target voice synthesis model is obtained.
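
The training loop in the abstract (synthesize, extract a feature vector from both the synthesized and the real voice with a fixed recognition network, compare, update the synthesis model) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the synthesis model is a single linear map, the recognition network is a frozen `tanh` layer, the comparison is a squared-error loss, and the names `extract_features`, `train_step`, and `train` are hypothetical.

```python
import math


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def extract_features(frozen_net, voice):
    """Toy stand-in for the frozen recognition network: tanh(A @ voice)."""
    return [math.tanh(z) for z in matvec(frozen_net, voice)]


def train_step(synth_w, frozen_net, text_vec, real_voice, lr=0.5):
    """One update of the synthesis weights; returns the loss before the update."""
    syn_voice = matvec(synth_w, text_vec)              # synthesized voice
    f_syn = extract_features(frozen_net, syn_voice)    # first feature vector
    f_real = extract_features(frozen_net, real_voice)  # second feature vector
    # Comparison result: squared error between the two feature vectors (assumed loss).
    loss = 0.5 * sum((a - b) ** 2 for a, b in zip(f_syn, f_real))
    # Backpropagate through tanh and the frozen network to the synthesized voice.
    delta = [(a - b) * (1.0 - a * a) for a, b in zip(f_syn, f_real)]
    grad_voice = [
        sum(frozen_net[k][i] * delta[k] for k in range(len(delta)))
        for i in range(len(syn_voice))
    ]
    # Only the synthesis model is updated; the recognition network stays fixed.
    for i, g in enumerate(grad_voice):
        for j, x in enumerate(text_vec):
            synth_w[i][j] -= lr * g * x
    return loss


def train(steps=300):
    """Run the toy loop on one fixed (text, real voice) pair and return the losses."""
    frozen_net = [[0.5, -0.3, 0.2], [0.1, 0.4, -0.6]]  # fixed feature extractor
    synth_w = [[0.0] * 4 for _ in range(3)]            # initial synthesis model
    text_vec = [1.0, 0.5, -0.5, 0.25]                  # toy text encoding
    real_voice = [0.8, -0.4, 0.6]                      # toy real-voice frame
    return [train_step(synth_w, frozen_net, text_vec, real_voice)
            for _ in range(steps)]


if __name__ == "__main__":
    losses = train()
    print(f"first loss: {losses[0]:.6f}, last loss: {losses[-1]:.3e}")
```

In this sketch the loss is measured in the feature space of the fixed network rather than on the raw voice, so the synthesis weights are pushed toward outputs that the recognition network cannot distinguish from the real recording. A real system would replace each toy component with trained neural networks and batches of data.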

Description

Technical field

[0001] The invention relates to the technical field of speech processing, in particular to a training method and device for a speech synthesis model, a storage medium, and electronic equipment.

Background

[0002] With the development of artificial intelligence technology, speech synthesis is receiving more and more attention. Synthesized speech is used in many settings, such as announcements on public transportation, roll call and question reading in place of the teacher in online courses, weather broadcasts, and news broadcasts. However, the inventors found that the voice produced by a speech synthesis model is relatively stiff and has an obvious "robot voice" quality, so making the synthesized voice sound more like a human voice is an urgent problem to be solved.

Contents of the invention

[0003] The embodiment of the present application provides...

Application Information

IPC(8): G10L13/02, G10L15/02, G10L15/06
CPC: G10L13/02, G10L15/02, G10L15/063
Inventor 吴雨璇, 杨惠, 舒景辰, 梁光, 周鼎皓
Owner BEIJING DA MI TECH CO LTD