A method and system for providing a sound bank hybrid training model

A technology for training models on sound libraries, classified under dot-dash line transmission devices among other fields. It addresses the problem of high recording cost by lowering the requirements on speakers, reducing costs, and making model training easier.

Active Publication Date: 2015-08-12

BEIJING SINOVOICE TECH CO LTD

Cites: 2; Cited by: 0

Problems solved by technology

Because the quality requirements for a speaker's recording are relatively high, a high-level announcer is required, which makes recording costly.

Embodiment Construction

[0028] In view of the deficiencies in the prior art, the present invention proposes a method for training a model on mixed sound banks, which can solve some or all of the aforementioned problems and can establish a relatively stable model. In the mixed-training method provided by the present invention, several speakers are first selected to record sound banks; when the model is trained, the multiple sound banks are mixed, that is, the sound bank data of the several speakers are put together for training. First, training with multiple speakers blurs the shortcomings of any single speaker, and the final trained model tends toward an average of the speakers, which yields a more stable model. Second, each speaker has its own characteristics, and mixed training can combine their different advantages. Third, the parameter characteristics of real speakers are not optimal, and training with multiple speakers can significantly optimize the speech synthesis eff...
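The averaging effect described above can be illustrated under one assumption that the source does not state: that a sound parameter is modeled by a single Gaussian fitted by maximum likelihood to the pooled data of all speakers. The symbols $S$ (number of speakers), $n_s$ (number of samples from speaker $s$), $\mu_s$ and $\sigma_s^2$ (that speaker's sample mean and population variance) are introduced here for illustration only.

```latex
N = \sum_{s=1}^{S} n_s, \qquad
\mu = \frac{1}{N} \sum_{s=1}^{S} n_s \, \mu_s, \qquad
\sigma^2 = \frac{1}{N} \sum_{s=1}^{S} n_s \left( \sigma_s^2 + (\mu_s - \mu)^2 \right)
```

Under this assumption the pooled mean is a size-weighted average of the per-speaker means, so the deviation of any one speaker is diluted by the others, while the pooled variance absorbs both the within-speaker and the between-speaker spread.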

Abstract

The invention discloses a method for providing a sound-library hybrid training model, which comprises the following steps: according to a recording corpus selected as a sample, acquiring the sound signals of at least two speakers so as to obtain at least two sets of recording data; extracting sound parameter information from each set of recording data, wherein the parameter information comprises at least one of pitch, spectrum and duration; and carrying out statistical analysis on the sound parameters so as to obtain a parameter model. The invention also discloses a corresponding system for providing a sound-library hybrid training model. Building on existing sound synthesis technology, the invention mixes multiple sound libraries during model training, that is, the sound library data of a plurality of speakers are placed together and trained. The trained model tends toward the average parameters of the plurality of speakers or the optimal parameters of a single speaker, thereby yielding a relatively stable model. The method and system can lower the requirements on speakers and reduce the cost of recording; they also make the model training process easier to complete and make the synthetic sound more natural.
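The three steps in the abstract (acquire at least two sets of recording data, extract parameters, analyze them statistically) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Recording` class, the function names, and the choice of a per-parameter mean and variance as the "parameter model" are all assumptions, and the parameter values are taken as precomputed rather than extracted from audio. Only pitch and duration are shown; spectrum is omitted for brevity.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance


@dataclass
class Recording:
    """One set of recording data for one speaker (hypothetical container)."""
    speaker: str
    pitch_hz: list[float]     # per-frame pitch values, assumed precomputed
    duration_ms: list[float]  # per-unit durations, assumed precomputed


def extract_parameters(rec: Recording) -> dict[str, list[float]]:
    # Stand-in for real signal analysis: just expose the stored values.
    return {"pitch": rec.pitch_hz, "duration": rec.duration_ms}


def train_mixed_model(recordings: list[Recording]) -> dict[str, tuple[float, float]]:
    """Pool parameters from all speakers and return (mean, variance) per parameter."""
    if len(recordings) < 2:
        raise ValueError("mixed training needs at least two sets of recording data")
    pooled: dict[str, list[float]] = {}
    for rec in recordings:
        for name, values in extract_parameters(rec).items():
            pooled.setdefault(name, []).extend(values)
    # Assumed "statistical analysis": a mean and population variance per parameter.
    return {name: (fmean(v), pvariance(v)) for name, v in pooled.items()}


recordings = [
    Recording("speaker_a", [100.0, 110.0], [80.0, 90.0]),
    Recording("speaker_b", [200.0, 210.0], [100.0, 110.0]),
]
model = train_mixed_model(recordings)
print(model["pitch"], model["duration"])
```

In this toy data the pooled pitch mean falls between the two speakers' own ranges, which is the averaging behavior the abstract describes. A real system would replace the mean and variance with whatever acoustic statistical model the synthesis method uses.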

Description

Technical field

[0001] The invention relates to a method and a system for providing a sound bank mixed training model.

Background art

[0002] Speech synthesis is an important technology for realizing natural and efficient human-computer interaction. Speech synthesis technology is also known as text-to-speech (TTS). Simply put, it lets the computer "speak": the computer converts arbitrary text into sound files and outputs the sound through multimedia devices, that is, it automatically converts any text into voice information and plays it to the user. There are two common speech synthesis methods today: one is a synthesis method based on unit selection and waveform splicing, and the other is a parametric synthesis method based on an acoustic statistical model.

[0003] In the traditional unit selection algorithm, the target cost and connection cost are often realized by calculating the difference of the context at...

Application Information

Patent Type & Authority: Patents (China)

IPC (8): H04L15/06

Inventor: 李健, 郑晓明, 张连毅, 武卫东

Owner: BEIJING SINOVOICE TECH CO LTD
