Voice processing method, device and equipment and storage medium

A voice processing method, applied in the computer field, that addresses the heavy consumption of computing resources and time, the shortage of training data, and the resulting insufficient retraining of voice conversion models. It reduces the computing resources and time consumed, lowers the application threshold, and improves voice processing efficiency.

Active Publication Date: 2021-04-27

BEIJING DAJIA INTERNET INFORMATION TECH CO LTD

6 Cites, 4 Cited by

Problems solved by technology

[0004] The present disclosure provides a speech processing method, device, equipment and storage medium to solve at least one of the following problems in the related art: the large consumption of computer resources and time caused by retraining the speech conversion model when the target speaker changes, and the inability to retrain the speech conversion model adequately when training data is insufficient.

Examples

Embodiment approach

[0203] As an optional implementation, the reconstruction module includes:

[0204] A reconstruction sub-module, configured to call the vocoder to perform waveform reconstruction on the target speech feature and obtain the converted target speech.

[0205] As an optional implementation, the speech processing model is obtained through training in the following manner:

[0206] Obtain a training set. The training set includes at least one speech sample pair; each speech sample pair includes a first speech sample and a second speech sample, which are different utterances of the same speaker;

[0207] Invoke the encoder in the basic speech processing model to encode the first speech sample and the second speech sample in each speech sample pair, obtaining the first sample feature corresponding to the first speech sample and the second sample feature corresp...
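The training-set construction in paragraph [0206] can be illustrated with a minimal sketch. It assumes the corpus is a flat list of utterance records labelled by speaker; the `Utterance` type, its fields, and the `build_sample_pairs` helper are hypothetical names chosen for illustration and do not appear in the patent.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, List, NamedTuple, Tuple


class Utterance(NamedTuple):
    """Hypothetical record: one recorded utterance and who spoke it."""
    speaker_id: str
    utterance_id: str


def build_sample_pairs(
    utterances: Iterable[Utterance],
) -> List[Tuple[Utterance, Utterance]]:
    """Pair up different utterances that share the same speaker."""
    by_speaker = defaultdict(list)
    for utt in utterances:
        by_speaker[utt.speaker_id].append(utt)

    pairs: List[Tuple[Utterance, Utterance]] = []
    for speaker_utts in by_speaker.values():
        # combinations() never pairs an item with itself, so the two
        # samples in a pair are always distinct utterances.
        pairs.extend(combinations(speaker_utts, 2))
    return pairs


corpus = [
    Utterance("spk_a", "a1"),
    Utterance("spk_a", "a2"),
    Utterance("spk_a", "a3"),
    Utterance("spk_b", "b1"),  # a single utterance yields no pair
    Utterance("spk_c", "c1"),
    Utterance("spk_c", "c2"),
]
pairs = build_sample_pairs(corpus)
```

A speaker with only one utterance contributes no pair, since [0206] requires the two samples to be different utterances of the same speaker.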

Abstract

The invention relates to a voice processing method, device, equipment and storage medium. The method comprises: obtaining a first voice and a second voice to be processed; calling an encoder in a voice processing model, obtained through optimization training on at least one utterance of the target speaker, to encode the obtained voices, respectively obtaining a first feature representing text information that is independent of speaker identity and a second feature representing timbre information of the target speaker; and performing decoding and voice reconstruction based on the first feature and the second feature to obtain a target voice after timbre conversion. Because the voice processing model is end-to-end, it does not need a large number of target-speaker utterances: it can model the target speaker's timbre from only a small number of utterances, which reduces the computing resources and time consumed by model training.
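The data flow described in the abstract (encode, decode, reconstruct) can be sketched as follows. This is only a structural illustration under stated assumptions: it assumes the first feature is taken from the first voice and the second feature from the second voice, and every name (`VoiceConversionPipeline`, `encode_content`, `encode_timbre`, `decode`, `vocode`) is hypothetical. The stand-in functions are simple arithmetic placeholders, not speech models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Feature = List[float]
Waveform = List[float]


@dataclass
class VoiceConversionPipeline:
    """Hypothetical wiring of the stages named in the abstract."""
    encode_content: Callable[[Sequence[float]], Feature]  # first feature
    encode_timbre: Callable[[Sequence[float]], Feature]   # second feature
    decode: Callable[[Feature, Feature], Feature]         # target speech feature
    vocode: Callable[[Feature], Waveform]                 # waveform reconstruction

    def convert(self, first_voice: Sequence[float],
                second_voice: Sequence[float]) -> Waveform:
        first_feature = self.encode_content(first_voice)
        second_feature = self.encode_timbre(second_voice)
        target_feature = self.decode(first_feature, second_feature)
        return self.vocode(target_feature)


def _mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


# Placeholder stages (assumptions for illustration only): "content" is the
# signal with its mean removed, "timbre" is the mean of the reference signal.
pipeline = VoiceConversionPipeline(
    encode_content=lambda x: [v - _mean(x) for v in x],
    encode_timbre=lambda x: [_mean(x)],
    decode=lambda content, timbre: [v + timbre[0] for v in content],
    vocode=lambda feature: list(feature),
)

converted = pipeline.convert([0.0, 1.0, 2.0], [10.0, 10.0])
```

Passing the stages in as callables keeps the sketch independent of any particular encoder, decoder, or vocoder; a real system would substitute trained networks for each placeholder.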

Description

Technical field

[0001] The present disclosure relates to the field of computer technology, and in particular to a voice processing method, device, equipment and storage medium.

Background

[0002] Speech conversion converts the timbre of the original speaker's speech into the timbre of a target speaker while keeping the linguistic content unchanged. It plays an important role in video voice changing, video dubbing, human-computer interaction and other fields.

[0003] In the related art, existing speech conversion systems are usually trained on large data sets. When the target speaker changes, a large amount of data must be obtained to retrain a speech conversion model. This not only consumes a lot of computer resources and time; in some special scenarios, especially when voice data of the new target speaker is scarce, the data is also insufficient to retrain a speech conversion model for the new target speake...

Application Information

IPC(8): G10L21/013

CPC: G10L21/013, G10L2021/0135

Inventor 张颖

Owner BEIJING DAJIA INTERNET INFORMATION TECH CO LTD
