
Speech enhancement method, device and equipment

A speech enhancement and speech data technology, applied in speech analysis, speech recognition, and neural learning methods. It addresses the problems of enhanced-speech distortion and poor generalization to noise, with the effect of reducing speech distortion, narrowing the performance differences between noise types, and improving listening quality.

Pending Publication Date: 2022-05-17
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0005] The present application provides a speech enhancement method to solve two problems in the prior art: distortion of the enhanced speech and poor generalization to noise.



Examples


Example 1

[0119] Refer to Figure 1, which is a schematic flowchart of an embodiment of the speech enhancement method of the present application. The method is executed by a speech enhancement device, which is usually deployed on the server side but is not limited to it; any device capable of implementing the speech enhancement method may be used. In this embodiment, the method may include the following steps:

[0120] Step S101: Determine the acoustic feature data of the first noisy speech data to be processed.

[0121] The noisy speech data may be single-channel speech data, which may be collected by a microphone. The method separates speech from background noise (environmental noise), and can be applied to various speech processing systems, such as speech recognition systems, speaker recognition systems, speech recognition text editing systems, and the like.
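Step S101 can be illustrated with a minimal sketch. The excerpt does not say which acoustic features the method uses, so the log-magnitude spectrum below is an assumption chosen only for illustration, and the function name `log_magnitude_features`, the 400-sample frame, and the 160-sample hop are hypothetical values, not taken from the application.

```python
import numpy as np


def log_magnitude_features(signal, frame_len=400, hop=160, eps=1e-8):
    """Return log-magnitude spectra of a single-channel signal, one row per frame.

    Assumption: log-magnitude STFT frames stand in for the unspecified
    "acoustic feature data" of step S101.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim != 1:
        raise ValueError("expected single-channel (1-D) audio")
    # Zero-pad very short inputs so that at least one full frame exists.
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spectrum) + eps)


# Usage: one second of a noisy 440 Hz tone sampled at 16 kHz (synthetic data).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
noisy = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(16000)
features = log_magnitude_features(noisy)
print(features.shape)  # (98, 201)
```

Each row is one frame of the noisy input; a real system might use mel-scale or other features instead.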

[0122] See Figure 2, which is a schematic diagram of a usage scenario of an ...

Example 2

[0151] The above embodiments provide a speech enhancement method; correspondingly, the present application also provides a speech enhancement device, which corresponds to the method embodiment above. Because the device embodiment is largely similar to the method embodiment, it is described only briefly; for the relevant parts, refer to the corresponding description of the method embodiment. The device embodiments described below are illustrative only.

[0152] The present application additionally provides a speech enhancement device, including:

[0153] an acoustic feature extraction unit, configured to determine the acoustic feature data of the first noisy speech data to be processed;

[0154] an acoustic feature enhancement unit, configured to determine the enhanced acoustic features of the first noisy speech data from the acoustic feature data by means of the acoustic feature enhancement model; wherein, the acoustic feature enhance...
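The unit structure can be sketched as a simple composition of stages. The excerpt names an extraction unit and an enhancement unit; the third, vocoder stage is taken from the abstract. The class `SpeechEnhancer` and its field names are hypothetical, and the three lambdas in the usage lines are toy stand-ins rather than real models.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np

Array = np.ndarray


@dataclass
class SpeechEnhancer:
    """Hypothetical wiring of the units; each stage is an injected callable."""

    extract: Callable[[Array], Array]  # noisy waveform -> acoustic features
    enhance: Callable[[Array], Array]  # noisy features -> enhanced features
    vocode: Callable[[Array], Array]   # enhanced features -> waveform

    def __call__(self, noisy_waveform: Array) -> Array:
        features = self.extract(noisy_waveform)
        enhanced = self.enhance(features)
        return self.vocode(enhanced)


# Usage with toy stand-ins (not real models): frame, scale, and flatten.
enhancer = SpeechEnhancer(
    extract=lambda x: x.reshape(-1, 4),
    enhance=lambda f: f * 0.5,
    vocode=lambda f: f.reshape(-1),
)
output = enhancer(np.arange(8.0))
```

Keeping the stages as separate callables mirrors the separation between units in the device description and lets each one be replaced independently.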

Example 3

[0157] The foregoing embodiments provide a speech enhancement method; correspondingly, the present application also provides an electronic device, which corresponds to the method embodiment above. Because the device embodiment is largely similar to the method embodiment, it is described only briefly; for the relevant parts, refer to the corresponding description of the method embodiment. The device embodiments described below are illustrative only.

[0158] The electronic device of this embodiment includes a processor and a memory. The memory stores a program implementing the speech enhancement method; after the device is powered on and runs the program through the processor, the following steps are executed: determine the acoustic feature data of the first noisy speech data to be processed; through the acoustic feature enhancement model, according to the acoustic feature dat...


Abstract

The invention discloses a speech enhancement method, device and equipment. In the method, an acoustic feature enhancement model is obtained through adversarial multi-task learning with a self-supervised noise classification loss, and the enhanced acoustic features of noisy speech are determined by this model. This prevents the extracted enhanced acoustic features from being overly sensitive to environmental noise, so the difference in speech enhancement performance across environmental noises is effectively reduced and generalization to noise outside the training set is improved. In addition, a vocoder synthesizes speech from the enhanced acoustic features to obtain the enhanced speech, which avoids enhancing the phase spectrum of the noisy speech directly or indirectly. Speech distortion is therefore effectively reduced and the listening quality of the speech is improved.
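The adversarial multi-task idea in the abstract can be illustrated with a small loss computation. This is only a sketch under stated assumptions: the excerpt does not say how the two losses are combined, so the subtraction with a fixed `weight` (in the spirit of gradient-reversal training) is an assumption, as are the mean-squared-error enhancement loss, the function names, and the integer noise labels, whose origin the excerpt does not describe.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error between enhanced and clean features."""
    return float(np.mean((pred - target) ** 2))


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))


def adversarial_objectives(enhanced, clean, noise_logits, noise_labels, weight=0.1):
    """Return the two objectives, each to be minimized by its own network.

    Assumption: the noise classifier minimizes its classification loss, while
    the enhancer minimizes the enhancement loss minus a weighted copy of that
    same loss, which pushes its features toward carrying less noise-type
    information.
    """
    enhancement_loss = mse(enhanced, clean)
    noise_loss = cross_entropy(noise_logits, noise_labels)
    return {
        "enhancer": enhancement_loss - weight * noise_loss,
        "noise_classifier": noise_loss,
    }


# Usage with random stand-in tensors: 4 frames, 8 feature bins, 3 noise classes.
rng = np.random.default_rng(0)
objectives = adversarial_objectives(
    enhanced=rng.standard_normal((4, 8)),
    clean=rng.standard_normal((4, 8)),
    noise_logits=rng.standard_normal((4, 3)),
    noise_labels=np.array([0, 2, 1, 0]),
)
```

The opposing sign on the noise classification term is what makes the setup adversarial: a feature representation that helps the classifier is penalized in the enhancer's objective.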

Description

Technical field

[0001] This application relates to the technical field of speech processing, in particular to a speech enhancement method and device, a speech recognition method, device and system, a speech recognition text editing system, a speech enhancement model processing method and device, an acoustic feature enhancement model processing method and device, a user identification method and device, and electronic equipment.

Background

[0002] In machine recognition tasks such as speech recognition and speaker recognition, noise greatly affects recognition accuracy. To improve the accuracy of speech recognition, speaker recognition, and similar tasks, the speech can be separated from the background noise using single-channel speech enhancement technology, and the recognition processing is then performed on the enhanced speech data.

[0003] At present, a typical speech enhancement scheme is to perform single-channel ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L21/02, G10L15/26, G10L25/24, G06K9/62, G06N3/08
CPC: G10L21/02, G10L15/26, G10L25/24, G06N3/08, G06F18/24
Inventor: 杜志浩, 雷鸣, 张仕良
Owner: ALIBABA GRP HLDG LTD