Speech enhancement method and device, equipment and storage medium

A speech enhancement and voice data technology, applied in voice analysis, instruments, etc., that addresses poor reliability and large resource consumption, and achieves a small memory footprint, a reduced calculation load, and improved accuracy.

Active Publication Date: 2021-06-11
BEIJING UNISOUND INFORMATION TECH +1

AI Technical Summary

Problems solved by technology

[0008] The present invention provides a speech enhancement method, device, equipment and storage medium to solve the technical problems of the prior art: poor reliability of speech enhancement results and large resource consumption.



Embodiment Construction

[0052] The principles and features of the present invention will be described below with reference to the accompanying drawings. The examples are only used to explain the present invention, but not to limit the scope of the present invention.

[0053] Figure 1 is a flow chart of an embodiment of the speech enhancement method of the present invention. As shown in Figure 1, the speech enhancement method of this embodiment may specifically include the following steps:

[0054] 100. Convert the audio signal of each channel in the acquired voice data to obtain a frequency domain signal of each channel;

[0055] In a specific implementation, the audio signal of each channel in the acquired speech data can be framed and windowed, and then converted by a Short-Time Fourier Transform (STFT) to obtain the frequency domain signal of each channel.
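Step 100 (framing, windowing, then an STFT per channel) can be illustrated with a minimal NumPy sketch. The function name `stft_multichannel`, the Hann window, and the frame length and hop size are assumptions chosen for illustration; the source does not specify them.

```python
import numpy as np

def stft_multichannel(audio, frame_len=512, hop=256):
    """Frame, window and transform each channel (hypothetical helper).

    audio: real array of shape (channels, samples).
    Returns a complex array of shape (channels, frames, frame_len // 2 + 1).
    """
    audio = np.atleast_2d(np.asarray(audio, dtype=np.float64))
    n_channels, n_samples = audio.shape
    if n_samples < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)  # assumed window; the source names none
    n_frames = 1 + (n_samples - frame_len) // hop
    # Row t of idx holds the sample indices that make up frame t.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[:, idx]  # shape (channels, frames, frame_len)
    # One-sided FFT of every windowed frame gives the frequency domain signal.
    return np.fft.rfft(frames * window, axis=-1)
```

Called on a `(channels, samples)` array, the function returns one complex spectrogram per channel, which is the per-channel frequency domain signal that the later steps operate on.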

[0056] 101. Perform signal regularization according to the phase of the frequency domain signal of each channel to obtain ...


Abstract

The invention relates to a speech enhancement method and device, equipment and a storage medium. The method comprises: converting the audio signal of each channel in the obtained voice data to obtain a frequency domain signal of each channel; normalizing the signals according to the phase of the frequency domain signal of each channel to obtain, for each channel, a structured signal that is associated only with the topological structure of the microphone array; training the CGMM to be trained with the sample structured signals of each channel, which correspond to sample data of a preset length, to obtain a target CGMM; and determining the time-frequency mask information of the voice data with the target CGMM. Unified modeling of the frequency domain signals of all channels is thereby realized, which reduces the amount of calculation and the memory occupied and so reduces resource consumption, avoids the ordering problem that arises when multiple CGMMs exist, improves the accuracy of the obtained mask information, and improves the reliability of the speech enhancement result.

Description

Technical field

[0001] The present invention relates to the technical field of speech recognition, in particular to a speech enhancement method, apparatus, device and storage medium.

Background technique

[0002] At present, speech enhancement is an indispensable part of speech signal processing. It improves the signal-to-noise ratio of audio signals so that the speech is less affected by noise. Within speech enhancement, beamforming is the most effective method for multi-channel signal enhancement.

[0003] Usually, the mask information of time-frequency points is obtained with a Complex Gaussian Mixture Model (CGMM); after the speech covariance matrix and the noise covariance matrix are calculated, a Minimum Variance Distortionless Response (MVDR) beamformer is used for speech enhancement.

[0004] However, obtaining the mask information of time-frequency points through CGMM faces two problems:

[0005] First, when each frequen...
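The conventional pipeline in paragraph [0003] (time-frequency masks, then speech and noise covariance matrices, then MVDR) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the function name `mvdr_from_masks` is hypothetical, the mask is taken as given (in the source it would come from a CGMM), and the weight formula is the commonly used reference-channel MVDR form w = (Φn⁻¹Φs)u / tr(Φn⁻¹Φs), which the source does not spell out.

```python
import numpy as np

def mvdr_from_masks(X, speech_mask, ref_channel=0, eps=1e-10):
    """Mask-based MVDR beamformer (hypothetical helper).

    X: complex STFT of shape (channels, frames, bins).
    speech_mask: values in [0, 1], shape (frames, bins).
    Returns the enhanced STFT of shape (frames, bins).
    """
    X = np.asarray(X)
    n_ch, n_frames, n_bins = X.shape
    noise_mask = 1.0 - speech_mask
    Y = np.empty((n_frames, n_bins), dtype=complex)
    for f in range(n_bins):
        Xf = X[:, :, f]  # (channels, frames) for this frequency bin
        ms, mn = speech_mask[:, f], noise_mask[:, f]
        # Mask-weighted spatial covariance matrices of speech and noise.
        phi_s = (ms * Xf) @ Xf.conj().T / (ms.sum() + eps)
        phi_n = (mn * Xf) @ Xf.conj().T / (mn.sum() + eps)
        # Small diagonal loading keeps the noise covariance invertible.
        phi_n = phi_n + 1e-6 * (np.trace(phi_n).real / n_ch + eps) * np.eye(n_ch)
        # Reference-channel MVDR weights (assumed formulation, see lead-in).
        num = np.linalg.solve(phi_n, phi_s)
        w = num[:, ref_channel] / (np.trace(num) + eps)
        Y[:, f] = w.conj() @ Xf  # apply w^H x to every frame
    return Y
```

Note that the covariance matrices and weights are computed independently for every frequency bin. When the masks themselves are also estimated per frequency, each bin carries its own model, which is the setting the two problems in paragraph [0004] refer to.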


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L21/0232, G10L21/0216
CPC: G10L21/0232, G10L21/0216, G10L2021/02166
Inventors: 关海欣, 梁家恩
Owner: BEIJING UNISOUND INFORMATION TECH