
Feature extraction method and device based on voice signal time domain and frequency domain, and echo cancellation method and device

A feature extraction technology for speech signals, applied in speech analysis and instruments, that addresses the problems of insufficiently comprehensive feature information and a poor echo cancellation effect, and achieves more comprehensive feature information and improved cancellation performance.

Pending Publication Date: 2021-12-31
WUHAN UNIV


Problems solved by technology

[0005] The present invention proposes a feature extraction method and device based on the time domain and frequency domain of speech signals, together with an echo cancellation method and device, to solve, or at least partially solve, the technical problems that the feature information extracted by existing methods is not comprehensive enough and that the resulting echo cancellation effect is poor.



Examples


Embodiment 1

[0058] An embodiment of the present invention provides a feature extraction method based on the time domain and frequency domain of speech signals, comprising:

[0059] S1: Calculate a time weight vector from the intermediate mapping feature, and expand the time weight vector to the same dimension as the intermediate mapping feature, where the intermediate mapping feature is obtained by transforming the time-frequency feature of the speech signal through a multi-layer convolutional neural network, and the time weight vector encodes the important time frames in the speech feature;

[0060] S2: Take the Hadamard product of the intermediate mapping feature and the time weight vector to obtain a time-domain weighted mapping feature;

[0061] S3: Calculate a frequency weight vector from the time-domain weighted mapping feature, and expand the frequency weight vector to the same dimension as the time-domain weighted mapping feature, where the frequency weight vector ...
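The three steps above can be sketched in numpy. The pooling and activation used to form the weight vectors (mean-pooling over the other axes followed by a sigmoid) are illustrative assumptions, as is the (channels, time, frequency) feature layout; this excerpt does not fix those details:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def time_freq_attention(feat):
    """feat: intermediate mapping feature, assumed shape (C, T, F)."""
    # S1: time weight vector -- pool over channel and frequency axes
    # (mean-pooling + sigmoid is an assumed choice, not from the patent).
    t_weight = sigmoid(feat.mean(axis=(0, 2)))        # shape (T,)
    t_weight = t_weight[None, :, None]                # expand for broadcasting
    # S2: Hadamard product -> time-domain weighted mapping feature
    t_weighted = feat * t_weight
    # S3: frequency weight vector computed from the time-weighted feature
    f_weight = sigmoid(t_weighted.mean(axis=(0, 1)))  # shape (F,)
    f_weight = f_weight[None, None, :]
    # Final mapping feature weighted in both time and frequency
    return t_weighted * f_weight

feat = np.random.randn(4, 10, 8)   # 4 channels, 10 frames, 8 frequency bins
out = time_freq_attention(feat)    # same shape as the input feature
```

Expanding each weight vector via broadcasting (rather than materializing a full-size copy) gives the same Hadamard product the steps describe.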

Embodiment 2

[0075] Based on the same inventive concept, this embodiment provides a feature extraction device based on the time domain and frequency domain of speech signals. The device is an attention module comprising:

[0076] A time-domain attention module, configured to calculate a time weight vector from the intermediate mapping feature and expand it to the same dimension as the intermediate mapping feature, where the intermediate mapping feature is obtained by transforming the time-frequency feature of the speech signal through a multi-layer convolutional neural network, and the time weight vector encodes the important time frames in the speech feature;

[0077] A time-domain weighting module, configured to perform a Hadamard product on the intermediate mapping feature and the time weight vector to obtain a time-domain weighted mapping feature;

[0078] A frequency-domain attention module, configured to calculate a frequency weight vector according to ...

Embodiment 3

[0084] Based on the same inventive concept, this embodiment provides an echo cancellation method, including:

[0085] S101: Use the short-time Fourier transform to compute the real and imaginary parts of the far-end reference signal and the near-end microphone signal, and stack them along the channel dimension to form four-channel, four-dimensional initial acoustic features;

[0086] S102: Apply complex-valued two-dimensional convolution to the initial acoustic features to obtain intermediate mapping features;

[0087] S103: Perform feature extraction on the intermediate mapping features to obtain mapping features weighted in the time domain and frequency domain;

[0088] S104: Perform temporal feature learning on the intermediate mapping features to obtain time-modeled features;

[0089] S105: Obtain a complex number domain ratio mask according to th...
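Step S101 can be sketched as follows. The naive framed-FFT STFT, the window/hop sizes, and the (channels, frames, bins) layout are assumptions for illustration; a batch axis in front would give the four-dimensional input the step describes:

```python
import numpy as np

def stft(signal, n_fft=64, hop=32):
    """Naive STFT: Hann-windowed frames, one rfft per frame.
    Returns a complex matrix of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    return np.fft.rfft(np.asarray(frames), axis=-1)

def initial_acoustic_features(far_ref, near_mic, n_fft=64, hop=32):
    """S101: stack the real and imaginary parts of the far-end reference
    and near-end microphone signals along a channel axis.
    The channel ordering here is an assumed convention."""
    far = stft(far_ref, n_fft, hop)
    near = stft(near_mic, n_fft, hop)
    return np.stack([far.real, far.imag, near.real, near.imag], axis=0)

far = np.random.randn(512)    # far-end reference signal
near = np.random.randn(512)   # near-end microphone signal
feat = initial_acoustic_features(far, near)
# real-valued array: 4 channels x 15 frames x 33 frequency bins
```

Keeping real and imaginary parts as separate real channels is what allows the subsequent complex-valued 2-D convolution (S102) to operate on a standard real tensor.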


Abstract

The invention provides a feature extraction method and device based on the time domain and frequency domain of voice signals, and an echo cancellation method and device. The method comprises the steps of: applying the short-time Fourier transform to a voice signal to obtain a time-frequency domain feature, and obtaining an intermediate mapping feature through a multi-layer convolutional neural network; obtaining a time weight vector from a time-domain attention module, expanding it to the same dimension as the intermediate mapping feature, and taking the Hadamard product to obtain a time-domain weighted mapping feature; then obtaining a frequency weight vector from a frequency-domain attention module, expanding it to the same dimension as the time-domain weighted mapping feature, and taking the Hadamard product to obtain the final mapping feature weighted in both the time domain and the frequency domain. The time-domain and frequency-domain attention modules can easily be embedded into a convolutional-neural-network-based acoustic echo cancellation model, so that the model adaptively learns the weights of time-domain and frequency-domain features and its performance is improved.

Description

Technical field

[0001] The invention relates to the field of audio signal processing, in particular to a feature extraction method and device based on the time domain and frequency domain of speech signals, and an echo cancellation method and device.

Background

[0002] In two-way voice communication, acoustic echo occurs when the far-end signal played by the near-end loudspeaker is picked up by the near-end microphone and sent back to the far end. Acoustic echo greatly degrades the caller's experience and downstream voice processing such as speech recognition, so eliminating acoustic echo as far as possible without introducing near-end voice distortion has become a research hotspot in voice front-end processing at home and abroad. In recent years, deep learning methods have surpassed traditional adaptive filtering methods and achieved great success in echo cancellation. [0003] In the process of implementing the present invention, the inven...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G10L21/0224; G10L21/0232; G10L25/30
CPC: G10L21/0224; G10L21/0232; G10L25/30; G10L2021/02082
Inventors: 涂卫平, 韩畅, 刘雅洁, 肖立, 杨玉红, 刘陈建树
Owner: WUHAN UNIV