Feature extraction method and device based on voice signal time domain and frequency domain, and echo cancellation method and device

A feature extraction and speech signal technology, applied in speech analysis, instruments, etc. It addresses the problems of insufficiently comprehensive feature information and poor echo cancellation, and achieves more comprehensive feature information and a better echo cancellation effect.

Pending Publication Date: 2021-12-31

WUHAN UNIV

Cites: 0; Cited by: 6

Problems solved by technology

[0005] The present invention proposes a feature extraction method and device based on the time domain and frequency domain of speech signals, and an echo cancellation method and device, which solve or at least partially solve the technical problem that the feature information extracted by existing methods is not comprehensive enough, so that the final echo cancellation effect is poor.

Examples

Embodiment 1

[0058] The embodiment of the present invention provides a feature extraction method based on the speech signal time domain and frequency domain, including:

[0059] S1: Calculate a time weight vector according to the intermediate mapping feature, and expand the time weight vector to the same dimension as the intermediate mapping feature, wherein the intermediate mapping feature is obtained by transforming the time-frequency feature of the speech signal through a multi-layer convolutional neural network, and the time weight vector contains the important time frame information in the speech features;

[0060] S2: Perform a Hadamard product of the intermediate mapping feature and the time weight vector to obtain a time-domain weighted mapping feature;

[0061] S3: Calculate a frequency weight vector according to the time-domain weighted mapping feature, and expand the frequency weight vector to a dimension equal to the time-domain weighted mapping feature, wherein the frequency weight vector ...
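Steps S1 to S3, together with the final Hadamard product described in the Abstract, can be sketched in NumPy as follows. This is only an illustrative sketch: the text available here does not say how the weight vectors are computed, so the mean pooling followed by a sigmoid, the `(channels, time, frequency)` layout, and all function names are assumptions rather than the patented design.

```python
import numpy as np


def sigmoid(x):
    """Squash values into (0, 1) so they can act as attention weights."""
    return 1.0 / (1.0 + np.exp(-x))


def time_weight(feature):
    """S1: one weight per time frame, expanded to the shape of `feature`.

    Assumption: the weight is a sigmoid of the mean over channels and frequency.
    """
    w = sigmoid(feature.mean(axis=(0, 2)))                # shape (T,)
    return np.broadcast_to(w[None, :, None], feature.shape)


def frequency_weight(feature):
    """S3: one weight per frequency bin, expanded to the shape of `feature`.

    Assumption: the weight is a sigmoid of the mean over channels and time.
    """
    w = sigmoid(feature.mean(axis=(0, 1)))                # shape (F,)
    return np.broadcast_to(w[None, None, :], feature.shape)


def time_frequency_attention(feature):
    """Apply time-domain weighting first, then frequency-domain weighting."""
    t_w = time_weight(feature)                 # S1: time weight vector, expanded
    time_weighted = feature * t_w              # S2: Hadamard (element-wise) product
    f_w = frequency_weight(time_weighted)      # S3: weights from the time-weighted feature
    return time_weighted * f_w                 # final Hadamard product


# Hypothetical intermediate mapping feature: 8 channels, 10 frames, 16 bins.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 10, 16))
y = time_frequency_attention(x)
print(y.shape)
```

Because the expanded weights have the same shape as the feature, the Hadamard product is a plain element-wise multiplication, and the output keeps the shape of the input so the module can sit between convolutional layers.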

Embodiment 2

[0075] Based on the same inventive concept, this embodiment provides a feature extraction device based on the time domain and frequency domain of speech signals. The device is an attention module, including:

[0076] A time-domain attention module, configured to calculate a time weight vector according to the intermediate mapping feature, and expand the time weight vector to the same dimension as the intermediate mapping feature, wherein the intermediate mapping feature is obtained by transforming the time-frequency feature of the speech signal through a multi-layer convolutional neural network, and the time weight vector contains the important time frame information in the speech features;

[0077] A time-domain weighting module, configured to perform a Hadamard product on the intermediate mapping feature and the time weight vector to obtain a time-domain weighted mapping feature;

[0078] A frequency-domain attention module, configured to calculate a frequency weight vector according to ...

Embodiment 3

[0084] Based on the same inventive concept, this embodiment provides an echo cancellation method, including:

[0085] S101: Use the short-time Fourier transform to calculate the real and imaginary parts of the far-end reference signal and the near-end microphone signal, and stack these real and imaginary parts along the channel dimension to form initial acoustic features with four input channels;

[0086] S102: Apply two-dimensional convolution in the complex domain to the initial acoustic features to obtain intermediate mapping features;

[0087] S103: Perform feature extraction on the intermediate mapping features to obtain mapping features weighted in the time domain and frequency domain;

[0088] S104: Perform temporal feature learning on the intermediate mapping features to obtain time-modeled features;

[0089] S105: Obtain a complex number domain ratio mask according to th...
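Step S101 can be illustrated with a short NumPy sketch that builds the four-channel initial acoustic features. The frame length, hop size, Hann window, channel order, and the toy echo signal are all assumptions chosen for illustration; the text here does not specify them.

```python
import numpy as np


def stft(signal, frame_len=512, hop=256):
    """Short-time Fourier transform; returns a complex array of shape (T, F)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.fft.rfft(frames, axis=-1)        # F = frame_len // 2 + 1 bins


def initial_acoustic_features(far_end, near_end, frame_len=512, hop=256):
    """S101: stack real and imaginary parts of both signals along the channel axis.

    Assumed channel order: far-end real, far-end imaginary,
    near-end real, near-end imaginary.
    """
    ref = stft(far_end, frame_len, hop)
    mic = stft(near_end, frame_len, hop)
    return np.stack([ref.real, ref.imag, mic.real, mic.imag], axis=0)


# Toy signals (hypothetical): the microphone picks up a delayed, attenuated
# copy of the far-end reference plus a little noise.
rng = np.random.default_rng(0)
far_end = rng.standard_normal(16000)
near_end = 0.5 * np.roll(far_end, 40) + 0.1 * rng.standard_normal(16000)

features = initial_acoustic_features(far_end, near_end)
print(features.shape)  # (4, 61, 257): channels, time frames, frequency bins
```

Keeping the real and imaginary parts as separate channels preserves phase information, which a later complex-domain convolution (S102) can then operate on.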

Abstract

The invention provides a feature extraction method and device based on the time domain and frequency domain of voice signals, and an echo cancellation method and device. The method comprises the following steps: performing a short-time Fourier transform on a voice signal to obtain time-frequency domain features, and obtaining an intermediate mapping feature through a multilayer convolutional neural network; obtaining a time weight vector with a time domain attention module, expanding the time weight vector to the same dimension as the intermediate mapping feature, and performing a Hadamard product to obtain a time-domain weighted mapping feature; then obtaining a frequency weight vector with a frequency domain attention module, expanding the frequency weight vector to the same dimension as the time-domain weighted mapping feature, and performing a Hadamard product to obtain the final mapping feature weighted in both the time domain and the frequency domain. The time domain attention module and the frequency domain attention module can be easily embedded into an acoustic echo cancellation model based on a convolutional neural network, so that the model adaptively learns the weights of the time domain and frequency domain features, which improves the performance of the model.

Description

Technical field

[0001] The invention relates to the field of audio signal processing, in particular to a feature extraction method and device based on the time domain and frequency domain of speech signals, and an echo cancellation method and device.

Background

[0002] In two-way voice communication, acoustic echo occurs when the far-end signal played by the near-end loudspeaker is picked up by the near-end microphone and sent back to the far end. Acoustic echo greatly degrades the user's call experience and downstream speech processing such as speech recognition, so how to remove as much acoustic echo as possible without introducing near-end speech distortion has become a research hotspot in the field of speech front-end processing at home and abroad. In recent years, deep learning methods have achieved great success in echo cancellation, surpassing traditional adaptive filtering methods.

[0003] In the process of implementing the present invention, the inven...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L21/0224, G10L21/0232, G10L25/30

CPC: G10L21/0224, G10L21/0232, G10L25/30, G10L2021/02082

Inventor: 涂卫平韩畅刘雅洁肖立杨玉红刘陈建树

Owner: WUHAN UNIV
