Speech separation method for single microphone based on NMF (Non-negative Matrix Factorization) algorithm

Single-microphone speech separation technology, applied in speech analysis, speech recognition, instruments, and similar fields. It addresses the inability of existing methods to model the temporal continuity of speech signals and the resulting difficulty of speech separation, and it improves both the robustness and the separation quality of the algorithm.

Inactive Publication Date: 2018-09-25
INST OF ACOUSTICS CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

However, this method also has some problems. For example, the algorithm assumes that adjacent frames of the speech signal are independent of each other, so it cannot model the temporal continuity of the speech signal. Moreover, the algorithm models each speaker with a relatively large dictionary, so one speaker's dictionary may also describe another speaker's speech signal, which makes speech separation difficult.



Examples


Embodiment 1

[0052] Based on the above single-microphone speech separation method, this embodiment comprises two operations: model training and speech signal separation. As shown in Figure 2a, the model training part of the present invention specifically includes the following steps:

[0053] Step 101) Collect a large number of clean speech signals separately for each of the two speakers as training data for the model.

[0054] Step 102) Preprocess the speech signals collected in step 101), then extract their spectrum with the Fast Fourier Transform (FFT); the spectral information includes the amplitude spectrum.

[0055] Preprocessing the speech signal includes: first padding each frame of the speech signal to N points, where N = 2^i, i is an integer, and i ≥ 8; then applying windowing or pre-emphasis to each frame. The window function may be a Hamming window or a Hann (Hanning) window.
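
The preprocessing and FFT of steps 101) and 102) can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the patent's implementation. The function names, the 400-sample frame length, the 160-sample hop, and the 0.97 pre-emphasis coefficient are all assumptions that do not appear in the source. The sketch also applies both pre-emphasis and a Hamming window, and it windows the frame samples before zero-padding to N points, which is a common ordering but an assumption here.

```python
import numpy as np

def fft_size(frame_len, min_exp=8):
    """Smallest N = 2**i with i >= min_exp and N >= frame_len."""
    i = max(min_exp, (frame_len - 1).bit_length())
    return 1 << i

def amplitude_spectrum(signal, frame_len=400, hop=160, pre_emph=0.97, min_exp=8):
    """Return the amplitude spectrum, shape (N // 2 + 1, n_frames).

    frame_len, hop and pre_emph are illustrative values (assumptions).
    """
    x = np.asarray(signal, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    # Pre-emphasis: y[t] = x[t] - pre_emph * x[t - 1]
    x = np.append(x[0], x[1:] - pre_emph * x[:-1])
    n_fft = fft_size(frame_len, min_exp)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[t * hop : t * hop + frame_len] for t in range(n_frames)])
    # Window each frame (Hamming here; np.hanning would give a Hann window)
    frames = frames * np.hamming(frame_len)
    # rfft zero-pads every frame to n_fft points before transforming
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.abs(spectrum).T

# Usage: a 1 kHz tone sampled at 16 kHz
tone = np.sin(2 * np.pi * 1000 * np.arange(1600) / 16000)
A = amplitude_spectrum(tone)
print(A.shape)
```

With a 400-sample frame, the padded length is N = 512 (i = 9), which satisfies the i ≥ 8 condition of paragraph [0055].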

[0056]...


Abstract

The invention provides a single-microphone speech separation method based on the NMF (Non-negative Matrix Factorization) algorithm. From the training data of each speaker, the method obtains many small dictionary matrices and a state sequence, which together describe both the spectral structure and the temporal continuity of the speech signals. Whereas the traditional algorithm uses one large dictionary matrix, the proposed algorithm uses different small dictionary matrices to describe the amplitude spectrum of different frames of the mixed speech. This avoids the situation in which one speaker's dictionary describes another speaker's speech, and improves the robustness and separation performance of the algorithm.
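
For orientation, the traditional approach that the abstract contrasts against (one dictionary per speaker, applied to every frame independently) can be sketched as below. This is not the patented method: the small per-state dictionaries and the state sequence are not detailed in this excerpt, so they are not reproduced. The function names `kl_nmf` and `separate`, the choice of KL-divergence multiplicative updates, the iteration count, and the soft-mask reconstruction are assumptions made for illustration.

```python
import numpy as np

EPS = 1e-12

def kl_nmf(V, rank, W=None, n_iter=200, seed=0):
    """Factor V ~= W @ H with KL-divergence multiplicative updates.

    If W is supplied it is kept fixed and only H is estimated.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    learn_W = W is None
    if learn_W:
        W = rng.random((n_freq, rank)) + EPS
    H = rng.random((W.shape[1], n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + EPS))) / (W.T @ ones + EPS)
        if learn_W:
            W *= ((V / (W @ H + EPS)) @ H.T) / (ones @ H.T + EPS)
    return W, H

def separate(V_mix, W1, W2, n_iter=200):
    """Split a mixture amplitude spectrum using two fixed speaker dictionaries."""
    W = np.hstack([W1, W2])
    _, H = kl_nmf(V_mix, W.shape[1], W=W, n_iter=n_iter)
    k1 = W1.shape[1]
    S1 = W1 @ H[:k1]   # part of the mixture explained by speaker 1
    S2 = W2 @ H[k1:]   # part of the mixture explained by speaker 2
    mask1 = S1 / (S1 + S2 + EPS)
    return mask1 * V_mix, (1.0 - mask1) * V_mix

# Usage on synthetic data: learn one dictionary per speaker, then separate
rng = np.random.default_rng(0)
V_train1 = rng.random((16, 50))
V_train2 = rng.random((16, 50))
W1, _ = kl_nmf(V_train1, rank=4)
W2, _ = kl_nmf(V_train2, rank=4)
est1, est2 = separate(V_train1[:, :10] + V_train2[:, :10], W1, W2)
print(est1.shape, est2.shape)
```

Because each frame's activations in `H` are estimated independently and either dictionary may explain any frame, this baseline shows the two weaknesses the invention targets: no modelling of temporal continuity, and dictionaries that can describe the other speaker's speech.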

Description

Technical field

[0001] The invention relates to the technical field of speech separation, in particular to a single-microphone speech separation method based on an NMF algorithm.

Background technique

[0002] In many application scenarios (such as automatic speech recognition and voice communication), the speech signal is inevitably affected by surrounding interference. Among all kinds of interference, that generated by a non-target speaker has a spectral structure similar to the target speech, which makes it more difficult to remove, so an algorithm designed specifically for this kind of interference is needed. Moreover, many hearing devices (or instruments) usually have only one microphone to pick up the speech signal, and the algorithm needs to separate the speech signals of two speakers from a single mixed speech signal. This is an underdetermined problem, which further increases the difficulty of solving it.

[0003] In recent years, a vari...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L21/0272, G10L21/0308, G10L25/27, G10L15/02, G10L15/06
CPC: G10L15/02, G10L15/063, G10L21/0272, G10L21/0308, G10L25/27, G10L2015/0633
Inventor: 李军锋, 李煦, 颜永红
Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI