Single channel mixed speech time domain separation method based on Convolutional Neural Network

The technology combines a convolutional neural network with mixed-speech processing and is applied in voice analysis, instruments, and related fields. It addresses difficult phase recovery, mutual interference between the separated signals, and separation quality that needs improvement.

Status: Inactive. Publication date: 2017-06-13

DALIAN UNIV OF TECH

Cites: 6. Cited by: 40.

Problems solved by technology

However, current neural-network-based methods generally use a fully connected neural network (FCNN) or a recurrent neural network (RNN), and usually need to extract amplitude spectrum features from the speech signal. They therefore do not make good use of the powerful feature representation ability of the convolutional neural network itself. In addition, because they rely on amplitude spectrum features, they face a difficult phase recovery problem when restoring the source signals.

As a result, in traditional neural-network-based separation methods the two separated source signal estimates interfere with each other, and the separation quality needs to be improved.

Embodiment Construction

[0048] The present invention will be further described below in conjunction with the drawings.

[0049] As shown in Figure 1, the time-domain separation method for single-channel mixed speech based on a convolutional neural network includes the following steps:

[0050] Step 1. Establish a voice data set for training. Randomly select a large amount of voice data from a standard database, such as the TSP voice database, and divide it into two groups: 80% of the voice data is used as training data and the remaining 20% is used as test data.
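The 80/20 split in Step 1 can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_dataset`, the fixed seed, and the file names are hypothetical and do not come from the patent.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle utterance identifiers and split them into train and test groups."""
    shuffled = list(items)
    # A seeded generator keeps the random selection reproducible (an assumption).
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for utterances from a speech database.
utterances = [f"utt_{k:03d}.wav" for k in range(10)]
train_set, test_set = split_dataset(utterances)
```

Splitting at the utterance level, before any framing, keeps frames of the same recording from appearing in both groups.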

[0051] Step 2. Preprocess the voice data. First, use formula (1) to normalize the original voice data to the range [-1,1].

[0052] $$y_i = \frac{s_i}{\max(\mathrm{abs}(s_i))} \tag{1}$$

[0053] where s_i denotes the i-th source signal, max(·) denotes the maximum value, abs(s_i) takes the absolute value of each element of s_i, and y_i denotes the normalized i-th source signal. Formula (2) is then used to split the time-domain speech signal into frames. The frame length is N=1024, a...
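The preprocessing in Step 2 can be sketched as follows, assuming that normalization divides each sample by the peak absolute value (consistent with the stated range [-1,1] and the definitions of max(·) and abs(·)). The frame length N=1024 is from the text; the remaining framing parameters are cut off in the source, so the hop size of 512 below is purely an illustrative assumption, as are the function names.

```python
def normalize(signal):
    """Scale a signal into [-1, 1] by dividing by its peak absolute value."""
    peak = max(abs(x) for x in signal)
    if peak == 0:
        return list(signal)  # an all-zero signal is left unchanged
    return [x / peak for x in signal]


def frame_signal(signal, frame_length=1024, hop=512):
    """Cut a signal into fixed-length frames.

    frame_length=1024 follows the text; hop=512 is an assumed value
    because the source is truncated before stating it.
    """
    frames = []
    for start in range(0, len(signal) - frame_length + 1, hop):
        frames.append(signal[start:start + frame_length])
    return frames


normalized = normalize([2.0, -4.0, 1.0])
frames = frame_signal(list(range(2048)))
```

Trailing samples that do not fill a whole frame are dropped in this sketch; padding them would be an equally valid choice.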

Abstract

The invention relates to a time-domain separation method for single-channel mixed speech based on a convolutional neural network. The method comprises the following steps: 1, constructing a speech data set for training; 2, preprocessing the speech data; 3, obtaining mixed speech data; 4, constructing the neural network structure; 5, training the neural network on the prepared data in a supervised manner; and 6, using the trained neural network to carry out a separation test. In this method, time-domain speech signals serve as both the input and the output of the convolutional neural network, which separates the single-channel mixed speech and thereby yields estimates of the two source signals. The method does not need to deal with the phase recovery problem, and it improves the separation quality of single-channel speech.
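To make the time-domain input and output concrete, the sketch below builds supervised training pairs in which the network input is a frame of the mixture waveform and the target is the matching pair of source frames. The abstract does not say how the mixed speech is obtained, so the sample-wise sum used here is an assumption, and the function names and non-overlapping framing are hypothetical.

```python
def mix_sources(s1, s2):
    """Assumed mixing model: sample-wise sum of two sources, trimmed to equal length."""
    n = min(len(s1), len(s2))
    return [s1[k] + s2[k] for k in range(n)]


def make_training_pairs(s1, s2, frame_length=1024):
    """Pair each mixture frame (input) with the two source frames (target).

    Frames are non-overlapping here for simplicity; this is an assumption.
    """
    mixture = mix_sources(s1, s2)
    pairs = []
    for start in range(0, len(mixture) - frame_length + 1, frame_length):
        end = start + frame_length
        pairs.append((mixture[start:end], (s1[start:end], s2[start:end])))
    return pairs


# Tiny constant signals and a short frame length, for illustration only.
source_1 = [0.1] * 8
source_2 = [0.2] * 8
pairs = make_training_pairs(source_1, source_2, frame_length=4)
```

Because both input and target are raw waveforms, no spectrum is computed and no phase has to be reconstructed afterwards, which is the benefit the abstract claims.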

Description

Technical field

[0001] The invention relates to a time-domain separation method for single-channel mixed speech, and more specifically, to a time-domain separation method for single-channel mixed speech based on a convolutional neural network.

Background art

[0002] Single-channel blind source separation (Monaural Blind Source Separation, MBSS) is an important technology in the field of speech processing. It can obtain estimates of the two source signals when only a single-channel mixed speech signal is available. Single-channel speech separation technology has important application value in speech recognition, speech enhancement, speech identification and other fields.

[0003] Typical single-channel speech separation methods include those based on non-negative matrix factorization (NMF) and those based on neural networks. Since single-channel mixed speech contains little information, it is difficult to achieve satisfactory separation results based on non-negative matrix...

Application Information

IPC(8): G10L21/0272, G10L21/0224, G10L25/30

CPC: G10L21/0224, G10L21/0272, G10L25/30

Inventors: 张鹏, 马晓红

Owner: DALIAN UNIV OF TECH
