Improved voice reinforcement method based on multi-target criterion learning

A speech enhancement technology based on multi-objective learning, applicable to speech analysis, speech recognition, instruments and similar fields. It addresses the poor performance of existing training targets and offers easy implementation, improved SNR and an optimized target function.

Status: Inactive; Publication date: 2019-07-26

TIANJIN UNIV

Cites: 2; Cited by: 13


Problems solved by technology

Training targets based on floating-value masking have been found to outperform those based on binary masking in improving speech quality and intelligibility, while training targets based on the spectral envelope are the least effective.
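The binary and floating-value masks compared above can be illustrated with a minimal sketch. The source does not give the mask formulas, so the definitions below are assumptions: an ideal binary mask with a 0 dB local-SNR threshold, and an ideal ratio mask as one common floating-value mask. The function names and the toy power values are hypothetical.

```python
import math

EPS = 1e-12  # guards against division by zero and log of zero


def ideal_binary_mask(speech_pow, noise_pow, threshold_db=0.0):
    """Assumed definition: 1 where the local SNR exceeds threshold_db, else 0."""
    mask = []
    for s, n in zip(speech_pow, noise_pow):
        snr_db = 10.0 * math.log10((s + EPS) / (n + EPS))
        mask.append(1.0 if snr_db > threshold_db else 0.0)
    return mask


def ideal_ratio_mask(speech_pow, noise_pow, beta=0.5):
    """Assumed definition: (S / (S + N)) ** beta, a value between 0 and 1."""
    return [(s / (s + n + EPS)) ** beta for s, n in zip(speech_pow, noise_pow)]


# Toy per-bin powers for three time-frequency units (illustrative values only).
speech_pow = [4.0, 0.5, 0.25]
noise_pow = [1.0, 1.0, 1.0]

ibm = ideal_binary_mask(speech_pow, noise_pow)
irm = ideal_ratio_mask(speech_pow, noise_pow)
print("binary mask:", ibm)
print("ratio mask: ", [round(m, 3) for m in irm])
```

The binary mask makes a hard keep-or-discard decision per unit, whereas the ratio mask attenuates each unit in proportion to its estimated speech content.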



Examples

specific example

[0047] (1) Signal preprocessing

[0048] 1. Select data:

[0049] Select 600 sentences from the TIMIT standard corpus as clean training speech, sampled at 16 kHz. Select four noise types (Factory, F16, White and Pink) from the Noisex-92 standard noise library as training noise. Mix the clean speech with the noise at SNRs of -5 dB, -2 dB, 0 dB and 2 dB to obtain the training data set.

[0050] Select 120 of the remaining TIMIT sentences as clean test speech, also sampled at 16 kHz. Select Factory noise from the Noisex-92 standard noise library as the test noise, and mix it with the clean speech at SNRs of -5 dB, -2 dB, 0 dB and 2 dB to obtain the test data set.
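Mixing at a prescribed SNR can be sketched as below. This is only an illustration under stated assumptions: the source does not say how the noise gain is computed, so the sketch uses the usual whole-utterance average power ratio. A sine wave and Gaussian noise stand in for the TIMIT and Noisex-92 recordings, and the function names `mix_at_snr` and `measured_snr_db` are hypothetical. Plain Python lists are used to keep the block self-contained.

```python
import math
import random


def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals snr_db, then add.

    Returns (noisy, scaled_noise). Assumes len(noise) >= len(speech).
    """
    noise = noise[:len(speech)]
    target_ratio = 10.0 ** (snr_db / 10.0)
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * target_ratio))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


def measured_snr_db(speech, noise):
    """SNR in dB computed from the two separate components."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Synthetic stand-ins: 0.1 s of a 440 Hz tone at 16 kHz plus Gaussian noise.
random.seed(0)
FS = 16000
speech = [math.sin(2.0 * math.pi * 440.0 * t / FS) for t in range(1600)]
noise = [random.gauss(0.0, 1.0) for _ in range(1600)]

results = {}
for snr_db in (-5, -2, 0, 2):  # the mixing SNRs listed in the text
    noisy, scaled = mix_at_snr(speech, noise, snr_db)
    results[snr_db] = measured_snr_db(speech, scaled)
    print(snr_db, "dB requested,", round(results[snr_db], 6), "dB measured")
```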

[0051] 2. Framing and windowing

[0052] The speech signal is divided into frames with a frame length of 320 points (20 ms at 16 kHz) and a frame shift of 160 points (10 ms), and each frame is multiplied by a Hamming window.
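The framing and windowing parameters above can be sketched as follows. The frame length, frame shift and window type come from the text; dropping trailing samples that do not fill a whole frame is an assumption, as is the symmetric Hamming formula with coefficients 0.54 and 0.46. The function names are hypothetical, and a synthetic tone stands in for a real utterance.

```python
import math

FRAME_LEN = 320    # samples per frame (from the text)
FRAME_SHIFT = 160  # hop between frame starts (from the text)


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(signal, frame_len=FRAME_LEN, frame_shift=FRAME_SHIFT):
    """Split `signal` into overlapping frames and apply a Hamming window.

    Trailing samples that do not fill a whole frame are dropped (assumption).
    """
    if len(signal) < frame_len:
        return []
    window = hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // frame_shift
    frames = []
    for k in range(n_frames):
        start = k * frame_shift
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames


# One second of a synthetic 440 Hz tone at 16 kHz.
signal = [math.sin(2.0 * math.pi * 440.0 * t / 16000.0) for t in range(16000)]
frames = frame_signal(signal)
print(len(frames), "frames of", len(frames[0]), "samples")
```

With a shift of half the frame length, consecutive frames overlap by 50%.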

[0053] (2) Calculate the logarithmic power ...
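The text of step (2) is truncated in this copy, so the following is only a generic sketch of a per-frame log-power spectrum, not the patent's exact formula. It assumes a one-sided DFT of a 320-point frame, the natural logarithm, and a small floor to avoid taking the log of zero. The function name is hypothetical. A direct DFT is used to stay within the standard library; an FFT routine would normally be used instead.

```python
import cmath
import math


def log_power_spectrum(frame, floor=1e-12):
    """Return log(|X[k]|^2 + floor) for bins k = 0 .. N/2 of one frame."""
    n = len(frame)
    lps = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        lps.append(math.log(abs(acc) ** 2 + floor))
    return lps


# Test frame: a cosine that falls exactly on bin 20 (1000 Hz at 16 kHz, N = 320).
# No window is applied here so that the energy stays in a single bin.
N = 320
frame = [math.cos(2.0 * math.pi * 20 * t / N) for t in range(N)]
lps = log_power_spectrum(frame)
peak_bin = max(range(len(lps)), key=lps.__getitem__)
print(len(lps), "bins, peak at bin", peak_bin)
```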

Abstract

The invention relates to an improved speech enhancement method based on multi-objective criterion learning. The method comprises: preprocessing the signal, in which a training data set and a test data set are acquired, framed and windowed, and the window function type, frame length and frame shift are determined; calculating the log-power spectrum of each frame of noisy speech in the framed and windowed training and test data sets; calculating the target function for multi-objective training; training a deep neural network; testing the network, in which the log-power spectrum of each frame of noisy speech in the test data set is used as the input feature of the deep neural network; and evaluating the enhanced speech, using speech intelligibility, subjective speech quality evaluation and speech quality as the evaluation indexes for intelligibility, perceptual effect and speech quality respectively. The method eliminates the adverse influence of the phase information of the noisy speech signal on the intelligibility and quality of the enhanced speech, and is convenient and easy to implement.

Description

Technical field

[0001] The invention relates to a speech enhancement method, and in particular to an improved speech enhancement method based on multi-objective criterion learning.

Background

[0002] Speech enhancement is the technology of extracting the useful speech signal (that is, clean speech) from a noisy background as completely as possible when the speech signal is corrupted or even submerged by noise, while suppressing and reducing the noise interference. In recent years, a variety of speech enhancement methods have been proposed, mainly methods based on signal processing, methods based on statistical models, and methods based on model training. Among the signal-processing methods, spectral subtraction and Wiener filtering are the two most representative algorithms. When the background noise is estimated correctly, this type of method can achieve good speech enhancement performance. However, in the real environment, espec...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L25/60, G10L25/21, G10L25/24, G10L25/30, G10L25/45, G10L21/02, G10L15/06

CPC: G10L15/063, G10L21/0202, G10L25/21, G10L25/24, G10L25/30, G10L25/45, G10L25/60

Inventors: 张涛, 邵洋洋

Owner: TIANJIN UNIV
