Speaker recognition method based on Gaussian mixture model embedded with time delay neural network

A Gaussian mixture model and speaker recognition technology, applied in the field of speaker recognition

Status: Inactive; Publication Date: 2011-04-27

戴红霞 +2

Cites: 0; Cited by: 34


Problems solved by technology

At present, however, GMM and TDNN are each used only on their own for speaker recognition, and no existing method combines the respective advantages of the two to improve recognition performance.


Embodiment Construction

[0069] The technical solutions of the present invention will be further described below in conjunction with the drawings and embodiments.

[0070] Figure 1 shows the training and recognition model for speaker recognition with an embedded TDNN. It differs from the baseline GMM model (in which only a GMM is used for speaker recognition) in both training and recognition.

[0071] 1. Preprocessing and feature extraction

[0072] First, silence is detected with a method based on energy and zero-crossing rate, and noise is removed by spectral subtraction. The signal is then pre-emphasized with the filter $f(z) = 1 - 0.97z^{-1}$ and divided into frames using a Hamming window with a length of 20 ms and a window shift of 10 ms. A 20th-order linear prediction (LPC) analysis is performed on each frame, and 13th-order cepstral coefficients are then derived from the 20th-order LPC coefficients to serve as the feature vectors for speaker recognition.
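The pre-emphasis, framing, and LPC-to-cepstrum steps described in [0072] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the 8 kHz sample rate, the autocorrelation (Levinson-Durbin) method for LPC, and all function names are assumptions, and silence detection and spectral subtraction are omitted.

```python
import numpy as np

def pre_emphasize(x, coeff=0.97):
    # y[n] = x[n] - 0.97 * x[n-1], i.e. the filter 1 - 0.97 z^-1
    return np.append(x[0], x[1:] - coeff * x[:-1])

def frame_signal(x, frame_len, shift):
    # Slice into overlapping frames and apply a Hamming window to each
    n_frames = 1 + (len(x) - frame_len) // shift
    idx = np.arange(frame_len)[None, :] + shift * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def lpc(frame, order):
    # Autocorrelation method with the Levinson-Durbin recursion (assumed here).
    # Returns a with a[0] = 1 so that A(z) = 1 + sum_k a[k] z^-k.
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12  # small guard against all-zero frames
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a

def lpc_to_cepstrum(a, n_ceps):
    # Cepstrum of the all-pole model 1/A(z); requires n_ceps <= LPC order:
    # c[n] = -a[n] - (1/n) * sum_{k=1}^{n-1} k * c[k] * a[n-k]
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(1, n))
        c[n] = -a[n] - acc / n
    return c[1:]

def extract_features(signal, sample_rate=8000, lpc_order=20, n_ceps=13):
    # sample_rate is an assumption; the source does not state it
    x = pre_emphasize(np.asarray(signal, dtype=float))
    frame_len = int(round(0.020 * sample_rate))  # 20 ms window
    shift = int(round(0.010 * sample_rate))      # 10 ms shift
    frames = frame_signal(x, frame_len, shift)
    return np.array([lpc_to_cepstrum(lpc(f, lpc_order), n_ceps) for f in frames])
```

Calling `extract_features(samples)` on a one-dimensional array of speech samples yields one 13-dimensional cepstral vector per 10 ms frame shift.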

[0073] 2. Speaker model training

[0074] During training, the process o...


Abstract

The invention discloses a speaker recognition method based on a Gaussian mixture model (GMM) embedded with a time delay neural network (TDNN). The method combines the advantages of the TDNN and the GMM by embedding the TDNN into the GMM. The time delay network transforms the input feature vectors, making full use of their temporal ordering, and the residual between the TDNN input and output vectors is computed. This residual is used to update the GMM through the expectation maximization method. In addition, a likelihood probability is obtained from the updated GMM parameters and the residual, and the TDNN parameters are updated by a back-propagation method with an inertia term, so that the GMM and TDNN parameters are updated alternately. Experiments show that the recognition rate of the method is improved to a certain extent over that of a baseline GMM under various signal-to-noise ratios.
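As a rough illustration of the GMM half of this alternating scheme, the sketch below computes a residual, evaluates its log-likelihood under a diagonal-covariance GMM, and performs one expectation maximization update. Everything here is an assumption made for illustration: the residual is taken as input minus output, `tdnn_forward` is a hypothetical stand-in for the network, the covariances are assumed diagonal, and the TDNN back-propagation step is not shown.

```python
import numpy as np

def residual(features, tdnn_forward):
    # Assumed definition: TDNN input minus TDNN output, frame by frame.
    # tdnn_forward is a hypothetical callable mapping (T, D) -> (T, D).
    return features - tdnn_forward(features)

def log_gaussians(x, means, variances):
    # Log density of each frame under each diagonal-covariance component.
    # x: (T, D); means, variances: (M, D); result: (T, M)
    diff = x[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2.0 * np.pi * variances), axis=1))

def gmm_log_likelihood(x, weights, means, variances):
    # Returns the total log-likelihood and the log responsibilities (T, M)
    log_joint = np.log(weights) + log_gaussians(x, means, variances)
    peak = log_joint.max(axis=1, keepdims=True)  # for numerical stability
    per_frame = peak[:, 0] + np.log(np.exp(log_joint - peak).sum(axis=1))
    return per_frame.sum(), log_joint - per_frame[:, None]

def em_step(x, weights, means, variances, var_floor=1e-3):
    # One EM update of the GMM parameters on the residual vectors x
    _, log_resp = gmm_log_likelihood(x, weights, means, variances)
    resp = np.exp(log_resp)
    counts = resp.sum(axis=0)
    new_weights = counts / len(x)
    new_means = resp.T @ x / counts[:, None]
    new_vars = resp.T @ (x ** 2) / counts[:, None] - new_means ** 2
    return new_weights, new_means, np.maximum(new_vars, var_floor)
```

In an alternating scheme of the kind the abstract describes, a step like `em_step` would be interleaved with a gradient update of the TDNN parameters that increases the same likelihood.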

Description

Technical field

[0001] The invention relates to a speaker recognition method, in particular to a speaker recognition method based on a Gaussian mixture model embedded with a time-delay neural network.

Background

[0002] Automatic speaker recognition, especially text-independent speaker recognition, plays an increasingly important role in access control, credit card transactions, and court evidence. Its goal is to correctly identify which of the many reference speakers in the speech library the speech to be recognized belongs to.

[0003] Among speaker recognition methods, those based on the Gaussian mixture model (GMM) have received more and more attention. Because of its high recognition rate, simple training, and modest training data requirements, the GMM has become the mainstream recognition method at present. Since the Gaussian mixture model (GMM) has a good ability to represent the distribution of data, as long as there are enough items and enough training...


Application Information


IPC(8): G10L15/00, G10L15/06, G10L15/28, G10L25/24

Inventors: 戴红霞, 王吉林, 余华, 魏昕, 赵力

Owner: 戴红霞
