Speaker recognition method based on convolution neural network and spectrogram

A speaker recognition technology based on a convolutional neural network. It addresses the difficulty of training a model when the available speech is short, and it offers low hardware cost and resource use, easy implementation, and simple, fast calculation.

Inactive Publication Date: 2017-07-14

BEIJING UNIV OF TECH

6 Cites, 52 Cited by


AI Technical Summary

Problems solved by technology

Because the training speech available in practice is short, it is difficult to train a separate GMM (Gaussian mixture model) for each speaker.



Embodiment Construction

[0023] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0024] The speaker audio data set contains recordings of 24 speakers, each reading the digits 0-9. The following operations are performed on this data set.

[0025] S1: Spectrogram generation:

[0026] Step 1: Read the sound signal to obtain the sampling frequency and the left and right channels.

[0027] Step 2: Store the data in an array and calculate its length.

[0028] Step 3: Apply windowing to the framed data, with an overlap ratio of 50%, and save the result.

[0029] Step 4: Perform a Fourier transform on the framed data.

[0030] Step 5: Display the spectrogram from the resulting array.
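The S1 steps can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the patent: the function name `spectrogram`, the frame length of 256 samples, the Hamming window, the averaging of the left and right channels, and the decibel scaling are all choices made for the example. Only the 50% overlap comes from the text. A synthetic tone stands in for a recorded utterance.

```python
import numpy as np

def spectrogram(signal, frame_len=256, overlap=0.5):
    """Return a log-magnitude spectrogram with shape (n_bins, n_frames).

    frame_len and the Hamming window are assumptions for illustration;
    only the 50% overlap is taken from the description.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim == 2:
        # Stereo input of shape (n_samples, 2): average the two channels
        # (an assumption; the description does not say how they are combined).
        signal = signal.mean(axis=1)
    hop = int(frame_len * (1.0 - overlap))  # 50% overlap gives hop = frame_len / 2
    n_frames = 1 + (len(signal) - frame_len) // hop
    if n_frames < 1:
        raise ValueError("signal is shorter than one frame")
    window = np.hamming(frame_len)
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Fourier transform of each frame, keeping the non-negative frequencies.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels so that quiet components stay visible in an image.
    return 20.0 * np.log10(magnitude + 1e-10).T

# Usage: a synthetic 440 Hz tone sampled at 8 kHz stands in for a recording.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
spec = spectrogram(tone)
stereo_spec = spectrogram(np.stack([tone, tone], axis=1))
```

The returned array can be passed to any image display or image writing routine to obtain the picture that is later used as network input.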

[0031] S2: Deep learning stage:

[0032] Step 1: Convert the voice signal of each audio file into a spectrogram in code.

[0033] Step 2: After obtaining these spectrograms, run GenerateTrainAnd...


Abstract

The invention discloses a speaker recognition method based on a convolutional neural network and a spectrogram. The method comprises the following steps: first, collecting an audio signal from each speaker; second, converting the audio signals into spectrograms; third, taking the spectrogram images as the input layer and training the neural network using AlexNet; fourth, adjusting the weights and biases of all layers of the neural network layer by layer through the backpropagation algorithm; and finally, obtaining the parameters of the neural network and classifying the speakers. By processing the spectrograms with a convolutional neural network, the method achieves fast speaker recognition.
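The layer-by-layer adjustment of weights and biases by backpropagation can be illustrated with a small NumPy sketch. It does not reproduce the patent's AlexNet training: it assumes a tiny fully connected network with one ReLU hidden layer and a softmax output, and it uses synthetic feature vectors in place of spectrogram images. The class count, layer sizes, learning rate, and step count are all hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption): 3 "speakers", 20 samples each,
# 16 features per sample instead of real spectrogram images.
n_classes, n_per_class, n_features, n_hidden = 3, 20, 16, 8
centers = rng.normal(size=(n_classes, n_features)) * 3.0
X = np.concatenate([c + rng.normal(size=(n_per_class, n_features))
                    for c in centers])
X = X / X.std()  # scale the inputs so plain gradient descent is stable
y = np.repeat(np.arange(n_classes), n_per_class)

# Parameters of the two layers: weights and biases.
W1 = rng.normal(size=(n_features, n_hidden)) * 0.1
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_classes)) * 0.1
b2 = np.zeros(n_classes)

def forward(inputs):
    """Forward pass: hidden ReLU activations and softmax class probabilities."""
    h = np.maximum(inputs @ W1 + b1, 0.0)
    logits = h @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return h, probs

def cross_entropy(probs):
    """Mean negative log-likelihood of the true labels."""
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))

lr = 0.1
losses = []
for step in range(500):
    h, probs = forward(X)
    losses.append(cross_entropy(probs))
    # Backward pass: start at the output layer and move toward the input.
    d_logits = probs.copy()
    d_logits[np.arange(len(y)), y] -= 1.0
    d_logits /= len(y)
    dW2 = h.T @ d_logits
    db2 = d_logits.sum(axis=0)
    d_h = d_logits @ W2.T
    d_h[h <= 0.0] = 0.0  # ReLU passes gradient only where it was active
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0)
    # Gradient-descent update of the weights and biases of each layer.
    W2 -= lr * dW2
    b2 -= lr * db2
    W1 -= lr * dW1
    b1 -= lr * db1

# After training, classify by taking the most probable class per sample.
_, probs = forward(X)
accuracy = float(np.mean(probs.argmax(axis=1) == y))
```

In a convolutional network such as AlexNet the same principle applies, with the gradient passed backward through convolution, pooling, and fully connected layers in turn; a deep learning framework would normally compute those gradients automatically.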

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and relates to a speaker recognition method based on a convolutional neural network.

Background technique

[0002] With the development of information technology, high technology has been integrated into our lives in digital form, which brings a great deal of convenience and also promotes the development of digital life. Identification technology has also undergone tremendous changes, from traditional password verification to newer technologies such as digital certificates and biometric authentication. Biometric technology in particular uses the inherent physiological or behavioral characteristics of the human body as the basis for verifying an individual's identity. It overcomes the shortcomings of traditional authentication methods, whose credentials are easily lost, forgotten, or counterfeited, and it has attracted extensive attention from researchers at home and abroad...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L17/18, G10L17/04

Inventor: 李玉鑑, 穆红章

Owner: BEIJING UNIV OF TECH
