Speech recognition method

A speech recognition method and feature parameter technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the heavy-tailing phenomenon in feature extraction and the inability of feature optimization to retain the inherent nonlinear structure and characteristics of high-dimensional data, and achieves an excellent dimensionality reduction effect.

Status: Inactive. Publication Date: 2017-01-04

SUZHOU UNIV


Problems solved by technology

[0005] Technical problem to be solved: Existing speech recognition methods suffer from a heavy-tailing phenomenon in feature extraction, and their feature optimization cannot retain the inherent nonlinear structure and characteristics of the original high-dimensional data. To address these shortcomings, the present invention provides a speech recognition method. The feature parameter GCWT proposed by this method is better than the traditional feature parameter MFCC, and the improved dynamic weighted locally linear embedding method DWLLE has a better dimensionality reduction effect than the traditional LLE method.

Examples

Embodiment 1

[0036] In the absence of feature optimization, speech recognition includes two steps: feature extraction and recognition using a classifier.

[0037] 1. Feature extraction:

[0038] The existing feature parameter MFCC and the feature parameter GCWT of the present invention are respectively extracted for speech.

[0039] 1. Feature parameter MFCC extraction steps:

[0040] (1) After the signal $S(n)$ is pre-emphasized, a Hamming window is used for windowing and framing to obtain the signal $x_n(m)$ of each frame. Its spectrum $X_n(k)$ is then obtained by the short-time Fourier transform, and squaring the spectrum gives the energy spectrum $P_n(k)$.

[0041] $P_n(k) = |X_n(k)|^2$
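The framing and energy-spectrum computation of step (1) can be sketched in Python with NumPy. This is a minimal illustration rather than the patent's implementation: the frame length, hop size, FFT size and the pre-emphasis coefficient 0.97 are assumed values that the source does not specify, and the function name frame_power_spectrum is hypothetical.

```python
import numpy as np

def frame_power_spectrum(signal, frame_len=256, hop=128, alpha=0.97, n_fft=256):
    """Return P_n(k) = |X_n(k)|^2, one row per frame n, one column per bin k.

    frame_len, hop, alpha and n_fft are assumed defaults, not values from the source.
    """
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis: s'(n) = s(n) - alpha * s(n - 1)
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Zero-pad very short signals so that at least one full frame exists
    if len(emphasized) < frame_len:
        emphasized = np.pad(emphasized, (0, frame_len - len(emphasized)))
    # Framing: row n holds the samples x_n(m), m = 0 .. frame_len - 1
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)  # Hamming windowing
    # Short-time Fourier transform X_n(k), then the energy spectrum
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.abs(spectrum) ** 2

# Usage: a 1 kHz tone sampled at 8 kHz (illustrative values)
fs = 8000
t = np.arange(fs) / fs
P = frame_power_spectrum(np.sin(2 * np.pi * 1000 * t))
```

Each row of P is the energy spectrum of one frame, which is the input to the Mel filtering in step (2).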

[0042] (2) Filter $P_n(k)$ with M Mel bandpass filters. Because the effects of the components within each frequency band are superimposed in the human ear, the energy within each filter band is summed.

[0043] S n ( ...
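The Mel filtering and per-band summation of step (2) might look like the following sketch. The source formula is truncated, so the details here are assumptions: triangular filter shapes, the common mapping mel = 2595 log10(1 + f/700), and the hypothetical names mel_filterbank and band_energies. It illustrates the general technique only.

```python
import numpy as np

def hz_to_mel(f):
    # Common Hz-to-mel mapping (an assumption; the source does not give one)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular Mel filters H[m, k] with shape (n_filters, n_fft // 2 + 1)."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_edges) / fs).astype(int)
    H = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            H[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # peak and falling edge
            H[m - 1, k] = (right - k) / (right - center)
    return H

def band_energies(P, H):
    """Sum the energy spectrum inside each filter band: one value per frame and filter."""
    return P @ H.T

# Usage with M = 24 filters, a 256-point FFT and fs = 8000 Hz (illustrative values)
H = mel_filterbank(24, 256, 8000)
S = band_energies(np.ones((3, 129)), H)
```

Writing the summation as a matrix product keeps the per-frame loop out of Python; each row of the result holds the M band energies of one frame.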

Embodiment 2

[0058] A speech recognition method includes three steps of feature extraction, feature optimization and recognition using a classifier.

[0059] 1. Feature extraction: the steps are the same as those for extracting the feature parameter GCWT in Embodiment 1.

[0060] 2. Feature optimization:

[0061] Dimensionality reduction with the nonlinear dimensionality reduction method LLE includes the following three steps:

[0062]-[0063] (1) For a given source data set $X=\{x_1,x_2,\ldots,x_n\}$, $x_i \in \mathbb{R}^D$, use the Euclidean distance to find the $k$ ($k<n$) neighbor points of each sample point;

[0064] (2) Calculate the local reconstruction weight matrix of the sample point from the neighboring points of the sample point to minimize the reconstruction error;

[0065] (3) Calculate the low-dimensional embedding of the sample set according to the local reconstruction weight matrix and its neighbor points.
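The three LLE steps above can be sketched with NumPy as follows. This is the standard textbook formulation of LLE, written for illustration and not taken from the patent; the regularization term reg and the function name lle are assumptions added to keep the local linear systems solvable when k exceeds the data dimension.

```python
import numpy as np

def lle(X, k, d, reg=1e-3):
    """Standard LLE: returns the (n, d) embedding and the (n, n) weight matrix W."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Step 1: k nearest neighbours of every sample by Euclidean distance (k < n)
    dist2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(dist2, np.inf)          # a point is not its own neighbour
    neighbors = np.argsort(dist2, axis=1)[:, :k]

    # Step 2: weights minimising ||x_i - sum_j w_ij x_j||^2 with sum_j w_ij = 1
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]           # neighbours centred on x_i
        G = Z @ Z.T                          # local Gram matrix, shape (k, k)
        trace = np.trace(G)
        G = G + (reg * trace if trace > 0 else reg) * np.eye(k)  # assumed regulariser
        w = np.linalg.solve(G, np.ones(k))
        W[i, neighbors[i]] = w / w.sum()

    # Step 3: embedding from the bottom eigenvectors of M = (I - W)^T (I - W),
    # skipping the first eigenvector (near-zero eigenvalue)
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, eigvecs = np.linalg.eigh(M)           # eigenvalues in ascending order
    return eigvecs[:, 1:d + 1], W

# Usage: 80 points on a 3-D curve reduced to 2 dimensions (illustrative data)
rng = np.random.default_rng(0)
t = rng.uniform(0, 3, size=80)
Y, W = lle(np.column_stack([np.cos(t), np.sin(t), t]), k=7, d=2)
```

The dense eigendecomposition is fine for a few hundred samples; larger sets would normally use a sparse solver.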

[0066] LLE uses the Euclidean distance to find the neighborhood under the sample unif...

Embodiment 3

[0070] A speech recognition method includes three steps of feature extraction, feature optimization and recognition using a classifier.

[0071] 1. Feature extraction: the steps are the same as those for extracting the feature parameter GCWT in Embodiment 1.

[0072] 2. Feature optimization:

[0073] The dimensionality of GCWT is reduced using the nonlinear dimensionality reduction method DWLLE. During dimensionality reduction, the parameters are set as follows: k = 7, σ = 1 7, θ = 0.8. The main process includes:

[0074] (1) Use Euclidean distance to find k (k<n) neighbor points of each sample point;

[0075] (2) Calculate the radial basis kernel function between the sample point and the neighbor point:

[0076] $u_{ij} = k(x_i, x_j)$ ...
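Steps (1) and (2) can be illustrated with a short NumPy sketch. The kernel formula is truncated in the source, so the Gaussian form exp(-||x_i - x_j||^2 / (2σ^2)) used here is an assumption (other scalings of σ are equally common), and the function name knn_rbf_weights is hypothetical. How DWLLE then combines $u_{ij}$ with θ is not shown in the source and is not sketched.

```python
import numpy as np

def knn_rbf_weights(X, k, sigma):
    """Find k nearest neighbours of each sample and the RBF kernel value to each.

    Returns (neighbors, u), both of shape (n, k); u[i, j] is the kernel between
    x_i and its j-th nearest neighbour. The 2 * sigma**2 scaling is an assumption.
    """
    X = np.asarray(X, dtype=float)
    # Squared Euclidean distance between every pair of samples
    dist2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(dist2, np.inf)                      # exclude the point itself
    neighbors = np.argsort(dist2, axis=1)[:, :k]         # step (1): k < n neighbours
    nd2 = np.take_along_axis(dist2, neighbors, axis=1)   # distances to those neighbours
    u = np.exp(-nd2 / (2.0 * sigma ** 2))                # step (2): RBF kernel values
    return neighbors, u

# Usage on three 2-D points with k = 1 and sigma = 1 (illustrative values)
neighbors, u = knn_rbf_weights([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]], k=1, sigma=1.0)
```

Because neighbours are sorted by distance, the kernel values in each row of u are non-increasing, so closer neighbours always receive larger values.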

Abstract

The invention provides a speech recognition method comprising the steps of feature extraction, feature optimization and recognition using a classifier. In the feature extraction step, time-frequency analysis is performed on the speech using a multi-scale continuous wavelet transform, and Gaussian mixture modeling is applied to the wavelet coefficients along the scale axis to obtain the feature parameter GCWT, which is then used to recognize the speech. In the feature optimization step, the dimensionality of the feature parameter GCWT is reduced using a dynamic weighted locally linear embedding (DWLLE) method. The feature parameter GCWT provided by the invention is better than the traditional feature parameter MFCC, and the dimensionality reduction effect of the DWLLE method is better than that of LLE (Locally Linear Embedding).
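The multi-scale continuous wavelet transform mentioned in the abstract can be illustrated with a small NumPy sketch. The source does not name the mother wavelet or the scales, so the Ricker (Mexican hat) wavelet, the scale values and the function names below are assumptions chosen for simplicity. The Gaussian mixture modeling along the scale axis that yields GCWT is not detailed in the source and is not sketched here.

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican hat) wavelet of width a sampled at `points` positions (assumed wavelet)."""
    t = np.arange(points) - (points - 1) / 2.0
    amp = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return amp * (1.0 - (t / a) ** 2) * np.exp(-(t ** 2) / (2.0 * a ** 2))

def cwt_ricker(signal, scales):
    """Return wavelet coefficients with shape (len(scales), len(signal)).

    Row i is the signal convolved with the wavelet dilated to scales[i],
    so the first axis is the scale axis and the second is time.
    """
    signal = np.asarray(signal, dtype=float)
    coeffs = np.empty((len(scales), len(signal)))
    for i, a in enumerate(scales):
        n_pts = int(min(10 * a, len(signal)))   # truncate the wavelet support
        coeffs[i] = np.convolve(signal, ricker(n_pts, a), mode="same")
    return coeffs

# Usage: transform a unit impulse at three scales (illustrative values)
sig = np.zeros(200)
sig[100] = 1.0
C = cwt_ricker(sig, [2, 4, 8])
```

Each column of C holds the coefficients of one time instant across all scales, which is the direction along which the abstract says the Gaussian mixture modeling is applied.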

Description

Technical field

[0001] The invention belongs to the technical field of voice recognition, in particular to a voice recognition method.

Background technique

[0002] The speech recognition process mainly includes feature extraction, feature optimization and recognition using classifiers. In terms of feature extraction, the performance of the speech recognition system is closely related to the feature parameters used by the recognizer. The commonly used feature parameters are mainly line spectrum pair LSP, relative spectrum (RASTA), linear predictive cepstral coefficient LPCC, Mel cepstrum MFCC, energy, Fourier cepstrum and the corresponding dynamic feature parameters, etc.

[0003] Wavelet analysis can automatically adjust the time resolution and frequency resolution according to the speed of signal changes. A small number of coefficients in the wavelet coefficients contain most of the energy of the signal, and most of the coefficients are near zero, which contribute little...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/02

Inventor: 常静雅, 陶智, 张晓俊, 赵鹤鸣, 顾济华, 吴迪

Owner: SUZHOU UNIV
