Speech recognition method

A speech recognition method and feature parameter technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the heavy tailing phenomenon in feature extraction and the inability of feature optimization to retain the inherent nonlinear structure and characteristics of high-dimensional data, and it achieves an improved dimensionality reduction effect.

Inactive Publication Date: 2017-01-04
SUZHOU UNIV

AI Technical Summary

Problems solved by technology

[0005] Technical problem to be solved: Existing speech recognition methods suffer from a heavy tailing phenomenon in feature extraction, and their feature optimization cannot retain the inherent nonlinear structure and characteristics of the original high-dimensional data. To address these shortcomings, the present invention provides a speech recognition method. The feature parameter GCWT proposed by this method performs better than the traditional feature parameter MFCC, and the improved dynamic weighted locally linear embedding method DWLLE achieves a better dimensionality reduction effect than the traditional LLE method.


Examples

Embodiment 1

[0036] In the absence of feature optimization, speech recognition includes two steps: feature extraction and recognition using a classifier.

[0037] 1. Feature extraction:

[0038] The existing feature parameter MFCC and the feature parameter GCWT of the present invention are each extracted from the speech.

[0039] 1. Feature parameter MFCC extraction steps:

[0040] (1) After the signal S(n) is pre-emphasized, a Hamming window is used for windowing and framing to obtain the signal x_n(m) of each frame. Its spectrum X_n(k) is then obtained by the short-time Fourier transform, and the squared magnitude of the spectrum gives the energy spectrum P_n(k).

[0041] P_n(k) = |X_n(k)|^2
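
Step (1) can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation: the function name `frame_power_spectrum`, the frame length, the hop size, the FFT size and the pre-emphasis coefficient `alpha` are all assumed values that the source does not specify.

```python
import numpy as np

def frame_power_spectrum(signal, frame_len=256, hop=128, alpha=0.97, n_fft=256):
    """Return P_n(k) = |X_n(k)|^2 for each frame n (all defaults are assumptions)."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis: s'(n) = s(n) - alpha * s(n - 1); the first sample is kept as is.
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Split into overlapping frames and apply a Hamming window to each one.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([
        emphasized[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Short-time Fourier transform of each frame, then the squared magnitude.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.abs(spectrum) ** 2

# Illustrative input: a sinusoid whose frequency falls exactly on FFT bin 32.
tone = np.sin(2 * np.pi * 32 * np.arange(1024) / 256)
P = frame_power_spectrum(tone)
```

Each row of `P` is the energy spectrum of one frame, and for the test tone above the energy concentrates around a single frequency bin.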

[0042] (2) Filter P_n(k) with M Mel bandpass filters. Because the effects of the components within each frequency band are superimposed in the human ear, the energy within each filter band is summed.

[0043] S n ( ...
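
The formula for the band energies is truncated in the source, so the sketch below only illustrates the common textbook construction: triangular filters spaced evenly on the Mel scale, with the energy in each band obtained by a weighted sum over frequency bins. The triangular shape, the Hz-to-Mel conversion constants, and the names `mel_filterbank` and `mel_band_energies` are assumptions, not details confirmed by the patent.

```python
import numpy as np

def hz_to_mel(f):
    # A widely used Hz-to-Mel conversion (assumed; the source gives none).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build n_filters triangular filters over the n_fft // 2 + 1 rFFT bins."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):       # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge, peak of 1 at center
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def mel_band_energies(power_spectrum, fbank):
    """Sum the filtered energy in each band: one row per frame, one column per filter."""
    return power_spectrum @ fbank.T

# Illustrative settings: M = 20 filters, 256-point FFT, 8 kHz sampling rate.
fbank = mel_filterbank(20, 256, 8000)
E = mel_band_energies(np.ones((3, 129)), fbank)
```

With a flat energy spectrum as input, each band energy simply equals the sum of that filter's weights, which makes the superposition within a band easy to see.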

Embodiment 2

[0058] A speech recognition method includes three steps of feature extraction, feature optimization and recognition using a classifier.

[0059] 1. Feature extraction: the same as the steps for extracting the feature parameter GCWT in Embodiment 1.

[0060] 2. Feature optimization:

[0061] Using the nonlinear dimensionality reduction method LLE for dimensionality reduction processing includes the following three steps:

[0062] (1) For a given source data set X = {x_1, x_2, ..., x_n}, x_i ∈ R^D, use the Euclidean distance to find the k (k<n) neighbor points of each sample point;

[0064] (2) Calculate the local reconstruction weight matrix of the sample point from the neighboring points of the sample point to minimize the reconstruction error;

[0065] (3) Calculate the low-dimensional embedding of the sample set according to the local reconstruction weight matrix and its neighbor points.
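
The three steps above correspond to the standard LLE algorithm, which can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `lle`, the regularization term `reg` (added so the local Gram matrix is invertible), and the random test data are not from the source.

```python
import numpy as np

def lle(X, k, d, reg=1e-3):
    """Standard LLE: X is (n, D); returns the (n, d) embedding and the weights W."""
    n = X.shape[0]
    # Step 1: k nearest neighbors of every sample by Euclidean distance.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)          # a point is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    # Step 2: weights that minimize the local reconstruction error,
    # constrained so that each row of W sums to 1.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]          # neighbors centered on x_i
        G = Z @ Z.T                         # local Gram matrix, shape (k, k)
        G += reg * (np.trace(G) + 1e-12) * np.eye(k)   # assumed regularizer
        w = np.linalg.solve(G, np.ones(k))
        W[i, neighbors[i]] = w / w.sum()
    # Step 3: the embedding is given by the bottom eigenvectors of
    # M = (I - W)^T (I - W), skipping the constant eigenvector.
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    _, eigvecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    return eigvecs[:, 1:d + 1], W

# Illustrative data: 50 random samples in 5 dimensions, reduced to 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
Y, W = lle(X, k=7, d=2)
```

The pairwise distance matrix keeps the sketch short but costs O(n^2) memory, so a larger data set would call for a dedicated neighbor search.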

[0066] LLE uses the Euclidean distance to find the neighborhood under the sample unif...

Embodiment 3

[0070] A speech recognition method includes three steps of feature extraction, feature optimization and recognition using a classifier.

[0071] 1. Feature extraction: the same as the steps for extracting the feature parameter GCWT in Embodiment 1.

[0072] 2. Feature optimization:

[0073] The dimensionality of GCWT is reduced using the nonlinear dimensionality reduction method DWLLE, with the parameters set as follows: k = 7, σ = 1 7, θ = 0.8. The main process includes:

[0074] (1) Use Euclidean distance to find k (k<n) neighbor points of each sample point;

[0075] (2) Calculate the radial basis kernel function between the sample point and the neighbor point:

[0076] u_ij = k(x_i, x_j) ...
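
Steps (1) and (2) can be illustrated as follows. Because the kernel formula is truncated in the source, this sketch assumes the common Gaussian form exp(-||x_i - x_j||^2 / (2σ^2)); the patent's exact expression may differ. The function name `rbf_neighbor_kernel`, the test data and the value `sigma=1.0` are hypothetical, and the later DWLLE steps that use u_ij and θ are not shown because the source does not give them.

```python
import numpy as np

def rbf_neighbor_kernel(X, k, sigma):
    """Return the k nearest neighbors of each sample and the RBF values u_ij."""
    # Step (1): k nearest neighbors by Euclidean distance, nearest first.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)          # exclude the sample itself
    neighbors = np.argsort(dist, axis=1)[:, :k]
    # Step (2): assumed Gaussian radial basis kernel between x_i and each neighbor.
    rows = np.arange(X.shape[0])[:, None]
    neighbor_dist = dist[rows, neighbors]
    U = np.exp(-neighbor_dist ** 2 / (2.0 * sigma ** 2))
    return neighbors, U

# Illustrative data; k = 7 follows the embodiment, sigma = 1.0 is a placeholder.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
neighbors, U = rbf_neighbor_kernel(X, k=7, sigma=1.0)
```

Under this assumed kernel, closer neighbors receive values nearer to 1 and more distant neighbors receive values nearer to 0.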

Abstract

The invention provides a speech recognition method comprising the steps of feature extraction, feature optimization, and recognition using a classifier. In the feature extraction step, time-frequency analysis is performed on the speech using a multi-scale continuous wavelet transform, and Gaussian mixture modeling is applied to the wavelet coefficients along the scale axis to obtain the feature parameter GCWT, which is then used to recognize the speech. In the feature optimization step, the dimensionality of the feature parameter GCWT is reduced using the dynamic weighted locally linear embedding (DWLLE) method. The feature parameter GCWT provided by the invention performs better than the traditional feature parameter MFCC, and the dimensionality reduction effect of DWLLE is better than that of locally linear embedding (LLE).
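
The multi-scale continuous wavelet transform mentioned in the abstract can be illustrated with a short NumPy sketch. The source does not name the mother wavelet or the scales, so the real-valued Morlet wavelet, the centre frequency `w0`, the scale set and the function name `morlet_cwt` are all assumptions. The Gaussian mixture modeling along the scale axis is not sketched because the source gives no details of it.

```python
import numpy as np

def morlet_cwt(signal, scales, w0=5.0):
    """Continuous wavelet transform by direct convolution (assumed Morlet wavelet).

    Returns an array of shape (len(scales), len(signal)); each row holds the
    wavelet coefficients at one scale, with scales measured in samples.
    """
    signal = np.asarray(signal, dtype=float)
    coeffs = np.empty((len(scales), len(signal)))
    for i, s in enumerate(scales):
        # Sample the dilated wavelet on [-4s, 4s], where its envelope is non-negligible.
        t = np.arange(-4 * s, 4 * s + 1)
        wavelet = np.exp(-0.5 * (t / s) ** 2) * np.cos(w0 * t / s) / np.sqrt(s)
        # The wavelet is symmetric, so convolution equals correlation here.
        coeffs[i] = np.convolve(signal, wavelet, mode="same")
    return coeffs

# Illustrative input: a sinusoid with a period of 10 samples.
n = np.arange(512)
x = np.sin(2 * np.pi * n / 10)
C = morlet_cwt(x, scales=[2, 4, 8, 16])
energy = (C ** 2).mean(axis=1)   # mean coefficient energy at each scale
```

For this input the coefficient energy peaks at the scale whose dilated wavelet oscillates at roughly the same rate as the signal, which is the time-frequency localization that the feature extraction step relies on.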

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to a speech recognition method.

Background

[0002] The speech recognition process mainly includes feature extraction, feature optimization, and recognition using classifiers. In terms of feature extraction, the performance of a speech recognition system is closely related to the feature parameters used by the recognizer. Commonly used feature parameters include line spectrum pairs (LSP), relative spectra (RASTA), linear predictive cepstral coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), energy, the Fourier cepstrum, and the corresponding dynamic feature parameters.

[0003] Wavelet analysis can automatically adjust the time resolution and frequency resolution according to how quickly the signal changes. A small number of the wavelet coefficients contain most of the energy of the signal, and most of the coefficients are near zero, which contribute little...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/02
Inventor: 常静雅, 陶智, 张晓俊, 赵鹤鸣, 顾济华, 吴迪
Owner: SUZHOU UNIV