Speech recognition system and compression method for the feature vector set used in the speech recognition system

A speech recognition and feature vector sequence technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the slow response of speech recognition systems, their large amount of calculation, and their inability to meet the needs of actual use, with the effects of reducing the amount of decoding operations, reducing storage capacity, and improving recognition speed.

Inactive Publication Date: 2005-02-23
INST OF ACOUSTICS CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

When a continuous density hidden Markov model (CDHMM) is used in a large-vocabulary speech recognition system, the decoding operation unit must calculate Gaussian probabilities many times during decoding. The computation required by decoding is usually concentrated in these Gaussian probability calculations, which makes the overall amount of calculation very large.
When large-vocabulary speech recognition is performed on resource-limited embedded hardware platforms such as mobile phones, the speech recognition system responds very slowly and cannot meet the needs of actual use.
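
To make the cost concrete, the following is a minimal sketch of a per-frame Gaussian mixture log-likelihood, assuming diagonal covariances (the source does not state the covariance structure). The function names and the helper `gaussian_evaluations` are hypothetical and serve only to illustrate why this calculation dominates decoding.

```python
import math

def diag_gaussian_log_prob(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance (assumed)."""
    total = 0.0
    for xi, mi, vi in zip(x, mean, var):
        total += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total

def gmm_log_prob(x, weights, means, variances):
    """Log-likelihood of one frame under a Gaussian mixture, via log-sum-exp."""
    comps = [math.log(w) + diag_gaussian_log_prob(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(comps)
    return peak + math.log(sum(math.exp(c - peak) for c in comps))

def gaussian_evaluations(num_frames, active_states, mixtures_per_state):
    """Rough count of Gaussian evaluations needed to decode one utterance."""
    return num_frames * active_states * mixtures_per_state

# One frame, one state with a single standard-normal component in 2 dimensions.
frame_score = gmm_log_prob([0.0, 0.0], [1.0], [[0.0, 0.0]], [[1.0, 1.0]])
```

Each Gaussian evaluation costs a few arithmetic operations per feature dimension, and the count from `gaussian_evaluations` grows with the product of frames, active states, and mixture components. The specific numbers passed to it are up to the reader; none are given in the source.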

Examples

Embodiment Construction

[0053] The compression method for the feature vector set used in the speech recognition system is described first.

[0054] There are many speech features, such as LPC coefficients, cepstral coefficients, filter bank coefficients, and Mel-frequency cepstral coefficients (MFCC). A commonly used feature parameter is MFCC. The choice of feature does not matter here: the invention can be applied to any feature parameter. For ease of understanding, MFCC coefficients are used below as an example to illustrate the compression method for the feature vector set used in the speech recognition system of the present invention.

[0055] Assume that for each frame of speech, L MFCC parameters, L first-order difference MFCC parameters, and L second-order difference MFCC parameters are combined into a vector of dimension X = 3*L as the feature parameters, forming an X-dimensional speech feature set. Correspondingly, the dimension of the Gaussian normal distribution in...
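
The construction of the X = 3*L vector in [0055] can be sketched as follows. This is an illustration only: the difference formula (a symmetric two-frame difference with edge frames repeated) and the function names are assumptions, since the source does not specify how the difference parameters are computed.

```python
def first_difference(frames):
    """Symmetric difference (c[t+1] - c[t-1]) / 2, repeating edge frames.

    The exact difference formula is an assumption, not taken from the source.
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out

def stack_features(mfcc):
    """Concatenate L static, L first-order and L second-order parameters per frame."""
    delta = first_difference(mfcc)
    delta2 = first_difference(delta)
    return [c + d + dd for c, d, dd in zip(mfcc, delta, delta2)]

# Three frames with L = 2 static coefficients each, so X = 3 * L = 6.
mfcc = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
features = stack_features(mfcc)
```

Every output frame has 3*L entries: the static coefficients first, then the first-order differences, then the second-order differences.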

Abstract

When a clustering algorithm is applied to the speech feature vector set to obtain a codebook, a step that dynamically merges and splits subclasses is added, based on the number of vectors in the set and the total distance measure of the vectors in the set. This reduces the sum of the distance measures between the vectors in the set and their corresponding codewords after clustering, which improves the accuracy of the clustering algorithm. The method preserves recognition performance while reducing the storage required by the system. The invention also discloses a speech recognition system that uses the feature codebook in place of the acoustic model. Gaussian probabilities do not need to be calculated during decoding; the probability values are obtained by looking up a probability table stored in advance. The amount of decoding computation is therefore greatly reduced and the recognition speed is increased.
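
The table lookup described in the abstract might look roughly like the sketch below. It assumes a single codebook over the full feature vector, Euclidean distance for quantization, and a table indexed as `log_prob_table[state][codeword]`. All of these, along with the function names and the toy values, are illustrative assumptions rather than details from the patent.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_codeword(x, codebook):
    """Index of the codeword closest to x (distance measure is an assumption)."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(x, codebook[k]))

def frame_log_probs(x, codebook, log_prob_table):
    """Quantize the frame once, then read one stored value per state.

    No Gaussian is evaluated here; the table is assumed to be computed offline.
    """
    k = nearest_codeword(x, codebook)
    return [row[k] for row in log_prob_table]

# Toy data: 3 codewords in 2 dimensions, 2 states (values are made up).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
log_prob_table = [[-1.0, -2.0, -9.0],
                  [-8.0, -1.5, -2.5]]
scores = frame_log_probs([0.9, 1.2], codebook, log_prob_table)
```

With this layout, the per-frame work is one nearest-codeword search plus one table read per state, and the stored model is the codebook and the table rather than the Gaussian parameters.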

Description

Technical field

[0001] The invention relates to a speech recognition system and to a compression method for the feature vector set used in the speech recognition system.

Background technique

[0002] Almost all current speech recognition systems use methods based on statistical pattern recognition. Every speech recognition system must convert the time-domain sound wave of the speech input into digitized feature vectors that describe and distinguish different pronunciations; these are called speech features. A sound model for all pronunciations is built from these features, and in the field of speech recognition it is usually called the acoustic model. All speech recognition systems must have an acoustic model; large-vocabulary continuous speech recognition systems also require a language model. The purpose of speech recognition is, given a sequence of sound features as input, to use the acoustic model and the language model, and use search...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/02, G10L15/08, G10L19/038
Inventor: 潘接林, 韩疆, 刘建, 颜永红, 庹凌云, 张建平
Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI