Voice recognition method and system based on incremental word graph re-scoring

A speech recognition and word graph technology, applied in speech recognition, speech analysis, instruments, etc. It addresses insufficient recognition speed, an overly large final word graph, and result differences between decoding passes, so as to improve recognition accuracy, achieve self-adaptation, and speed up word graph generation.

Pending Publication Date: 2020-11-10
XI AN JIAOTONG UNIV
Cites: 0. Cited by: 7.

AI Technical Summary

Problems solved by technology

[0004] In existing methods, ensuring the recognition accuracy of one-pass decoding requires a large beam search width, which makes the final word graph too large, and recognition is still not fast enough.
One alternative replaces one-pass decoding at a large beam search width with two-pass decoding. Although its decoding speed is about 2-3 times higher at lower WERs, the beam value may be too small, so the results of the two passes can differ considerably and the final result is affected.
Methods that use GPU parallel computing are relatively expensive, and whether such decoders can be widely used in industrial scenarios remains an open question.
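
The trade-off around beam width can be illustrated with a minimal Python sketch of beam pruning. It assumes a common form of pruning in which a hypothesis survives only if its cost is within `beam` of the best cost; the function name `beam_prune`, the state names, and the costs are hypothetical and do not come from the patent.

```python
def beam_prune(tokens, beam):
    """Keep hypotheses whose cost is within `beam` of the best (lowest) cost."""
    best = min(tokens.values())
    return {state: cost for state, cost in tokens.items() if cost <= best + beam}


# Hypothetical active hypotheses at one frame: state name -> accumulated cost.
tokens = {"s1": 10.0, "s2": 11.5, "s3": 14.0, "s4": 19.0}

narrow = beam_prune(tokens, beam=2.0)  # fewer survivors: faster, riskier
wide = beam_prune(tokens, beam=8.0)    # more survivors: larger word graph
```

With the wider beam more hypotheses survive each frame, which is why the resulting word graph grows and decoding slows down.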

Embodiment Construction

[0058] First, the methods and terms involved in the present invention will be described.

[0059] 1) Finite-state acceptor (FSA): a weighted finite-state transducer (WFST) consists of a set of states and directed transitions between states. Each transition stores three kinds of information, namely an input label, an output label, and a weight, recorded in the format "input_label:output_label/weight". The decoding network mentioned in the present invention is a WFST. An FSA can be seen as a simplification of an FST in which each transition has only an input label.
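
As a rough illustration of these definitions, the Python sketch below stores transitions and prints them in the "input_label:output_label/weight" format, then drops the output labels to obtain an acceptor-style view. The names `Arc` and `to_acceptor` and the toy labels are assumptions made for illustration; keeping the weight on the acceptor transitions is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Arc:
    """One transition between two states (hypothetical layout)."""
    src: int
    dst: int
    ilabel: str
    olabel: str
    weight: float

    def __str__(self) -> str:
        # Format used in the text: input_label:output_label/weight
        return f"{self.ilabel}:{self.olabel}/{self.weight}"


def to_acceptor(arcs: List[Arc]) -> List[Tuple[int, int, str, float]]:
    """Drop output labels so each transition carries only an input label."""
    return [(a.src, a.dst, a.ilabel, a.weight) for a in arcs]


# Toy two-transition transducer: state 0 -> 1 -> 2.
arcs = [Arc(0, 1, "a", "x", 0.5), Arc(1, 2, "b", "y", 1.5)]
formatted = [str(a) for a in arcs]
acceptor = to_acceptor(arcs)
```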

[0060] 2) State-level word graph: a directed acyclic graph with input labels, output labels, and weight values on transition edges. The input label is the alignment information, and the output label is the word result.
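
A state-level word graph of this kind can be sketched as a small edge list, and because the graph is acyclic its lowest-cost path can be found in a single pass over the states. The sketch below is illustrative only: the edge tuple layout, the toy labels and weights, and the assumption that state ids are already in topological order are not taken from the patent.

```python
from collections import defaultdict

# Toy state-level word graph: (src, dst, input_label, output_label, weight).
# Input labels stand in for alignment ids; output labels are words ("" = no word).
edges = [
    (0, 1, "t1", "hello", 1.0),
    (0, 1, "t2", "hollow", 1.4),
    (1, 2, "t3", "", 0.2),
    (2, 3, "t4", "world", 0.7),
    (2, 3, "t5", "word", 0.5),
]


def best_path(edges, start, final):
    """Lowest-cost path in a DAG whose state ids are topologically ordered."""
    out = defaultdict(list)
    for e in edges:
        out[e[0]].append(e)
    cost = {start: 0.0}
    back = {}
    for s in sorted({e[0] for e in edges}):
        if s not in cost:
            continue  # state not reachable from start
        for _, dst, _ilabel, olabel, w in out[s]:
            c = cost[s] + w
            if dst not in cost or c < cost[dst]:
                cost[dst] = c
                back[dst] = (s, olabel)
    # Trace back from the final state, collecting non-empty output labels.
    words = []
    state = final
    while state != start:
        state, olabel = back[state]
        if olabel:
            words.append(olabel)
    return cost[final], words[::-1]


total, words = best_path(edges, 0, 3)
```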

[0061] 3) Word-level word graph: also called compressed word graph, which is obtained by determinizing the state-level word graph. The difference from the state-level word graph is that its alignment in...

Abstract

The invention discloses a voice recognition method and system based on incremental word graph re-scoring. The method comprises the following steps: a to-be-recognized voice signal is obtained and acoustic features are extracted; the likelihood probability corresponding to the acoustic features is calculated using a trained acoustic model; a decoder constructs the corresponding decoding network, obtains a state-level word graph from the decoding network, and obtains a word-level word graph by updating and determinizing the word graph; the state-level word graphs of the remaining decoding network are determinized and combined with the word-level word graphs already obtained to generate the first-pass decoded word graph; a target word graph is obtained through a finite-state transducer merging algorithm from the first-pass decoded word graph and a re-scoring language model trained on a small corpus; and the optimal-cost path of the target word graph is obtained, the corresponding word sequence is extracted, and that word sequence is taken as the final recognition result. The invention reduces the determinization computation that a common decoder performs after decoding finishes, which speeds up decoding; it also reduces the word error rate of speech recognition in a specific scene, improving accuracy.
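
The re-scoring idea in the last steps can be illustrated with a deliberately simplified Python sketch. It does not implement the finite-state transducer merging described above: it enumerates every path of a tiny word graph (feasible only at toy scale) and adds a language-model cost from a hypothetical bigram table standing in for the small-corpus model. All names, words, costs, and probabilities are assumptions for illustration, and the "first pass" here ranks paths by acoustic cost alone.

```python
import math

# Toy word graph: state -> list of (next_state, word, acoustic_cost).
lattice = {
    0: [(1, "recognize", 2.0), (1, "wreck a nice", 1.8)],
    1: [(2, "speech", 1.0), (2, "beach", 0.9)],
    2: [],
}
FINAL = 2

# Hypothetical bigram probabilities from a small in-domain corpus.
bigram = {
    ("<s>", "recognize"): 0.6,
    ("<s>", "wreck a nice"): 0.1,
    ("recognize", "speech"): 0.8,
    ("recognize", "beach"): 0.05,
    ("wreck a nice", "speech"): 0.1,
    ("wreck a nice", "beach"): 0.5,
}


def paths(state, prefix=()):
    """Yield every path from `state` to FINAL as a tuple of (word, cost)."""
    if state == FINAL:
        yield prefix
    for nxt, word, ac in lattice[state]:
        yield from paths(nxt, prefix + ((word, ac),))


def rescored_cost(path, lm_scale=1.0):
    """Acoustic cost plus scaled negative log bigram probability."""
    cost, prev = 0.0, "<s>"
    for word, ac in path:
        cost += ac + lm_scale * -math.log(bigram[(prev, word)])
        prev = word
    return cost


first_pass = min(paths(0), key=lambda p: sum(ac for _, ac in p))
best = min(paths(0), key=rescored_cost)
first_pass_words = [w for w, _ in first_pass]
best_words = [w for w, _ in best]
```

In this toy case the acoustically cheapest path is replaced after re-scoring by the path the domain model prefers, which is the effect the method aims for in a specific scene.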

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to a speech recognition method and system based on incremental word graph re-scoring.

Background technique

[0002] In recent years, with the rapid development of the artificial intelligence industry, speech recognition technology has received more and more attention from academia and industry. As a front-end technology in the field of speech interaction, speech recognition plays a vital role. It is widely used in many human-computer interaction systems, such as intelligent customer service systems, chat robots, personal intelligent assistants, and smart homes.

[0003] At present, the traditional speech recognition technology is mainly built based on the HMM-DNN framework. The advantage of such modeling is that a speech recognition system with good accuracy can be obtained through relatively small data training. The decoder is an extremely important compone...

Application Information

IPC(8): G10L15/02, G10L15/183, G10L15/26
CPC: G10L15/02, G10L15/183
Inventor: 范建存, 马一航
Owner: XI AN JIAOTONG UNIV