
Language input association detection method based on attention model

An attention-model-based detection method applied to the input/output process of data processing, speech analysis, and speech recognition. It addresses the defects that existing models cannot explicitly give the correlation between words and that adding hidden nodes increases the model's parameter count, achieving the effect of improved language-model performance.

Active Publication Date: 2017-11-17
AISPEECH CO LTD
Cites: 5 · Cited by: 24

AI Technical Summary

Problems solved by technology

[0004] In view of the defects of the prior art, namely that it cannot explicitly provide the correlation between words, that adding hidden nodes linearly increases the parameter count of the entire model, and that it cannot fully exploit all historical information, the present invention proposes a language input association detection method based on the attention model. An additional control unit is introduced into the model, the historical sequence and additional information are fed in explicitly, and an attention-based method automatically extracts the correlation between the predicted word and that history.
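As a rough illustration of this idea, the following minimal NumPy sketch scores each historical hidden state against the state at the position being predicted and softmax-normalizes the scores; the resulting weights can be read as explicit word-to-history correlations. The dot-product scoring and all names here are illustrative assumptions, not the patent's exact formulation.

```python
import numpy as np

def attend(history_states, query_state):
    """Attention over an explicit history: score each past hidden state
    against the current query state, softmax-normalize, and blend the
    history into one context vector. (Dot-product scoring is an assumed
    choice; the patent text does not fix the scoring function here.)"""
    scores = history_states @ query_state            # (T,) one score per past word
    scores = scores - scores.max()                   # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over the history
    context = weights @ history_states               # weighted sum, shape (H,)
    return context, weights

# Toy usage: 5 past positions, hidden size 8.
history = np.random.randn(5, 8)
query = np.random.randn(8)
context, weights = attend(history, query)
print(weights)  # per-word correlation with the word being predicted
```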




Detailed Description of the Embodiments

[0041] This embodiment includes the following steps:

[0042] Step 101: collect the training corpus required for training the language model and preprocess it. First, consider the needs of the application and collect corpus data for the corresponding field; here, a telephone conversational-speech corpus is collected. Convert each word in the corpus into its numerical index in the vocabulary, and replace any word that does not appear in the vocabulary with an unknown-word token, returning that token's index. At the same time, 10% of the data is set aside as a validation set to prevent the model from overfitting.
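A minimal sketch of this preprocessing, assuming a conventional unknown-word token (the source elides the exact marker, so `<unk>` is an assumption) and a random 90/10 train/validation split; all names are illustrative:

```python
import random

UNK = "<unk>"  # assumed unknown-word marker; the original text elides the exact token

def build_vocab(sentences):
    """Assign each word a numerical index, reserving index 0 for the unknown token."""
    vocab = {UNK: 0}
    for sentence in sentences:
        for word in sentence:
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Map words to their indices; out-of-vocabulary words fall back to <unk>."""
    return [vocab.get(word, vocab[UNK]) for word in sentence]

def split_train_valid(data, valid_ratio=0.1, seed=0):
    """Hold out 10% of the data as a validation set to watch for overfitting."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * valid_ratio)
    return shuffled[cut:], shuffled[:cut]  # (train, validation)
```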

[0043] Step 102: process the corresponding data and generate the corresponding labels. For example, if the word sequence is w_1, w_2, ..., w_{n-1}, w_n, then the training sequence is w_1, w_2, ..., w_{n-1} and the corresponding label sequence is w_2, ..., w_{n-1}, w_n, where the training and labeling sequences are in...
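This shift-by-one labeling can be written directly; a minimal sketch (names illustrative):

```python
def make_training_pair(word_ids):
    """For a sequence w_1..w_n, the training input is w_1..w_{n-1} and the
    label sequence is the same words shifted left by one, w_2..w_n."""
    return word_ids[:-1], word_ids[1:]

inputs, labels = make_training_pair([3, 17, 42, 8, 5])
# inputs -> [3, 17, 42, 8]; labels -> [17, 42, 8, 5]
```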



Abstract

The invention discloses a language input association detection method based on an attention model. The method includes the following steps: acquiring the training corpus used for training a language model and preprocessing it; labeling each word-sequence item in the corpus; training the recurrent neural network in the language model with the labeled training sequences, and then training the updated language model on all data sets in the training corpus; when the predicted-word probability distribution converges on the validation set, the training of the language model is complete. Finally, the trained language model scores input sentences to obtain the relationships between the words. The method automatically extracts, via attention, the correlations between predicted words and their history. It can also introduce grammatical and semantic information while training the word vectors, so that the word vectors carry more information, which further improves the performance of the language model.
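A minimal sketch of the final scoring step, assuming the trained model is wrapped as a hypothetical callable `lm_prob(history_ids, next_id)` that returns P(next | history); the sentence score is then the sum of next-word log-probabilities:

```python
import math

def score_sentence(lm_prob, word_ids):
    """Sum the log-probability the trained language model assigns to each
    word given its preceding history. `lm_prob` is a hypothetical wrapper
    around the trained model, not an API defined by the patent."""
    total = 0.0
    for t in range(1, len(word_ids)):
        p = lm_prob(word_ids[:t], word_ids[t])
        total += math.log(max(p, 1e-12))  # guard against zero probabilities
    return total
```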

Description

Technical Field

[0001] The invention relates to a technology in the field of voice input, and in particular to a language input association detection method based on an attention model.

Background Technique

[0002] In recent years, research on recurrent neural networks has become increasingly popular. The long short-term memory (LSTM) network and the gated recurrent unit (GRU) network, both built on gated memory units, are widely used in natural language processing. The LSTM network adds memory cells together with input, output, and forget gates, while the GRU network adds reset and update gates. These gates and memory cells greatly improve the modeling of long-distance dependencies between words.

[0003] However, such models still have limitations. In gate-based neural networks, historical information is encoded in the hidden layer; when the hidden layer needs to contain more information, the number o...
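For reference, the gating described in paragraph [0002] can be made concrete; a minimal NumPy sketch of one GRU step, with hypothetical weight matrices and biases omitted for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: the update gate z decides how much of the old state to
    keep, and the reset gate r decides how much history feeds the candidate."""
    z = sigmoid(Wz @ x + Uz @ h_prev)               # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))   # candidate state
    return (1 - z) * h_prev + z * h_tilde           # new hidden state
```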


Application Information

IPC (8): G10L15/06, G10L15/16, G10L15/18, G06F3/023
CPC: G06F3/0237, G10L15/063, G10L15/16, G10L15/18
Inventors: 俞凯, 曹迪
Owner: AISPEECH CO LTD