Convolutional neural network matching text recognition method based on attention enhancement mechanism

A convolutional neural network text recognition technology, applied in the fields of artificial intelligence and natural language processing. It addresses the problems that existing attention mechanisms ignore the convolution operation, underperform, and give insufficient recognition and matching accuracy, and it achieves the effect of increasing interaction and improving performance.

Pending Publication Date: 2019-10-01
TONGJI UNIV
Cites: 7; Cited by: 44

AI Technical Summary

Problems solved by technology

However, the attention mechanisms in common use today ignore the convolution operation. As a result, the attention mechanism does not play its due role in the convolutional neural network, which leads to insufficient accuracy when handling complex texts in paper plagiarism checks, search engines, and intelligent customer service systems.



Examples


Specific Embodiment

[0060] First, the text data is preprocessed by word segmentation and stop word removal. A language model is used to train a word vector for each word from the segmented text; the word vector dimension, denoted d, can be 100 or 300. Each input sentence is scaled to a fixed length n by padding or truncating, where n is either the average or the maximum sentence length in the training set.
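The preprocessing in [0060] can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `preprocess` and `sentence_to_matrix`, the toy embedding table, and the choice of a zero vector for padding and for unknown words are all hypothetical and are not specified in the source.

```python
from typing import Dict, List, Set


def preprocess(tokens: List[str], stop_words: Set[str]) -> List[str]:
    """Drop stop words from an already word-segmented sentence."""
    return [tok for tok in tokens if tok not in stop_words]


def sentence_to_matrix(
    tokens: List[str],
    embeddings: Dict[str, List[float]],
    n: int,
    d: int,
) -> List[List[float]]:
    """Map a sentence to an n x d matrix by truncating or zero-padding.

    Assumption: padding positions and out-of-vocabulary words both use
    an all-zero vector of dimension d.
    """
    zero = [0.0] * d
    # Truncate to at most n tokens, then look up each word vector.
    rows = [list(embeddings.get(tok, zero)) for tok in tokens[:n]]
    # Pad with zero vectors until the sentence has exactly n rows.
    rows.extend([0.0] * d for _ in range(n - len(rows)))
    return rows


if __name__ == "__main__":
    # Toy 2-dimensional embeddings; real word vectors would have d = 100 or 300.
    toy_embeddings = {"cat": [1.0, 2.0], "sat": [3.0, 4.0]}
    tokens = preprocess(["the", "cat", "sat"], stop_words={"the"})
    print(sentence_to_matrix(tokens, toy_embeddings, n=4, d=2))
```

Choosing n as the maximum sentence length avoids losing words to truncation, while the average length keeps the matrices smaller at the cost of cutting long sentences.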

[0061] A window for a convolution kernel of size k is defined at each position i of sentence X: it consists of the k adjacent word vectors centered on the i-th word. The traditional convolutional neural network sentence matching model extracts window features as follows:

[0062]

[0063] This yields contextual information. Convolution kernel sizes of 2, 3, 4 and 5 are selected. Then, through the max pooling operation, the most important features ar...
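The traditional window feature extraction and max pooling described in [0061] to [0063] can be sketched as below. This is a simplified illustration, not the patent's formula, which is not reproduced in this text. It assumes a tanh nonlinearity and untrained random weights, and it slides windows only over positions where all k words exist rather than centering a padded window on every word. The names `windows`, `conv_max_pool` and `encode` are hypothetical.

```python
import math
import random
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def windows(X: Matrix, k: int) -> List[List[float]]:
    """Return every run of k adjacent word vectors, flattened to length k*d."""
    return [
        [value for row in X[i:i + k] for value in row]
        for i in range(len(X) - k + 1)
    ]


def conv_max_pool(X: Matrix, W: Matrix, b: List[float], k: int) -> List[float]:
    """Apply each filter (a row of W, length k*d) to all windows, then max-pool.

    Assumption: tanh is used as the activation function.
    """
    features = []
    for weights, bias in zip(W, b):
        activations = [
            math.tanh(sum(w * x for w, x in zip(weights, win)) + bias)
            for win in windows(X, k)
        ]
        # Max pooling over positions keeps the strongest response per filter.
        features.append(max(activations))
    return features


def encode(X: Matrix, params: Dict[int, Tuple[Matrix, List[float]]]) -> List[float]:
    """Concatenate pooled features from all kernel sizes into one sentence vector."""
    sentence_vector: List[float] = []
    for k, (W, b) in params.items():
        sentence_vector.extend(conv_max_pool(X, W, b, k))
    return sentence_vector


if __name__ == "__main__":
    random.seed(0)
    n, d, num_filters = 6, 3, 2  # toy sizes for illustration only
    X = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    params = {
        k: (
            [[random.uniform(-0.5, 0.5) for _ in range(k * d)] for _ in range(num_filters)],
            [0.0] * num_filters,
        )
        for k in (2, 3, 4, 5)
    }
    print(len(encode(X, params)))  # 4 kernel sizes * 2 filters = 8
```

Combining kernel sizes 2 through 5 gives features at several granularities, from word pairs up to five-word phrases, in a single fixed-length vector.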


Abstract

The invention relates to a convolutional neural network matching text recognition method based on an attention enhancement mechanism. The method comprises the following steps: 1, preprocessing the input text and pre-training on a text corpus to obtain initial word vectors; 2, converting the sentences in the input text into matrices composed of the initial word vectors; 3, encoding each matrix through a convolutional neural network with an attention enhancement mechanism to generate a low-dimensional sentence vector; and 4, computing the correlation between the low-dimensional sentence vectors of each pair of sentences, and identifying the overall text according to the correlation result. Compared with the prior art, the method avoids the defect that the two sentences are modeled completely independently: on top of the local context information captured by the convolutional neural network, relevant attention information from the other sentence is added, so that the two sentences interact as early as possible, and multi-granularity information is obtained by combining convolution kernels of different sizes.
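Step 4 of the abstract compares the low-dimensional sentence vectors of two sentences. The text here does not say which correlation measure is used, so the sketch below assumes cosine similarity purely for illustration; the function name `cosine_similarity` and the example vectors are hypothetical.

```python
import math
from typing import List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two sentence vectors of equal length.

    Assumption: cosine similarity stands in for the unspecified
    correlation measure. Returns 0.0 if either vector is all zeros.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    sentence_a = [0.2, 0.7, 0.1]  # made-up sentence vectors
    sentence_b = [0.4, 1.4, 0.2]  # same direction as sentence_a
    print(cosine_similarity(sentence_a, sentence_b))
```

A score near 1 would suggest the two sentences are semantically close, and a score near 0 that they are unrelated; any decision threshold would have to be tuned on labeled data.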

Description

Technical field

[0001] The invention relates to the technical fields of artificial intelligence and natural language processing, in particular to a convolutional neural network matching text recognition method based on an attention enhancement mechanism.

Background technique

[0002] With the advent of the big data era, massive amounts of data are generated every day, and a large amount of irrelevant data is hidden within them. Viewing these data one by one manually is obviously impossible. How to quickly filter junk information out of these data and quickly find the content users need has become an increasingly urgent problem. At present, deep learning techniques are widely used in natural language processing tasks. Sentence matching is a basic task in natural language processing: it computes the semantic relationship between two sentences. Plagiarism detection of ...


Application Information

IPC(8): G06F17/27, G06F16/33, G06N3/04
CPC: G06F16/3344, G06F40/289, G06F40/30, G06N3/045, Y02D10/00
Inventor: 向阳, 徐诗瑶, 单光旭, 杨力, 刘芮辰
Owner TONGJI UNIV