System and method for text classification

A text classification system and method in the field of text technology, applicable to text database clustering/classification, unstructured text data retrieval, and special data processing applications. It addresses the problems of deep networks with many layers, long iteration times, and insufficient extraction of effective features, with the aim of improving classification accuracy.

Active Publication Date: 2017-03-29
GUILIN UNIV OF ELECTRONIC TECH
3 Cites, 21 Cited by

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a text classification system and method. The technical problem to be solved is that existing deep networks have many layers, many parameters, and overly long iteration times, while shallow networks cannot fully extract effective features.

Examples

Embodiment 1

[0049] As shown in Figure 1 and Figure 3, a text classification system includes an initialization module 1, a first extraction module 2, a second extraction module 3, a comprehensive representation module 4 and a classification module 5.

[0050] The initialization module 1 is used to read the text, vectorize the sentences in the text, and generate a two-dimensional matrix vector;
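The vectorization step can be pictured with a minimal sketch. The source does not say how sentences are vectorized, so everything here is an assumption for illustration: the hash-based `toy_embedding` stands in for a learned word embedding, and `EMBED_DIM` and `max_len` are hypothetical sizes.

```python
import hashlib

EMBED_DIM = 4  # assumed embedding width; the source does not give one


def toy_embedding(word, dim=EMBED_DIM):
    """Deterministic stand-in for a learned word embedding (hypothetical)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    # Map the first `dim` bytes (0..255) to floats in [-1, 1].
    return [b / 127.5 - 1.0 for b in digest[:dim]]


def sentence_to_matrix(sentence, max_len=6, dim=EMBED_DIM):
    """Return a max_len x dim matrix: one row per token, zero-padded."""
    tokens = sentence.lower().split()[:max_len]
    rows = [toy_embedding(token, dim) for token in tokens]
    # Pad short sentences with zero rows so every matrix has the same shape.
    rows += [[0.0] * dim for _ in range(max_len - len(rows))]
    return rows


matrix = sentence_to_matrix("the service was great")
```

Each row of `matrix` corresponds to one token, so a sentence becomes a fixed-size two-dimensional array that a convolution can slide over.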

[0051] The first extraction module 2 is configured to perform convolution and pooling processing on the two-dimensional matrix vector to generate multiple first matrix vectors;
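As a rough sketch of convolution followed by pooling, the code below applies several kernels to one input matrix and max-pools each result, giving one feature map per kernel. The kernel values, the 2x2 pooling window, and the choice of max pooling are assumptions; the source does not specify them.

```python
def conv2d_valid(matrix, kernel):
    """Slide the kernel over the matrix without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    return [
        [
            sum(matrix[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(matrix, size=2):
    """Non-overlapping max pooling with a size x size window."""
    out_h, out_w = len(matrix) // size, len(matrix[0]) // size
    return [
        [
            max(matrix[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5x5 input standing in for the two-dimensional matrix vector.
x = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 0],
    [2, 1, 0, 1, 1],
    [0, 2, 1, 0, 1],
]
# Two hypothetical 2x2 kernels; each one yields its own feature map.
kernels = [
    [[1, 0], [0, 1]],
    [[1, -1], [1, -1]],
]
first_maps = [max_pool(conv2d_valid(x, k)) for k in kernels]
```

Using more than one kernel is what produces multiple first matrix vectors from a single input.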

[0052] The second extraction module 3 is configured to perform dot multiplication of a plurality of first matrix vectors with attention matrices, correspondingly generating a plurality of second matrix vectors;
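One possible reading of "dot multiplication" here is an element-wise (Hadamard) product between each feature map and an attention matrix of the same shape. That reading is an assumption, and so are the attention weights below; the source does not say how the attention matrices are obtained.

```python
def hadamard(a, b):
    """Element-wise product of two equally shaped matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical first matrix vectors (feature maps after convolution and pooling).
first_maps = [
    [[4.0, 3.0], [2.0, 1.0]],
    [[0.0, 1.0], [1.0, 2.0]],
]
# Hypothetical attention matrices, one per feature map.
attention = [
    [[0.5, 0.25], [0.25, 0.0]],
    [[1.0, 0.0], [0.5, 0.5]],
]
second_maps = [hadamard(f, a) for f, a in zip(first_maps, attention)]
```

Under this reading, positions with larger attention weights keep more of their feature value, and positions weighted zero are suppressed.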

[0053] The comprehensive representation module 4 is used to perform a convolution operation on each second matrix vector, so that each second matrix vector is correspondingly converted into a one-dimensional vector ...
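One way a convolution can turn a matrix into a one-dimensional vector is to use a kernel as wide as the matrix, so that it can only slide vertically. The sketch below assumes that layout; the kernel shape and values are hypothetical and not taken from the source.

```python
def full_width_conv(matrix, kernel):
    """Convolve with a kernel as wide as the matrix; returns a 1-D list."""
    kh, kw = len(kernel), len(kernel[0])
    if kw != len(matrix[0]):
        raise ValueError("kernel width must equal matrix width")
    # The kernel can only move down the rows, so each position gives one scalar.
    return [
        sum(matrix[i + a][b] * kernel[a][b]
            for a in range(kh) for b in range(kw))
        for i in range(len(matrix) - kh + 1)
    ]


# Hypothetical second matrix vector (3 rows, 2 columns).
second_map = [[2.0, 0.75], [0.5, 0.0], [1.0, 1.0]]
# Hypothetical 2x2 kernel spanning the full width of the map.
kernel = [[1.0, 1.0], [0.5, 0.5]]
vector = full_width_conv(second_map, kernel)
```

Here a 3x2 map and a 2x2 kernel give a vector of length 2, one entry per vertical kernel position.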

Embodiment 2

[0076] As shown in Figure 3, a text classification method comprises the following steps:

[0077] Step S1. The initialization module 1 reads the text, vectorizes the sentences in the text, and generates a two-dimensional matrix vector;

[0078] Step S2. The first extraction module 2 performs convolution and pooling processing on the two-dimensional matrix vector to generate a plurality of first matrix vectors; Step S3. The second extraction module 3 performs dot multiplication of each of the plurality of first matrix vectors with the attention matrix, correspondingly generating a plurality of second matrix vectors;

[0079] Step S4. The comprehensive representation module 4 performs a convolution operation on each second matrix vector, so that each second matrix vector is correspondingly converted into a one-dimensional vector matrix;

[0080] Step S5. The classification module 5 inputs multiple one-dimensional vector matrices into the Fully Contact Lay...

Abstract

The invention relates to a system and method for text classification. In the method, an initialization module reads a text, vectorizes the sentences in the text, and generates a two-dimensional matrix vector; a first extraction module performs convolution and pooling on the two-dimensional matrix vector to generate multiple first matrix vectors; a second extraction module performs dot multiplication of each first matrix vector with an attention matrix to generate multiple second matrix vectors; a comprehensive representation module performs a convolution on each second matrix vector, so that each second matrix vector is converted into a one-dimensional vector matrix; and a classification module inputs each of the one-dimensional vector matrices into a fully connected layer, passes the output values to a softmax classifier, and the softmax classifier converts the matrix values into probability distributions over the corresponding classes, completing the text classification. The invention uses only a few parameters, lets the network model converge rapidly, and extracts deep representation information from the text, thereby improving classification accuracy.
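The final classification step can be illustrated with a fully connected layer followed by softmax. This is a minimal sketch for a single feature vector; the weights, bias, and three-class setup are hypothetical, and the source does not state how the outputs for multiple vectors are combined.

```python
import math


def fully_connected(vector, weights, bias):
    """Compute one score per class: weights @ vector + bias."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(scores):
    """Convert scores to probabilities; subtract the max for stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical one-dimensional feature vector and layer parameters.
features = [3.0, 1.5]
weights = [[0.5, -1.0], [0.0, 1.0], [-0.5, 0.5]]  # 3 classes x 2 features
bias = [0.0, 0.0, 0.0]

scores = fully_connected(features, weights, bias)
probs = softmax(scores)
# The predicted class is the index with the highest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
```

The softmax output sums to 1, so each entry can be read as the estimated probability of the corresponding class.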

Description

Technical field

[0001] The invention relates to a text classification system and method.

Background technique

[0002] With the widespread use of the Internet and mobile terminals, users can easily express their emotions, opinions and comments on the Internet and mobile platforms, producing massive text information resources. Text classification has therefore become increasingly important and a hot research focus.

[0003] In recent years, CNN (Convolutional Neural Network) and the attention mechanism have been applied increasingly in the field of natural language processing, with fruitful results. In the existing technology, deep networks have many layers, many parameters, and overly long iteration times, while shallow networks cannot fully extract effective features. This method uses a shallow CNN network combined with an attention mechanism, which can effectively e...

Application Information

IPC(8): G06F17/30
CPC: G06F16/35
Inventor: 蔡晓东, 赵勤鲁
Owner: GUILIN UNIV OF ELECTRONIC TECH