
Text classification method and system

A text classification technology, applied in text database clustering/classification, unstructured text data retrieval, special data processing applications, etc. It addresses the decline in text classification accuracy that arises because convolutional neural networks cannot consider inter-sentence dependencies and recurrent neural networks cannot consider text semantic features.

Pending Publication Date: 2020-09-11
CENT SOUTH UNIV
0 Cites, 3 Cited by

AI Technical Summary

Problems solved by technology

[0005] A single deep learning model has limitations for text feature extraction: a convolutional neural network cannot consider inter-sentence dependencies, and a recurrent neural network cannot consider text semantic features. These limitations lead to a decrease in the accuracy of text classification.



Examples


Embodiment Construction

[0022] The present invention implements a text classification system based on the text classification method of the invention. The system trains a classification model on a corpus collected in advance, and then uses the model to predict the category of a text.

[0023] For a better understanding of the invention, its design scheme and steps are described below in conjunction with the accompanying drawings.

[0024] As shown in Figure 1, a text classification method implemented by the present invention comprises:

[0025] 101. Perform a preprocessing operation on the training samples;

[0026] In this example, when converting text to feature vectors, special symbols and stop words must first be removed to reduce redundant information in the text, which would otherwise interfere with the text classification results and waste storage resource...
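The removal of special symbols and stop words described in step 101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `STOP_WORDS` set, the `preprocess` function name, and the whitespace tokenization are all assumptions made for the example.

```python
import re

# Hypothetical stop-word list for illustration only; a real system would
# load a full list for the target language.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def preprocess(text: str) -> list[str]:
    """Lowercase the text, drop special symbols, and remove stop words."""
    # Replace every character that is not a word character (letter, digit,
    # underscore) or whitespace with a space.
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    # Tokenize on whitespace (an assumption; other languages may need a
    # dedicated word segmenter).
    tokens = cleaned.split()
    return [token for token in tokens if token not in STOP_WORDS]


print(preprocess("The price of oil is rising -- again!"))
```

The tokens that survive this step are what a later stage would convert into feature vectors.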


Abstract

The invention discloses a text classification method and a text classification system. The method comprises the following main steps. First, the classification data is preprocessed. Then, because a computer cannot identify natural language directly and can only process specific numeric symbols, the natural language is converted into machine-processable symbols: the text is represented by training a word embedding matrix. In this way, natural language characters are converted into word vectors, words with similar semantics keep high similarity, and high-quality phrase features are generated. Finally, on the basis of the word vectors obtained in this way, a trained deep learning model classifies the to-be-classified texts and determines their categories.
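The idea that word vectors keep semantically similar words close together can be illustrated with a toy embedding matrix and cosine similarity. The vocabulary, the vector values, and the function names below are invented for illustration; they are not trained embeddings and are not taken from the patent.

```python
import math

# Toy vocabulary and embedding matrix. The numbers are made up for
# illustration; each row is the word vector of one vocabulary entry.
VOCAB = {"good": 0, "great": 1, "bank": 2}
EMBEDDING = [
    [0.9, 0.1, 0.0],  # good
    [0.8, 0.2, 0.1],  # great
    [0.0, 0.1, 0.9],  # bank
]


def word_vector(word):
    """Look up the embedding row for a known word."""
    return EMBEDDING[VOCAB[word]]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def text_matrix(tokens):
    """Represent a token sequence as a list of word vectors, skipping unknown words."""
    return [word_vector(token) for token in tokens if token in VOCAB]


similar = cosine(word_vector("good"), word_vector("great"))
dissimilar = cosine(word_vector("good"), word_vector("bank"))
print(similar > dissimilar)
```

In this toy setup, "good" and "great" score a higher cosine similarity than "good" and "bank", which is the property the abstract attributes to trained word vectors. The matrix returned by `text_matrix` is the kind of input a downstream deep learning model would consume.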

Description

Technical field

[0001] The invention relates to the field of natural language processing, in particular to classifying texts to be classified according to text semantic features and dependencies between text sentences.

Background technique

[0002] Text classification automatically assigns texts to categories using classification rules set according to text features. Broadly speaking, it builds a mapping between text information and classification categories. Its main steps are text information preprocessing, text representation, text feature selection, and classifier construction. The most important of these are text feature selection and the construction of the classification method.

[0003] Classification algorithms are mainly divided into three categories: unsupervised, semi-supervised, and supervised text classification. The unsupervised text classification method mainly classifies unlabeled text information...
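The four generic steps named in [0002] (preprocessing, text representation, feature selection, classifier construction) can be sketched as a small supervised pipeline. This is an illustrative sketch only: it uses bag-of-words counts, frequency-based feature selection, and a simple term-overlap scorer, none of which is the deep learning method of the invention, and all function names and training texts are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Step 1, preprocessing: lowercase and split on whitespace.
    return text.lower().split()


def represent(tokens):
    # Step 2, text representation: bag-of-words term counts.
    return Counter(tokens)


def select_features(bags, k):
    # Step 3, feature selection: keep the k most frequent terms overall.
    totals = Counter()
    for bag in bags:
        totals.update(bag)
    return {term for term, _ in totals.most_common(k)}


def train(texts, labels, k=50):
    # Step 4, classifier construction: one normalized term profile per label.
    bags = [represent(tokenize(text)) for text in texts]
    features = select_features(bags, k)
    profiles = defaultdict(Counter)
    for bag, label in zip(bags, labels):
        for term, count in bag.items():
            if term in features:
                profiles[label][term] += count
    model = {}
    for label, profile in profiles.items():
        total = sum(profile.values())
        model[label] = {term: count / total for term, count in profile.items()}
    return model


def predict(model, text):
    # Score each label by the overlap between the text and the label profile.
    bag = represent(tokenize(text))
    scores = {
        label: sum(bag[term] * weight for term, weight in profile.items())
        for label, profile in model.items()
    }
    return max(scores, key=scores.get)


# Made-up labeled examples for illustration.
texts = [
    "stocks fell sharply today",
    "the team won the match",
    "markets and stocks rallied",
    "the coach praised the team",
]
labels = ["finance", "sports", "finance", "sports"]
model = train(texts, labels)
print(predict(model, "stocks rallied today"))
```

Because the training texts carry labels, this is an instance of the supervised category mentioned in [0003].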


Application Information

IPC(8): G06F16/35, G06F16/332, G06F17/15
CPC: G06F17/15, G06F16/332, G06F16/35
Inventors: 时翔, 蔡丽君
Owner: CENT SOUTH UNIV