Text feature extraction method, system, and device based on feature encoding

A feature encoding and feature extraction technology, applied in the field of information classification, that addresses the problems of high computational complexity and low classification efficiency and accuracy, with the effects of overcoming limitations, reducing feature dimensions, and improving accuracy.

Active Publication Date: 2021-06-22
INST OF AUTOMATION CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

[0006] In order to solve the above-mentioned problems in the prior art, that is, the problems of high computational complexity, low classification efficiency and low precision in text feature extraction, the present invention provides a text feature extraction method based on feature coding, including:


Examples


Embodiment Construction

[0052] The application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it. It should also be noted that, for convenience of description, only the parts related to the invention are shown in the drawings.

[0053] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0054] The present invention provides a text feature extraction method based on feature coding, which is based on a binary text feature coding method and combined with a genetic algorithm to realize the selection of text features, which can effectively overcome the limitations faced ...
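The binary coding idea mentioned above can be illustrated with a minimal sketch. It assumes that each bit of a code marks whether the candidate word at the same position is kept; the function names `encode` and `decode` and the sample words are hypothetical and do not come from the patent text.

```python
from typing import List, Sequence, Set


def encode(candidates: Sequence[str], selected: Set[str]) -> List[int]:
    """Build a binary code: bit i is 1 if candidates[i] is kept (assumed scheme)."""
    return [1 if word in selected else 0 for word in candidates]


def decode(candidates: Sequence[str], code: Sequence[int]) -> List[str]:
    """Recover the selected word features from a binary code."""
    if len(candidates) != len(code):
        raise ValueError("code length must match the number of candidates")
    return [word for word, bit in zip(candidates, code) if bit == 1]


# Hypothetical candidate feature sequence obtained after preprocessing.
candidates = ["market", "the", "goal", "stock", "match"]
code = encode(candidates, {"market", "stock"})
print(code)                      # [1, 0, 0, 1, 0]
print(decode(candidates, code))  # ['market', 'stock']
```

Under this assumed scheme, a feature subset becomes a fixed-length bit string, which is the kind of representation a genetic algorithm can cross over and mutate directly.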


Abstract

The invention belongs to the field of information classification, and specifically relates to a text feature extraction method, system, and device based on feature coding, aiming to solve the problems of high computational complexity and low classification efficiency and precision in text feature extraction. The method of the present invention includes: preprocessing the acquired text to obtain a word candidate feature sequence; generating a plurality of binary codes based on the word candidate feature sequence; using a genetic algorithm to screen the binary codes to obtain the optimal binary code; and decoding the optimal binary code to obtain and output the optimal word feature sequence. The invention transforms a series of candidate features into easy-to-handle coding sequences and uses the automatic screening function of the genetic algorithm to approach a globally optimal selection of features, so that the minimum effective feature set can be screened out.
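The steps in the abstract (generate binary codes, screen them with a genetic algorithm, decode the best code) can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the abstract does not give the fitness function, the genetic operators, or any parameter values, so the tournament selection, single-point crossover, bit-flip mutation, elitism, the toy fitness, and every name and default below are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Code = List[int]


def decode(candidates: Sequence[str], code: Sequence[int]) -> List[str]:
    """Map a binary code back to the word features it selects."""
    return [word for word, bit in zip(candidates, code) if bit == 1]


def genetic_select(
    n_features: int,
    fitness: Callable[[Code], float],
    pop_size: int = 20,
    generations: int = 40,
    crossover_rate: float = 0.8,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Code:
    """Search for a high-fitness binary code (all operators are assumptions)."""
    rng = random.Random(seed)
    # Generate a plurality of random binary codes as the initial population.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def pick() -> Code:
        # Binary tournament: the fitter of two random individuals is a parent.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best code over unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation, applied independently to each position.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Toy data: which words count as "informative" is made up for illustration.
candidates = ["market", "the", "goal", "stock", "of", "match"]
informative = {"market", "goal", "stock", "match"}


def toy_fitness(code: Code) -> float:
    """Reward informative words and charge 0.5 per selected feature."""
    hits = sum(1 for w, b in zip(candidates, code) if b and w in informative)
    return hits - 0.5 * sum(code)


best_code = genetic_select(len(candidates), toy_fitness)
print(best_code, decode(candidates, best_code))
```

In a realistic setting the toy fitness would be replaced by some measure of classification quality on held-out text, possibly with a penalty on the number of selected features; which measure the invention uses is not stated in the text shown here.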

Description

Technical field

[0001] The invention belongs to the field of information classification, and in particular relates to a text feature extraction method, system, and device based on feature coding.

Background technique

[0002] With the rapid development and popularization of Internet technology, making full and effective use of ever-growing mass data has become a top priority for major Internet companies and related scientific research institutions. Among these data, text data is the largest category. In the use of text data, classification accounts for a large share; it refers to the process of automatically determining the category of a text according to its content under a given classification system. Text classification today has a very wide range of application scenarios. For example, for a large number of news articles contained in news websites, these articles are automatically classified according to the subject matter based on the conten...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F40/289, G06F40/12, G06N3/00, G06N3/12
CPC: G06N3/006, G06N3/126, G06F40/12, G06F40/289
Inventor: 张旭, 熊彦钧, 何赛克, 刘春阳, 郑晓龙, 陈志鹏, 曾大军, 彭鑫
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI