Data mining-oriented text processing system and method

A data mining-oriented text processing technology, applied in the fields of electronic digital data processing, special-purpose data processing applications, and natural language data processing.

Inactive Publication Date: 2016-01-13
NO 32 RES INST OF CHINA ELECTRONICS TECH GRP
Cites: 9; cited by: 17

AI Technical Summary

Problems solved by technology

Chinese patent publication CN103176953A (published 2013-06-26), titled "A text processing method and system", discloses a text processing method and system intended to improve on the efficiency and accuracy of text processing in the prior art. However, it covers only a limited set of text processing techniques, such as text segmentation, part-of-speech tagging, and entity recognition, which limits its ability to process text.

Embodiment Construction

[0056] The present invention is described in detail below with reference to specific embodiments. The following examples will help those skilled in the art to further understand the present invention, but do not limit it in any form. It should be noted that those skilled in the art can make modifications and improvements without departing from the concept of the present invention, and all of these fall within its scope of protection.

[0057] The data mining-oriented text processing system provided according to the present invention includes: a text extraction module 102, a text word segmentation module 103, an index establishment module 104, an entity recognition module 105, a keyword extraction module 106, an automatic summary module 107, an automatic classification module 108, and a service interface module 109;

[0058] - the text extraction module 102 is used to receive an external text file, and when it is judged that th...
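The module composition listed in [0057] can be pictured as a simple pipeline in which the segmented text is handed to each analysis module. The following is a minimal sketch under stated assumptions, not the patented implementation: the class name `TextPipeline`, the stage functions, and the whitespace-based segmentation are hypothetical stand-ins. Only the module roles and reference numerals come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A stage takes the segmented tokens and returns some analysis result.
Stage = Callable[[List[str]], object]


def segment(text: str) -> List[str]:
    """Stand-in for the text word segmentation module 103 (whitespace split only)."""
    return text.lower().split()


@dataclass
class TextPipeline:
    """Hypothetical wiring of the analysis modules 104-108; names are illustrative."""
    stages: Dict[str, Stage] = field(default_factory=dict)

    def register(self, name: str, stage: Stage) -> None:
        self.stages[name] = stage

    def run(self, text: str) -> Dict[str, object]:
        tokens = segment(text)
        # Every registered module works on the same segmented text.
        return {name: stage(tokens) for name, stage in self.stages.items()}


pipeline = TextPipeline()
# Toy "keyword" stage (assumption): keep distinct tokens longer than six characters.
pipeline.register("keywords", lambda toks: sorted(set(t for t in toks if len(t) > 6)))
pipeline.register("token_count", len)
result = pipeline.run("Data mining needs text processing and text segmentation")
```

In this sketch, the `result` dictionary plays the role of the output that a service interface module 109 would publish to other systems; how that publication is actually done is not specified in the visible text.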

Abstract

The present invention provides a data mining-oriented text processing system. The system comprises: a text extraction module, a text segmentation module, an index establishing module, an entity identification module, a keyword extraction module, an automatic summarization module, an automatic classification module and a service interface module. The text segmentation module performs code conversion, conversion between simplified and traditional Chinese, and a part-of-speech tagging operation on a text extracted by the text extraction module. The index establishing module, the entity identification module, the keyword extraction module, the automatic summarization module and the automatic classification module are used for obtaining an index file, an entity word, a keyword, an abstract and a classification result of the text content. The service interface module is used for publishing output results of the index establishing module, the entity identification module, the keyword extraction module, the automatic summarization module and the automatic classification module in the form of a service to other systems for calling. The present invention also provides a data mining-oriented text processing method. The method is capable of providing a more complete text processing capability.
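The preprocessing attributed to the text segmentation module can be illustrated with a small sketch. It assumes that "code conversion" means character-encoding conversion, and it uses a four-character toy mapping table in place of a full traditional-to-simplified dictionary. The function names `to_unicode` and `to_simplified` are hypothetical, and part-of-speech tagging is omitted because it would require a trained tagger.

```python
from typing import Union

# Toy traditional-to-simplified table (assumption); a real system needs a full mapping.
TRAD_TO_SIMP = str.maketrans({"數": "数", "據": "据", "處": "处", "類": "类"})


def to_unicode(raw: Union[bytes, str], encoding: str = "gbk") -> str:
    """Code conversion: decode bytes in a known legacy encoding into a Python str."""
    if isinstance(raw, str):
        return raw
    return raw.decode(encoding)


def to_simplified(text: str) -> str:
    """Replace the traditional characters covered by the toy table."""
    return text.translate(TRAD_TO_SIMP)


# Simulate an external file stored as Big5-encoded traditional Chinese.
raw = "文本數據處理".encode("big5")
normalized = to_simplified(to_unicode(raw, encoding="big5"))
```

Normalizing the encoding and script first means that later modules (indexing, entity recognition, keyword extraction) see one consistent representation of the text.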

Description

Technical field

[0001] The invention relates to the technical field of computer information processing, and in particular to a data mining-oriented text processing system and method.

Background

[0002] With the rapid development and popularization of network information services and computer technology, a large amount of structured and unstructured data has emerged, especially unstructured data represented by text. People are trying to extract effective, concise, refined, and understandable knowledge from it. Data mining generally refers to the process of automatically searching a large amount of data for hidden information with special relationships. Data mining for text data mainly comprises operations such as index building, entity recognition, keyword extraction, automatic summarization, and automatic classification, and carrying out these operations requires text processing. Therefore, a data mining-oriented text processing system needs to solve v...
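Two of the text-mining operations named in the background, index building and keyword extraction, can be illustrated as follows. This is a sketch of common textbook techniques (an inverted index and frequency-based keywords with a small stopword list), not the method claimed in the patent. The function names, stopword list, and sample documents are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, FrozenSet, List, Set

STOPWORDS: FrozenSet[str] = frozenset({"the", "of", "and"})


def build_index(docs: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Inverted index: map each token to the ids of the documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, tokens in docs.items():
        for tok in tokens:
            index[tok].add(doc_id)
    return dict(index)


def top_keywords(tokens: List[str], k: int = 1) -> List[str]:
    """Return the k most frequent tokens that are not stopwords."""
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]


docs = {
    "d1": "the mining of text and the mining of data".split(),
    "d2": "text classification and text summarization".split(),
}
index = build_index(docs)
keywords_d1 = top_keywords(docs["d1"], k=1)
```

Both functions take already segmented tokens as input, which reflects the point made in the background that these operations depend on prior text processing.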

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F40/205, G06F40/279
Inventors: 陈培华, 谢彬, 焦莹
Owner: NO 32 RES INST OF CHINA ELECTRONICS TECH GRP