A method for extracting specific text information based on layer classifier and template matching

An information extraction and template matching technology, applied in unstructured text data retrieval, text database clustering/classification, and related areas. It addresses problems such as text type misclassification and the loss of syntactic connections in text, both of which reduce the effectiveness of information extraction, and thereby improves accuracy and yields more succinct extracted information.

Inactive Publication Date: 2019-02-15
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

When the number of text types grows rapidly, the classification performance of ordinary classifiers keeps decreasing as the number of types increases, resulting in many misclassified text types.
Secondly, once the type is determined, a rule-based method is often used to extract the different information points, but purely rule-based extraction loses many of the syntactic connections within the text, which reduces the effectiveness of information extraction.

Examples

Embodiment Construction

[0026] The technical solution of the present invention will be further described below in conjunction with the accompanying drawings.

[0027] To address the text information extraction problem, the present invention builds a specific text information extraction model based on a layer classifier and template matching. The model mainly extracts the event information of a text, which includes the time, the place, and the number of casualties (or the event theme). The model construction process comprises a text classification module and an extraction module. In the classification module, the text and trigger words are used as input, and the specific type of the text is determined by the layer classifier with auxiliary discrimination from the trigger words. The hyperparameters of the layer classifier are set based on the characteristics of the text, including the penalty parameter of the error term, the kernel function, the classifier layers, the text vectors, etc. In the extracti...
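
The classification step described above can be illustrated with a minimal Python sketch. It assumes that the layer classifier narrows the text type layer by layer (a coarse category first, then a specific type) and that trigger words add weight to a candidate type. The label hierarchy, word profiles, trigger lists, and `TRIGGER_WEIGHT` are all hypothetical, and a simple word-overlap score stands in for the trained classifier; the penalty parameter and kernel function mentioned in the source are not modeled here.

```python
from collections import Counter

# Hypothetical two-layer label hierarchy: coarse category -> fine-grained types.
HIERARCHY = {
    "disaster": ["earthquake", "flood"],
    "accident": ["traffic_accident", "fire"],
}

# Hypothetical word profiles standing in for a trained classifier at each layer.
PROFILES = {
    "disaster": {"struck", "region", "residents", "evacuated"},
    "accident": {"driver", "vehicle", "building", "injured"},
    "earthquake": {"tremor", "epicenter", "seismic"},
    "flood": {"river", "water", "dam"},
    "traffic_accident": {"cars", "truck", "lane"},
    "fire": {"smoke", "flames", "firefighters"},
}

# Hypothetical trigger words for each fine-grained type.
TRIGGERS = {
    "earthquake": {"earthquake", "magnitude", "aftershock"},
    "flood": {"flood", "rainfall", "overflowed"},
    "traffic_accident": {"collided", "crash", "highway"},
    "fire": {"fire", "blaze", "burned"},
}

TRIGGER_WEIGHT = 2.0  # assumed weight of trigger evidence relative to the base score


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each token."""
    return [w.strip(".,;:!?()").lower() for w in text.split()]


def triggers_for(label):
    """Triggers of a fine type, or the pooled triggers of a coarse category's children."""
    if label in TRIGGERS:
        return TRIGGERS[label]
    return set().union(*(TRIGGERS[child] for child in HIERARCHY[label]))


def pick(tokens, labels):
    """Score each candidate label (classifier stand-in plus trigger bonus); return the best."""
    counts = Counter(tokens)

    def score(label):
        base = sum(counts[w] for w in PROFILES[label])
        bonus = TRIGGER_WEIGHT * sum(counts[w] for w in triggers_for(label))
        return base + bonus

    return max(labels, key=score)


def classify(text):
    """Layer 1 picks the coarse category; layer 2 picks a type within that category."""
    tokens = tokenize(text)
    coarse = pick(tokens, list(HIERARCHY))
    fine = pick(tokens, HIERARCHY[coarse])
    return coarse, fine
```

Because the second layer only chooses among the children of the category selected by the first, each decision involves a small number of candidate types, which is one plausible way to limit the misclassification that grows with the total number of types.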

Abstract

The invention discloses a method for extracting specific information using a layer classifier and template matching. The method builds an information extraction model from massive text data using a layer classifier model, information extraction templates, a semantic tree, and other techniques. When constructing the model, the layer classifier and a trigger-word-based method are first used to determine the type of the text; then the information extraction template shared by all texts is applied, together with the template selected according to the text type; finally, the specific information is extracted from the text data through the templates. With the invention, specific information can be extracted quickly and accurately from massive text.
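
As a rough illustration of the template step, the following Python sketch combines a template shared by all text types with a type-specific template and applies both to a text whose type is already known. The field names, the regular expressions, and the `extract` function are hypothetical, and plain regular expressions stand in for the source's templates and semantic tree, which are not reproduced here.

```python
import re

# Hypothetical template shared by all text types: time and place of the event.
SHARED_TEMPLATE = {
    "time": re.compile(r"\b[Oo]n (\w+ \d{1,2}, \d{4})"),
    "place": re.compile(r"\bin ([A-Z][a-z]+(?: [A-Z][a-z]+)*)"),
}

# Hypothetical type-specific templates, keyed by the text type from classification.
TYPE_TEMPLATES = {
    "earthquake": {
        "casualties": re.compile(r"(\d+) people (?:were )?(?:killed|injured)"),
        "magnitude": re.compile(r"magnitude (\d+(?:\.\d+)?)"),
    },
    "traffic_accident": {
        "casualties": re.compile(r"(\d+) people (?:were )?(?:killed|injured)"),
    },
}


def extract(text, text_type):
    """Apply the shared template plus the template selected for text_type.

    Returns a dict mapping each field to the first match, or None if absent.
    """
    patterns = {**SHARED_TEMPLATE, **TYPE_TEMPLATES.get(text_type, {})}
    result = {}
    for field, pattern in patterns.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


report = ("On March 3, 2018, a magnitude 6.1 earthquake struck in "
          "Southern Province and 12 people were injured.")
fields = extract(report, "earthquake")
```

Keeping the shared fields separate from the type-specific ones means that adding a new text type only requires a new entry in `TYPE_TEMPLATES`; an unknown type falls back to the shared template alone.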

Description

Technical field

[0001] The invention relates to a text information extraction method, in particular to a specific text information extraction method based on layer classifiers and template matching.

Background technique

[0002] To obtain specific information from massive text data, text information extraction technology can be applied to the original text data, and the specific information points extracted from the text can then be integrated in a unified form. A successful information extraction system can convert massive unstructured text data into structured information data, and can also store the converted data in a database. This is of great significance for text analysis, public opinion monitoring, Internet knowledge acquisition, and other fields.

[0003] A specific information extraction system needs to adopt different templates for different text types, so the text must first be classified. However, when the text types increase rapidly, ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35
Inventor: 吴含前, 袁烽
Owner: SOUTHEAST UNIV