Interactive relation labeling and extracting framework capable of being quickly started

A fast-start relation extraction technology, applied in character and pattern recognition, biological neural network models, structured data retrieval, and related fields, that addresses the heavy labor cost and high cold-start cost of existing relation extraction systems.

Pending Publication Date: 2022-03-01
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

[0006] The present invention addresses problems in the prior art by providing a fast-start interactive relation labeling and extraction framework. The technical solution proposes an active learning technique that uses manual proofreading information to reduce the amount of labeled data required and to improve model performance, and it uses few-shot relation extraction to improve the cold-start performance of the model. A system built on the disclosed framework can overcome the high cold-start cost and heavy labor cost of existing relation extraction systems, yielding a relation labeling and extraction system with fast start-up and low labor cost.



Examples


Embodiment 1

[0048] Embodiment 1: referring to Figures 1 to 3, a fast-start interactive relation labeling and extraction framework includes the following steps:

[0049] S1: Pre-train the named entity recognition model using the general named entity recognition dataset;

[0050] S2: Pre-train the few-shot relation extraction model using the general relation extraction dataset;

[0051] S3: Specify the relations to be extracted and provide a small amount of labeled data;

[0052] S4: Perform data preprocessing on the text to be extracted;

[0053] S5: Use the named entity recognition model to perform named entity recognition on the text to be extracted;

[0054] S6: Manually pair the recognized entities;

[0055] S7: Perform preliminary relation extraction on the paired entities;

[0056] S8: Manually proofread the relation extraction results;

[0057] S9: Fine-tune the few-shot relation extraction model;

[0058] S10: Repeat S4 to S9 until all the texts to be extracted are p...
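The loop formed by S4 to S10 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class `InteractiveLoop` and every callable it takes are hypothetical stand-ins for the components named in the steps, and passing only the current round's proofread results to the fine-tuning step is an assumption the source does not confirm.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple

Entity = str
Pair = Tuple[Entity, Entity]
Triple = Tuple[Entity, str, Entity]          # (head, relation, tail)
Labeled = Tuple[str, Triple]                 # (text, proofread triple)


@dataclass
class InteractiveLoop:
    # Every callable below is a hypothetical placeholder, not an API from the patent.
    preprocess: Callable[[str], str]                          # S4
    recognize: Callable[[str], List[Entity]]                  # S5: pre-trained NER model
    pair_entities: Callable[[str, List[Entity]], List[Pair]]  # S6: done by a human
    predict: Callable[[str, Pair], str]                       # S7: few-shot RE model
    proofread: Callable[[str, Triple], Triple]                # S8: done by a human
    fine_tune: Callable[[List[Labeled]], None]                # S9
    labeled: List[Labeled] = field(default_factory=list)

    def run(self, texts: Iterable[str]) -> List[Triple]:
        results: List[Triple] = []
        for raw in texts:                                     # S10: repeat per text
            text = self.preprocess(raw)
            entities = self.recognize(text)
            batch: List[Labeled] = []
            for head, tail in self.pair_entities(text, entities):
                guess = (head, self.predict(text, (head, tail)), tail)
                batch.append((text, self.proofread(text, guess)))
            if batch:
                # Assumption: only this round's corrections drive fine-tuning.
                self.labeled.extend(batch)
                self.fine_tune(batch)
            results.extend(triple for _, triple in batch)
        return results
```

In this sketch the human steps (S6 and S8) are ordinary callables, so they can be backed by a user interface in practice or by simple functions when testing the control flow.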

Specific Embodiment

[0090] Specific example: referring to Figures 1 to 3, in this embodiment the general named entity recognition datasets are MUC-6 and MUC-7, the general relation extraction dataset is FewRel, and the text segments in the repository of texts to be processed come from Wikipedia. The named entity recognition model is a sequence labeling model based on conditional random fields, and the relation extraction model adopts the Prototypical Network structure. Within it, a PCNN model encodes the text sentences and entities; its structure is shown in Figure 3. GloVe word embeddings are used as the pre-trained word vectors to encode the words in each sentence.
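The classification step of a Prototypical Network can be illustrated with a small sketch. It assumes that sentence embeddings have already been produced by some encoder (the PCNN in this embodiment, which is not reproduced here), and it uses squared Euclidean distance, the usual choice for this architecture; the function names and the relation labels in the usage note are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def build_prototypes(support: Dict[str, List[Vector]]) -> Dict[str, List[float]]:
    """Average the support embeddings of each relation into one prototype."""
    prototypes: Dict[str, List[float]] = {}
    for relation, vectors in support.items():
        dim = len(vectors[0])
        prototypes[relation] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query: Vector, prototypes: Dict[str, List[float]]) -> str:
    """Return the relation whose prototype lies closest to the query embedding."""
    return min(prototypes, key=lambda r: squared_distance(query, prototypes[r]))
```

For example, with a support set such as `{"born_in": [[1.0, 0.0], [0.8, 0.2]], "works_for": [[0.0, 1.0], [0.2, 0.8]]}`, a query embedding near `[0.9, 0.1]` is assigned to `born_in`. Because a prototype is just a mean, a new relation can be added from a handful of labeled sentences, which is what makes this structure suitable for a cold start.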

[0091] This embodiment applies the fast-start interactive relation labeling and extraction framework provided by the present invention. The overall framework is shown in Figure 2 and specifically includes the following steps:

[0092] Step 1) use the general nam...

Abstract

The invention relates to a fast-start interactive relation labeling and extraction framework comprising the following steps: S1, pre-training a named entity recognition model using a general named entity recognition dataset; S2, pre-training a few-shot relation extraction model using a general relation extraction dataset; S3, specifying the relations to be extracted and providing a small amount of labeled data; S4, performing data preprocessing on the text to be extracted; S5, performing named entity recognition on the text to be extracted using the named entity recognition model; S6, manually pairing the entities; S7, performing preliminary relation extraction on the pairing result; S8, manually proofreading the relation extraction result; S9, fine-tuning the few-shot relation extraction model; and S10, repeating steps S4 to S9 until all texts to be extracted have been processed. The scheme overcomes the high start-up cost and heavy labor cost of the prior art and achieves relation labeling and extraction with fast start-up and low labor cost.

Description

Technical field

[0001] The invention relates to a fast-start interactive relation labeling and extraction framework based on human-computer interaction, and belongs to the technical field of computer artificial intelligence and natural language processing.

Background art

[0002] Relation extraction is an important subtask in the field of information extraction. It plays a key role in many application scenarios such as the construction of knowledge graphs, dialogue systems, and knowledge question answering systems. It also has extensive application value in medical, military, and financial fields. The main goal of relation extraction is to extract the triple structure <subject, predicate, object>, or <head, relation, tail>, from the text. A common form of relation extraction is to input a piece of text and the two entities involved, determine whether the text content describes the relationship between the two entities, and infer what kind of...
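The input and output of the task described above can be represented with a small data structure. This is only an illustrative sketch: the class name `RelationInstance`, its fields, and the relation label `born_in` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RelationInstance:
    """One relation extraction example: a text plus the two entities involved."""
    text: str
    head: str
    tail: str
    relation: Optional[str] = None  # None means no relation is expressed

    def as_triple(self) -> Optional[Tuple[str, str, str]]:
        """Return <head, relation, tail>, or None when no relation is set."""
        if self.relation is None:
            return None
        return (self.head, self.relation, self.tail)


example = RelationInstance(
    text="Turing was born in London.",
    head="Turing",
    tail="London",
    relation="born_in",
)
```

Calling `example.as_triple()` yields the `<head, relation, tail>` form, while an instance created without a relation models the case where the text does not describe any relationship between the two entities.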


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06F16/36, G06F16/28, G06K9/62, G06N3/08
CPC: G06F40/295, G06F16/283, G06F16/367, G06N3/08, G06F18/214
Inventor: 李学恺, 漆桂林
Owner: SOUTHEAST UNIV