Information extraction method and system based on joint training model

A technology for information extraction and model training, applied in the fields of natural language processing and deep learning. It addresses problems such as error propagation, heavy consumption of manpower and time, and low model flexibility, and improves extraction accuracy.

Active Publication Date: 2020-04-07
SICHUAN CHANGHONG ELECTRIC CO LTD

AI Technical Summary

Problems solved by technology

[0011] The purpose of the present invention is to provide an information extraction method and system based on a joint training model that addresses the problems of existing information extraction technology: heavy consumption of manpower and time, low model flexibility, error propagation, and incomplete information extraction.



Embodiment Construction

[0034] The technical solution of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0035] The flow chart of the joint-training-based information extraction method of the present invention is shown in Figure 1. The method includes:

[0036] Step 1. Label the corpus to obtain a training corpus containing label information.

[0037] The corpus is labeled without manual annotation: remote labeling is performed in an unsupervised manner to obtain the labeled training corpus.
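The source does not say how the remote labeling is carried out. As a minimal sketch, assuming it works by matching the text against an existing set of known triples, the labeling step could look like the following. The `Triple` type, the `remote_label` function, and the toy data are hypothetical and are not taken from the patent.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A known fact: head entity, relation, tail entity (hypothetical type)."""
    head: str
    relation: str
    tail: str


def remote_label(text: str, knowledge_base: list[Triple]) -> list[dict]:
    """Annotate `text` with every known triple whose head and tail both occur in it.

    Assumption: simple first-occurrence string matching stands in for
    whatever matching the patent actually uses.
    """
    annotations = []
    for triple in knowledge_base:
        head_start = text.find(triple.head)
        tail_start = text.find(triple.tail)
        if head_start == -1 or tail_start == -1:
            continue  # one of the entities is missing, so no label is produced
        annotations.append({
            "head": triple.head,
            "head_span": (head_start, head_start + len(triple.head)),
            "relation": triple.relation,
            "tail": triple.tail,
            "tail_span": (tail_start, tail_start + len(triple.tail)),
        })
    return annotations


# Toy data, for illustration only.
kb = [
    Triple("Alice", "works_for", "Acme"),
    Triple("Bob", "works_for", "Globex"),
]
labeled = remote_label("Alice joined Acme in 2019.", kb)
```

Labels produced this way need no human annotator, but they can be noisy: a sentence may mention both entities without expressing the relation.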

[0038] Step 2. Sample the training corpus.

[0039] The training corpus is sampled by randomly sampling the entities and entity relationships in each text. Specifically, a head entity is sampled at random, and then all tail entities and relationship information associated with that head entity are matched to it.
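The head-first sampling described in [0039] can be sketched as follows. This is an illustration under assumptions: the annotations of one text are taken to be a list of `(head, relation, tail)` tuples, and the function name `sample_by_head` is hypothetical.

```python
import random
from collections import defaultdict


def sample_by_head(triples, rng):
    """Pick one head entity at random, then return every (relation, tail)
    pair associated with that head in the same text."""
    by_head = defaultdict(list)
    for head, relation, tail in triples:
        by_head[head].append((relation, tail))
    # Sort the candidate heads so a seeded generator gives a repeatable choice.
    head = rng.choice(sorted(by_head))
    return head, by_head[head]


# Toy annotations for a single text, for illustration only.
triples = [
    ("Alice", "works_for", "Acme"),
    ("Alice", "lives_in", "Paris"),
    ("Bob", "works_for", "Globex"),
]
head, pairs = sample_by_head(triples, random.Random(0))
```

Sampling the head first and then collecting all of its tails keeps every relation of the chosen entity together in one training example.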

[0040] The sampling method of the training corpus also in...


Abstract

The invention provides an information extraction method and system based on a joint training model, and belongs to the technical fields of natural language processing and deep learning. It addresses the problems of existing information extraction technology: heavy consumption of manpower and time, low model flexibility, error propagation, and incomplete information extraction. The information extraction method comprises the following steps: labeling the corpus to obtain a training corpus containing label information; sampling the training corpus; converting each character in the sampled corpus into a word vector; inputting the word vectors into two deep learning models based on different neural networks for joint training, and iteratively updating the neural network parameters of the joint model to obtain a trained information extraction joint model; and inputting a text to be extracted into the information extraction joint model to extract triple information containing a head entity, a tail entity, and an entity relationship.
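The character-to-vector step named in the abstract can be illustrated with a simple lookup table. This is a minimal sketch, not the patent's method: the source does not state how the vectors are obtained, so the table here is randomly initialized, and the names `build_char_vocab`, `init_embeddings`, and `embed` are hypothetical.

```python
import random


def build_char_vocab(texts):
    """Map each distinct character to an integer id; id 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def init_embeddings(vocab_size, dim, seed=0):
    """Create one small random vector per vocabulary entry (assumed initialization)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]


def embed(text, vocab, table):
    """Replace each character with its vector; unseen characters use the <unk> row."""
    return [table[vocab.get(ch, 0)] for ch in text]


vocab = build_char_vocab(["abca"])
table = init_embeddings(len(vocab), dim=4)
vectors = embed("abz", vocab, table)  # "z" is unseen and falls back to <unk>
```

In a trained system these vectors would typically be updated together with the network parameters rather than left at their initial values.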

Description

Technical field

[0001] The invention relates to the technical fields of natural language processing and deep learning, and in particular to an information extraction method and system based on a joint training model.

Background art

[0002] With the rapid development of information technology and the continuous upgrading of hardware, there is growing demand for using massive data and deep learning models to extract information from text, and such extraction is applied in a variety of scenarios. Information extraction means extracting structured information from unstructured text. Information extraction tasks are usually divided into two sub-tasks: entity extraction and relationship extraction. Commonly used methods include rule-based methods, machine learning-based methods, and deep learning-based methods.

[0003] Early information extraction was mainly based on rules and statistics. This approach can be divided into two stages: one is to...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/31, G06F40/289, G06N3/04, G06N3/08
CPC: G06F16/313, G06N3/08, G06N3/048
Inventors: 饶璐, 孙锐
Owner: SICHUAN CHANGHONG ELECTRIC CO LTD