Weak supervision relation extraction method based on multi-source semantic representation fusion

This technology concerns semantic representation and relation extraction in the field of information extraction. It addresses the problems of semantic feature representations that rely on a single source, uneven sample distribution, and insufficient supervision.

Active Publication Date: 2020-10-02
DALIAN UNIV OF TECH
Cites: 3, Cited by: 2

AI Technical Summary

Problems solved by technology

However, to save labeling costs, weak supervision signals often provide insufficient supervision, so relation extraction methods face problems such as wrong labels and uneven sample distribution during training.
Most existing relation extraction algorithms based on weakly supervised learning focus on the embedded information of the original corpus content to alleviate the impact of insufficient supervision resources. They do not integrate and fully exploit semantic information at different levels, and their semantic feature representation is relatively simple, which can easily bias extraction results heavily toward relation types with more training samples.

Examples

Embodiment Construction

[0012] The implementation of the present invention will be described in detail below in conjunction with the drawings and examples.

[0013] Table 1 is the set of first-order logical constraints for general relation extraction defined in this embodiment. As shown in Table 1, this embodiment takes the logical constraint declaration grammar provided by Stanford University as a basis and defines symbolic representations of the associations between text features and sample instances. A number of lexical and syntactic features are selected as supervision sources for relation extraction. For example, a grammatical dependency tree between the two entities, composed of word chunks, directions, and dependency relations, can be selected as a syntactic feature. The named entity recognition tags of the two entities, together with the word sequences and part-of-speech tags between or on either side of the entities, are used as lexical features.
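The lexical features described in [0013] can be illustrated with a minimal sketch. It assumes the sentence has already been tokenized and tagged by some natural language processing tool, and it turns the named entity tags, the words and part-of-speech tags between the entities, and the neighbouring words into symbolic feature strings. The function name `lexical_features` and the predicate names (`NER_PAIR`, `WORDS_BETWEEN`, and so on) are hypothetical; the actual constraint set of Table 1 is not reproduced here.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def lexical_features(tokens: List[str], pos_tags: List[str],
                     ner_tags: List[str], e1: Span, e2: Span) -> List[str]:
    """Build symbolic lexical features for an entity pair (e1 occurs before e2).

    Predicate names are illustrative assumptions, not the patent's Table 1.
    """
    (s1, t1), (s2, t2) = e1, e2
    between = range(t1, s2)  # tokens strictly between the two entities
    feats = [
        # NER tags of the two entities (tag of each entity's first token)
        f"NER_PAIR({ner_tags[s1]},{ner_tags[s2]})",
        # word sequence between the entities
        "WORDS_BETWEEN(" + "_".join(tokens[i].lower() for i in between) + ")",
        # part-of-speech sequence between the entities
        "POS_BETWEEN(" + "_".join(pos_tags[i] for i in between) + ")",
    ]
    # one word of context on each side of the pair, when available
    if s1 > 0:
        feats.append(f"LEFT_WORD({tokens[s1 - 1].lower()})")
    if t2 < len(tokens):
        feats.append(f"RIGHT_WORD({tokens[t2].lower()})")
    return feats


tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw", "."]
pos = ["NNP", "NNP", "VBD", "VBN", "IN", "NNP", "."]
ner = ["PERSON", "PERSON", "O", "O", "O", "LOCATION", "O"]
example = lexical_features(tokens, pos, ner, (0, 2), (5, 6))
# example == ['NER_PAIR(PERSON,LOCATION)', 'WORDS_BETWEEN(was_born_in)',
#             'POS_BETWEEN(VBD_VBN_IN)', 'RIGHT_WORD(.)']
```

Each string could then serve as a discrete feature attached to the sentence instance; syntactic features from a dependency parse would be added in the same way but need a parser and are left out of this sketch.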

[0014] Figure 1 gives the architecture design of the relation extractio...

Abstract

The invention provides a weakly supervised relation extraction method based on multi-source semantic representation fusion. First, distributed word vectors are used to initialize the contextual semantic features of text sentences, a natural language processing tool is used to extract a large number of discrete symbolic features describing the text, and general first-order logic rules between sentence instances and features are designed for the relation extraction task. Then, the logic rules are combined with a factor graph to establish relationships between text features and sentence instances; joint statistical inference models the problem from the perspective of human perception and learns a low-dimensional relational semantic vector describing the text features. The semantic information of the sentence content, encoded by a bidirectional gated recurrent unit, serves as the contextual content semantic vector. Finally, the text feature semantic vectors are fine-tuned in the neural network, and the vector representations from the two feature sources are fused to obtain a more robust text semantic feature representation, which, together with the entity-pair embedding representations, guides weakly supervised relation extraction.
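The abstract does not state which operator fuses the two vector representations. As an illustration only, the sketch below shows two common choices: plain concatenation and an element-wise gated sum. The function names, the gate parameterization, and the requirement that both vectors have the same dimension for the gated variant are all assumptions, not details taken from the patent.

```python
import math
from typing import List

Vector = List[float]


def concat_fusion(content_vec: Vector, feature_vec: Vector) -> Vector:
    """Fuse by concatenation: the result keeps both sources side by side."""
    return list(content_vec) + list(feature_vec)


def gated_fusion(content_vec: Vector, feature_vec: Vector,
                 w_content: Vector, w_feature: Vector, bias: Vector) -> Vector:
    """Fuse by an element-wise gate (a hypothetical parameterization).

    g_i     = sigmoid(w_content[i] * c_i + w_feature[i] * f_i + bias[i])
    fused_i = g_i * c_i + (1 - g_i) * f_i
    """
    if not (len(content_vec) == len(feature_vec) == len(w_content)
            == len(w_feature) == len(bias)):
        raise ValueError("all vectors must share one dimension")
    fused = []
    for c, f, wc, wf, b in zip(content_vec, feature_vec,
                               w_content, w_feature, bias):
        g = 1.0 / (1.0 + math.exp(-(wc * c + wf * f + b)))  # gate in (0, 1)
        fused.append(g * c + (1.0 - g) * f)
    return fused


content = [1.0, 0.0]   # stand-in for the sentence content vector
feature = [0.0, 2.0]   # stand-in for the text feature vector
zeros = [0.0, 0.0]
averaged = gated_fusion(content, feature, zeros, zeros, zeros)
# With zero weights and bias every gate is 0.5, so averaged == [0.5, 1.0]
```

In a trained model the gate weights would be learned jointly with the rest of the network, so that each dimension can lean toward whichever source is more reliable; concatenation instead leaves that weighting to the layers that follow.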

Description

Technical field

[0001] The present invention belongs to the technical field of information extraction and is suitable for general-domain relation extraction. In particular, it relates to extracting "entity-relationship-entity" triples from sentences when the training samples produced by weak supervision are inaccurate and unbalanced; specifically, it is a weakly supervised relation extraction method based on multi-source semantic representation fusion.

Background art

[0002] In real life, unstructured text information is like dark matter: buried in massive network data and difficult to process because it lacks structure. In-depth research on entity relation extraction aims to use the computer's ability to process text efficiently to extract relational facts in a unified format from massive, unstructured network texts. By mining the semantic information of the target entity in the text sentence, it pr...

Application Information

IPC(8): G06F16/36, G06F16/35, G06F40/211, G06F40/253, G06F40/30, G06N5/00, G06N3/04
CPC: G06F16/367, G06F16/353, G06F40/211, G06F40/253, G06F40/30, G06N5/01, G06N3/045
Inventor 刘宇, 倪骏, 单世民, 赵哲焕, 徐秀娟, 刘日升, 王恺
Owner DALIAN UNIV OF TECH