Content big data-oriented small sample relation extraction method and device

A relation extraction technology for small samples, applied in digital data processing, natural language data processing, character and pattern recognition, etc. It addresses three problems: the difficulty of efficiently extracting multi-dimensional features of text, the neglect of differences between different types of entity relations, and the time-consuming, labor-intensive cost of labeling.

Pending Publication Date: 2021-09-10
北京华成智云软件股份有限公司
Cites: 0 | Cited by: 3

AI Technical Summary

Problems solved by technology

Entity relation extraction methods usually depend on entity labeling, which is often time-consuming, labor-intensive, and costly, so studying entity relation extraction in small-sample scenarios is of great practical significance.
In recent years, small-sample relation extraction based on neural network models has become the mainstream approach. However, existing small-sample relation extraction methods consider few dimensions when extracting sentence features and have difficulty extracting multi-dimensional features of text efficiently.
In addition, when using sentence features for relation extraction, existing small-sample methods often focus only on the similarity between sentence features and ignore the differences between different types of entity relations.

Embodiment Construction

[0023] The technical solutions provided by the present invention will be described in detail below in conjunction with specific examples. It should be understood that the following specific embodiments are only used to illustrate the present invention and are not intended to limit the scope of the present invention.

[0024] The implementation model of the small-sample relation extraction method for content big data disclosed in this embodiment of the present invention is shown in Figure 1. The specific implementation steps are as follows:

[0025] Step 1: vectorize the sentence with fused character information. This embodiment uses the base version of the pre-trained language model BERT to vectorize the sentence, with the calculation shown in formula (1), where [CLS] represents the classification feature vector that can represent the context of the sentence, [SEP] is the sentence separator token, and the sentence word vector is sentence = [w_1, w_2, ..., ...
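The input layout described in Step 1 can be illustrated with a toy sketch. It does not run BERT or an LSTM; it only shows how a tokenized sentence is wrapped in the [CLS] and [SEP] markers, and how a word vector might be combined with a character-level vector. The function names `build_bert_input` and `fuse_char_features` are hypothetical, and concatenation as the fusion step is an assumption, since formula (1) is truncated in this listing.

```python
# Toy sketch only: no real BERT model or LSTM is involved.

def build_bert_input(words):
    """Return the BERT-style token sequence [CLS] w_1 ... w_n [SEP]."""
    return ["[CLS]"] + list(words) + ["[SEP]"]


def fuse_char_features(word_vec, char_vec):
    """Combine a word vector with a character-level vector.

    Concatenation is an assumption made for illustration; the listing
    does not show the fusion actually used by the method.
    """
    return list(word_vec) + list(char_vec)


# Hypothetical example sentence and made-up vector values.
tokens = build_bert_input(["Paris", "is", "in", "France"])
fused = fuse_char_features([0.1, 0.2], [0.3])
```

In a real pipeline the word vectors would come from the pre-trained model and the character vectors from a character-level LSTM, as the abstract states.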


Abstract

The invention discloses a content big data-oriented small-sample relation extraction method and device. The method comprises the following steps: first, vectorizing sentences using a pre-trained language model and a character-level LSTM; second, extracting text structure features through a biaffine mechanism and a graph neural network; at the same time, using a word-level LSTM and the [CLS] vector from BERT, which represents semantic information, to integrate context information into the word vectors of the entity pair, thereby constructing entity-pair features that contain context information; and finally, extracting the similarity and difference of sentence features through a similarity-difference relation network to perform small-sample relation extraction. The method and device apply small-sample learning to content big data scenarios, extract text features fully, characterize the differences between different types of entity relations, and improve the accuracy of relation extraction.
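The biaffine (double-affine) mechanism mentioned in the abstract is commonly written as a score between two vectors, s = h_i^T U h_j + w^T [h_i; h_j] + b. The sketch below implements that general form in plain Python. The weights `U`, `w`, `b` and the input vectors are made-up values for illustration, and the listing does not show how the method parameterizes or trains this step, so this is not the patented formulation.

```python
def biaffine_score(h_i, h_j, U, w, b):
    """Generic biaffine score: h_i^T U h_j + w^T [h_i; h_j] + b."""
    # Bilinear term: sum over all pairs of components, weighted by U.
    bilinear = sum(
        h_i[a] * U[a][c] * h_j[c]
        for a in range(len(h_i))
        for c in range(len(h_j))
    )
    # Linear term over the concatenation [h_i; h_j].
    concat = list(h_i) + list(h_j)
    linear = sum(wk * xk for wk, xk in zip(w, concat))
    return bilinear + linear + b


# Hypothetical 2-dimensional word representations and weights.
h_i = [1.0, 2.0]
h_j = [3.0, 4.0]
U = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the bilinear term is a dot product
w = [0.5, 0.5, 0.5, 0.5]
b = 1.0

score = biaffine_score(h_i, h_j, U, w, b)
```

Computing such a score for every pair of words yields a matrix that can serve as edge weights for a graph neural network, which is one plausible reading of how the two components named in the abstract connect.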

Description

Technical field

[0001] The invention relates to a small-sample relation extraction method and device for content big data, belonging to the technical field of the Internet and big data.

Background

[0002] With the continuous development of the Internet industry, text data on the Internet continues to grow at an exponential rate. This text data is heterogeneous, fragmented, and multi-source, and it also contains rich knowledge and information. Entity relation extraction methods can extract structured information representations from unstructured text data. Structured information representations play a fundamental role in building applications such as knowledge graphs, search engines, and intelligent question answering systems. In addition, only by rationally and efficiently organizing and managing the extracted structured information can we fully mine and utilize the inter...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06F40/284, G06F40/216, G06K9/62, G06N3/04
CPC: G06F40/295, G06F40/284, G06F40/216, G06N3/044, G06N3/045, G06F18/22
Inventor: 杨鹏娄健程昌虎张磊宏
Owner: 北京华成智云软件股份有限公司