
Syntax dependency relationship-based named entity identification method

A named entity recognition technique based on syntactic dependency relationships, applied in the field of deep learning. It addresses low recognition accuracy caused by incorrect entity boundary judgments, given that boundary recognition is harder than type recognition.

Pending Publication Date: 2020-10-16
BEIJING UNIV OF TECH
Cites: 0, Cited by: 8

AI Technical Summary

Problems solved by technology

In most test samples, false positives (FP) and false negatives (FN) are often caused by incorrect judgments of entity boundaries, which means that boundary recognition is much harder than type recognition.
However, most deep network models have no component dedicated to boundary recognition, so they tend to be accurate when judging entity types but inaccurate when judging entity boundaries.
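
The distinction between boundary errors and type errors can be made concrete with a small scoring routine. The following is a minimal sketch, not the patent's evaluation code: it assumes entities are given as hypothetical `(start, end, type)` triples with an exclusive end index, counts a prediction as a true positive only when both span and type match a labeled entity, and attributes each false positive to either a wrong boundary or a wrong type. The function name `score_entities` and the sample data are illustrative only.

```python
from collections import Counter


def score_entities(gold, pred):
    """Strict entity-level scoring over (start, end, type) triples."""
    gold_set, pred_set = set(gold), set(pred)
    tp = gold_set & pred_set   # span and type both match
    fp = pred_set - gold_set   # predicted, but not an exact match
    fn = gold_set - pred_set   # labeled, but not recovered exactly

    # Attribute each false positive to its cause: if the span exists in the
    # gold annotation, only the type is wrong; otherwise the boundary is wrong.
    gold_spans = {(s, e) for s, e, _ in gold_set}
    fp_causes = Counter()
    for s, e, _ in fp:
        fp_causes["type" if (s, e) in gold_spans else "boundary"] += 1

    precision = len(tp) / len(pred_set) if pred_set else 0.0
    recall = len(tp) / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "tp": len(tp), "fp": len(fp), "fn": len(fn),
        "fp_causes": dict(fp_causes),
        "precision": precision, "recall": recall, "f1": f1,
    }


# Made-up example: one exact match, one boundary error, one type error.
GOLD = [(0, 2, "PER"), (5, 8, "ORG"), (10, 11, "LOC")]
PRED = [(0, 2, "PER"), (5, 7, "ORG"), (10, 11, "ORG")]
result = score_entities(GOLD, PRED)
print(result)
```

Note that under strict matching a single boundary slip costs both a false positive (the wrong span) and a false negative (the missed gold span), which is why boundary errors weigh so heavily on the final score.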



Examples


Embodiment 1

[0043] As shown in Figure 1, Embodiment 1 provides a named entity recognition method based on syntactic dependencies, including the following steps:

[0044] Step S1: in the model training phase, first use pre-trained Word2vec to map each one-hot word vector into a predefined low-dimensional space, obtaining the word vector of each word;

[0045] Step S2: use a bidirectional long short-term memory network (Bi-LSTM) to encode the word vector at each time step of the sentence in the forward and backward directions, and concatenate the two encodings to obtain global features that carry context information;

[0046] Step S3: use syntactic parsing to obtain the syntactic dependency tree of each sentence, and compute the shortest dependency path between two words on the tree;

[0047] Step S4: from the shortest dependency path, obtain the top-down and bottom-up feature sequences of each word, feed them into an LSTM network, and compute the local features of the word;

[0048] Ste...
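
Steps S3 and S4 above can be illustrated with a short sketch. It assumes the dependency tree is stored as a hypothetical head array (`heads[i]` is the index of the head of word `i`, with `-1` for the root), and it finds the shortest path between two words by climbing from each word toward the root until the paths meet at their lowest common ancestor. Splitting that path into a bottom-up segment and a top-down segment is one plausible reading of the feature sequences named in S4; the source does not spell out the exact construction, and the head array below is written by hand rather than produced by a parser.

```python
def path_to_root(heads, i):
    """Return the node indices from word i up to the root, inclusive."""
    path = [i]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def shortest_dependency_path(heads, a, b):
    """Shortest path between words a and b in a dependency tree.

    Returns (bottom_up, top_down): bottom_up runs from a up to the lowest
    common ancestor, and top_down runs from that ancestor down to b.
    """
    up_a = path_to_root(heads, a)
    up_b = path_to_root(heads, b)
    position_in_b = {node: k for k, node in enumerate(up_b)}
    for k, node in enumerate(up_a):
        if node in position_in_b:  # first shared ancestor is the lowest one
            bottom_up = up_a[:k + 1]
            top_down = [node] + up_b[:position_in_b[node]][::-1]
            return bottom_up, top_down
    raise ValueError("words are not in the same tree")


# Hand-written (assumed) parse of "John lives in New York":
#   lives is the root; John -> lives; in -> lives; York -> in; New -> York
WORDS = ["John", "lives", "in", "New", "York"]
HEADS = [1, -1, 1, 4, 2]

bottom_up, top_down = shortest_dependency_path(HEADS, 0, 3)
print([WORDS[i] for i in bottom_up])  # climb from "John" to the common ancestor
print([WORDS[i] for i in top_down])   # descend from the common ancestor to "New"
```

In a full model, each of the two index sequences would be mapped to feature vectors and fed to an LSTM, as S4 describes; that part is omitted here.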


Abstract

The invention relates to a named entity recognition method based on syntactic dependencies. In named entity recognition, a predicted entity counts as a true positive (TP) only when both its boundary and its type match those of a labeled entity. In most test samples, false positives (FP) and false negatives (FN) are often caused by incorrect judgments of entity boundaries; that is, boundary recognition is much harder than type recognition. The method uses a self-attention mechanism to weaken the relations between an entity and the words outside it and to strengthen the relations among the words inside it. Specifically, a self-attention mechanism is added after a bidirectional long short-term memory (Bi-LSTM) network, the dependency relations among words in the syntactic dependency tree are encoded into the context information, and entity boundaries are finally judged jointly from the global features provided by the Bi-LSTM network and the local features provided by the syntactic dependency tree. The method improves the accuracy of named entity recognition.
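
The abstract does not give the attention formula, so the following is only a minimal sketch of standard scaled dot-product self-attention, written with the Python standard library. It makes two simplifying assumptions that are not from the source: queries, keys, and values are all the raw hidden states (no learned projections), and the input matrix `H` is a made-up stand-in for Bi-LSTM outputs. It shows how tokens with similar hidden states receive larger mutual attention weights, which is the effect the method relies on to tie together words inside an entity.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(H):
    """Scaled dot-product self-attention with Q = K = V = H.

    H is a list of token vectors. Returns (outputs, weights), where
    weights[i][j] is how strongly token i attends to token j.
    """
    d = len(H[0])
    weights = []
    for q in H:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in H]
        weights.append(softmax(scores))
    outputs = [[sum(w * v[j] for w, v in zip(row, H)) for j in range(d)]
               for row in weights]
    return outputs, weights


# Made-up hidden states: tokens 0 and 1 are similar, token 2 is different.
H = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(H)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1, and token 0 places more weight on the similar token 1 than on the dissimilar token 2.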

Description

Technical field

[0001] The invention relates to the field of deep learning, and in particular to named entity recognition in text.

Background art

[0002] Traditional named entity recognition methods rely on a large number of manually defined features. Defining features by hand is not only time-consuming and labor-intensive, but also requires professionals with domain and linguistic knowledge. In recent years, owing to its powerful data mining capabilities, deep learning has minimized the cost of manually constructing features and has made remarkable achievements in image classification, speech recognition, and natural language processing. Using deep learning methods for named entity recognition therefore has great research significance.

[0003] In text, accurate identification of named entity types and their entity boundaries has a great impact on the development of complex natural language systems, such as information e...


Application Information

IPC(8): G06F40/295, G06K9/62, G06N3/04, G06N3/08
CPC: G06F40/295, G06N3/049, G06N3/08, G06N3/045, G06F18/253
Inventor: 李建强, 刘雅琦, 白骏
Owner: BEIJING UNIV OF TECH