A Named Entity Recognition Method Based on Pre-trained Language Model

A named entity recognition method based on a pre-trained language model, applied in character and pattern recognition, neural learning methods, natural language data processing, and related fields. It addresses problems such as poor recognition performance, high labeling cost, and poor model generalization ability. It is fast and improves accuracy, recall, and F1 value.

Active Publication Date: 2022-05-27
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

Named entity recognition algorithms based on deep learning can extract deep semantic and grammatical features from text and use the invariance of these features to improve the recognition rate, but they typically need a large amount of labeled data. Labeling is costly, and obtaining a large amount of labeled data manually is unrealistic. When samples are scarce, the features a deep learning model learns often apply only to the training data, so the model generalizes poorly and performs badly on other data.

Examples

Embodiment Construction

[0079] Figure 2 is the overall flow chart of the present invention. As shown in Figure 2, the present invention comprises the following steps:

[0080] Step 1: Build a named entity recognition system. As shown in Figure 1, the named entity recognition system consists of a multi-model recognition module, a multi-level fusion module, a discrimination module, an entity label aligner, and an unlabeled database.

[0081] The unlabeled database D stores a text collection obtained from the Internet and other channels. It contains E texts, where E is a positive integer and 1 ≤ E ≤ 7000. D is connected to the multi-model recognition module and the discrimination module, both of which read from it. D = {D_1, D_2, ..., D_e, ..., D_E}, where D_e denotes the e-th text in the unlabeled database. D_e is a text of length N, where N is a positive integer (measured in characters; D_e has length N and the...
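The database described in [0081] can be pictured as a bounded, 1-indexed collection of texts. The following is only a minimal sketch under stated assumptions: the class name `UnlabeledDatabase` and its methods `text` and `length` are hypothetical and do not come from the patent. Only the bound 1 ≤ E ≤ 7000, the 1-based index e, and the character-based length N are taken from the description.

```python
from dataclasses import dataclass
from typing import List

MAX_TEXTS = 7000  # upper bound on E stated in paragraph [0081]


@dataclass(frozen=True)
class UnlabeledDatabase:
    """Hypothetical container for D = {D_1, ..., D_E} (name is an assumption)."""

    texts: List[str]

    def __post_init__(self) -> None:
        # Enforce 1 <= E <= 7000 as stated in the description.
        if not 1 <= len(self.texts) <= MAX_TEXTS:
            raise ValueError(f"E must satisfy 1 <= E <= {MAX_TEXTS}")

    def text(self, e: int) -> str:
        """Return D_e, using the 1-based index e from the description."""
        if not 1 <= e <= len(self.texts):
            raise IndexError(f"e must satisfy 1 <= e <= {len(self.texts)}")
        return self.texts[e - 1]

    def length(self, e: int) -> int:
        """Return N, the length of D_e measured in characters."""
        return len(self.text(e))
```

Both the multi-model recognition module and the discrimination module would read from such a container. How the patent actually stores the texts is not stated in this excerpt.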

Abstract

The invention discloses a named entity recognition method based on a pre-trained language model. Its purpose is to improve the accuracy rate, recall rate and F1 value of named entity recognition and to recognize named entities when only a few samples are available. The technical solution is as follows. First, build a named entity recognition system consisting of a multi-model recognition module, a multi-level fusion module, a discrimination module, an entity label aligner and an unlabeled database. Next, use the initially trained model to label the unlabeled data, applying multi-model recognition and multi-level fusion to improve the quality of the automatic labeling, and use an SVM classifier to filter the automatically labeled data. Then retrain the model on the original training set together with the filtered automatically labeled data. Finally, use the trained named entity recognition system to perform multi-model recognition, multi-level fusion and entity label alignment on the target text to obtain the entities it contains. The invention improves the accuracy rate, recall rate and F1 value of entity recognition in few-sample scenarios.
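The automatic-labeling step in the abstract (several models label each unlabeled text, the predictions are fused, and a filter decides which automatically labeled samples join the training set) can be sketched as follows. This is an illustration under assumptions, not the patented procedure: token-level majority voting stands in for the multi-level fusion, whose details are not given in this excerpt, and a simple agreement threshold stands in for the SVM classifier. All names (`fuse_by_vote`, `agreement`, `pseudo_label`, the `accept` callback) are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Labels = List[str]
Model = Callable[[str], Labels]  # hypothetical: text -> one label per character
Sample = Tuple[str, Labels]


def fuse_by_vote(predictions: Sequence[Labels]) -> Labels:
    """Fuse per-character labels from several models by majority vote.

    ASSUMPTION: voting is a stand-in for the patent's multi-level fusion.
    Ties go to the label predicted by the earliest model in the list.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    return [
        Counter(p[i] for p in predictions).most_common(1)[0][0]
        for i in range(length)
    ]


def agreement(predictions: Sequence[Labels], fused: Labels) -> float:
    """Fraction of (model, position) pairs that match the fused label."""
    if not fused:
        return 0.0
    matches = sum(p[i] == fused[i] for p in predictions for i in range(len(fused)))
    return matches / (len(predictions) * len(fused))


def pseudo_label(
    models: Sequence[Model],
    texts: Sequence[str],
    accept: Callable[[Sequence[Labels], Labels], bool],
) -> List[Sample]:
    """Label each text with every model, fuse, and keep accepted samples.

    ASSUMPTION: `accept` plays the role of the discrimination module; the
    patent uses an SVM classifier there, which is not reproduced here.
    """
    kept: List[Sample] = []
    for text in texts:
        predictions = [model(text) for model in models]
        fused = fuse_by_vote(predictions)
        if accept(predictions, fused):
            kept.append((text, fused))
    return kept
```

A retraining round would then concatenate the original training set with the samples returned by `pseudo_label` and fit the models again. Passing a trained classifier's decision function as `accept` would bring the sketch closer to the SVM-based filtering the abstract describes.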

Description

Technical field

[0001] The invention relates to named entity recognition in the field of natural language processing, and in particular to a method for recognizing named entities in a text sequence based on a pre-trained language model.

Background technique

[0002] Natural language is the main tool for human communication and thinking, and it is an essential feature that distinguishes human beings from other animals. The various forms of human intelligence are closely related to language, and written words are the tool for recording language. Human logical thinking takes the form of language, and the vast majority of human knowledge is recorded and passed down in the form of language. The many words in a text can express rich semantic information and characteristic content, and help people understand the information the text is meant to convey. In the era of global intelligence and informatization, the extraction and processing technology of information in natural language has always...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06F16/35, G06F40/126, G06F40/295, G06F40/30, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/3344, G06F16/35, G06F40/126, G06F40/295, G06F40/30, G06N3/08, G06N3/044, G06F18/2411, G06F18/214
Inventor: 黄震, 陈一凡, 汪昌健, 郭敏, 李东升, 王博阳, 王安坤, 徐皮克
Owner: NAT UNIV OF DEFENSE TECH