Method and device for model training and named entity recognition

A model training and named entity recognition technology, applied in the fields of neural learning methods, biological neural network models, instruments, etc.

Active Publication Date: 2020-09-29
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

The sparsity of training data poses a major challenge for model training.



Embodiment Construction

[0079] The solutions provided in this specification will be described below in conjunction with the accompanying drawings.

[0080] Figure 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. A word segment sequence containing multiple word segments is input into the first recurrent neural network, which outputs a hidden vector for each word segment. Based on each hidden vector, the probability distribution of the corresponding word segment over the categories can be determined, and from these probabilities the classification result of each word segment is obtained, that is, the category label to which each word segment corresponds. Categories can be represented by labels. SOS is the start symbol of the word segment sequence, and EOS is the end symbol of the word segment sequence.
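The flow in paragraph [0080] can be illustrated with a minimal sketch. It assumes a plain Elman-style recurrent cell with a softmax output layer; the patent excerpt does not specify the cell type, so the parameter names (`emb`, `W_xh`, `W_hh`, `W_hy`), the toy vocabulary, and the label set are all hypothetical. The weights are random and untrained, so the printed labels are arbitrary.

```python
import math
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    """Random weights; a trained model would supply these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def tag_sequence(token_ids, params, labels):
    """Return hidden vectors, per-category probabilities and labels per token."""
    hidden = [0.0] * len(params["W_hh"])
    hidden_vectors, probabilities, predicted = [], [], []
    for token_id in token_ids:
        x = params["emb"][token_id]
        pre = [a + b for a, b in zip(matvec(params["W_xh"], x),
                                     matvec(params["W_hh"], hidden))]
        hidden = [math.tanh(p) for p in pre]             # hidden vector of this token
        probs = softmax(matvec(params["W_hy"], hidden))  # probability per category
        hidden_vectors.append(hidden)
        probabilities.append(probs)
        predicted.append(labels[probs.index(max(probs))])
    return hidden_vectors, probabilities, predicted

# Hypothetical toy setup: SOS/EOS mark the start and end of the sequence.
rng = random.Random(0)
vocab = ["SOS", "alice", "visits", "hangzhou", "EOS"]
labels = ["O", "PER", "LOC"]
emb_dim, hid_dim = 4, 6
params = {
    "emb": rand_matrix(len(vocab), emb_dim, rng),
    "W_xh": rand_matrix(hid_dim, emb_dim, rng),
    "W_hh": rand_matrix(hid_dim, hid_dim, rng),
    "W_hy": rand_matrix(len(labels), hid_dim, rng),
}
token_ids = [vocab.index(token) for token in vocab]
hidden_vectors, probabilities, predicted = tag_sequence(token_ids, params, labels)
for token, label in zip(vocab, predicted):
    print(token, label)
```

Each step reuses the previous hidden vector, which is what makes the network recurrent: the label chosen for a word segment depends on the segments that precede it.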

[0081] Named entities (Entity), also known as entity words, have the nature ...


Abstract

The embodiments of this specification provide a method and device for model training and named entity recognition. During model training, the first named entity in the first sample sequence is replaced with a first preset character to obtain a second sample sequence, and a text segment containing the first preset character is determined from the second sample sequence. The first recurrent neural network recursively determines the hidden vectors of the word segments in the second sample sequence, and the representation vector of the text segment is determined. A variational autoencoder constructs a Gaussian distribution based on the representation vector and determines a global hidden vector for the text segment. The first recurrent neural network, taking the global hidden vector as the initial hidden vector, recursively determines the decoding hidden vectors of the word segments in the text segment, and the predicted values of those word segments are determined. A prediction loss value is determined based on the difference between the word segments in the text segment and their predicted values, together with the distribution difference, and the first recurrent neural network and the variational autoencoder are updated in the direction of reducing the prediction loss value.
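Three of the steps in the abstract can be sketched in a few lines: replacing a named entity with a preset character, sampling a global hidden vector from a Gaussian, and combining the reconstruction term with the distribution difference. This is a sketch under assumptions, not the patented procedure. The mask string `[ENT]`, the context `window`, the use of the reparameterization trick, and the choice of a KL divergence against a standard normal prior as the "distribution difference" are all assumptions; the excerpt does not state them.

```python
import math
import random

MASK = "[ENT]"  # hypothetical preset character

def mask_entity(tokens, start, end, window=1):
    """Replace tokens[start:end] with MASK and cut a text segment around it.

    The segment keeps `window` tokens of context on each side (an assumption).
    """
    masked = tokens[:start] + [MASK] + tokens[end:]
    lo = max(0, start - window)
    hi = min(len(masked), start + 1 + window)
    return masked, masked[lo:hi]

def sample_latent(mean, log_var, rng):
    """Reparameterization: z = mean + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]

def kl_to_standard_normal(mean, log_var):
    """KL divergence from N(mean, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))

def prediction_loss(target_probs, mean, log_var):
    """Negative log-likelihood of the true word segments plus the KL term.

    target_probs holds the probability the decoder gave each true segment.
    """
    nll = -sum(math.log(p) for p in target_probs)
    return nll + kl_to_standard_normal(mean, log_var)

tokens = ["alice", "visited", "hangzhou", "yesterday"]
masked, segment = mask_entity(tokens, 2, 3)
print(masked)   # ['alice', 'visited', '[ENT]', 'yesterday']
print(segment)  # ['visited', '[ENT]', 'yesterday']

rng = random.Random(0)
z = sample_latent([0.0, 0.0], [0.0, 0.0], rng)  # global hidden vector sample
loss = prediction_loss([0.5, 0.5], [0.0, 0.0], [0.0, 0.0])
print(round(loss, 4))  # 1.3863
```

In a full implementation the mean and log-variance would be produced from the representation vector of the text segment, and `z` would initialize the recurrent network used for decoding; here they are fixed values chosen only to make the arithmetic easy to check.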

Description

Technical field

[0001] One or more embodiments of this specification relate to the technical field of natural language processing, and in particular to methods and devices for model training and named entity recognition.

Background art

[0002] In the field of natural language processing, the classification of named entities (Entity) in text sequences is an important research direction. Named entities are nouns in terms of part of speech, and include person names, organization names, place names, and all other entity categories identified by names. More broadly, named entities also include categories such as numbers, dates, currencies, and addresses. Accurately recognizing the categories of named entities can improve the accuracy and effectiveness of natural language processing.

[0003] Usually, a training set is used to train a model for recognizing named entities, and after the model is trained, a test set is used to test the model. A major...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/284, G06F40/295, G06N3/04, G06N3/08
CPC: G06F40/284, G06F40/295, G06N3/049, G06N3/084, G06N3/044, G06N3/045
Inventor: 李扬名, 李小龙, 姚开盛
Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD