Method and device for generating label sequence of observation character strings

A technology for sequence generation and character strings, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the problems of limited use cases, difficulty of use, and slow emission matrix calculation, with the effect of reducing the number of matching operations and improving speed.

Active Publication Date: 2015-03-25
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, the prior art has at least the following problem: it requires a large number of feature string matches and weight additions, which makes calculating the emission matrix very slow. In particular, as the length of the longest feature string increases, the speed of generating the label sequence drops sharply, with the time cost proportional to the cube of the length of the longest feature string. This slowness severely limits the use cases of the technique and makes it difficult to use in many products.
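
As a rough illustration of the cost described above, the sketch below shows a hypothetical baseline in which every suffix of an observation string is matched against the model and each hit triggers a full weight addition. The function name `naive_emission_row` and the dictionary layout (feature string mapped to a list of per-label weights) are assumptions made for illustration and are not taken from the patent. The sketch does not attempt to reproduce the cubic cost quoted above; it only shows where the repeated matches and additions come from.

```python
def naive_emission_row(obs, model, num_labels):
    """Hypothetical baseline: match every suffix of `obs` and add its weights.

    `model` maps a feature string to a list of per-label weights
    (an assumed layout, not the patent's data structure).
    Returns the emission row and the number of feature strings matched.
    """
    row = [0.0] * num_labels
    matches = 0
    for start in range(len(obs)):
        suffix = obs[start:]
        if suffix in model:
            matches += 1
            # One weight addition per label for every matching suffix.
            for k in range(num_labels):
                row[k] += model[suffix][k]
    return row, matches


model = {"a": [1.0, 0.0], "ba": [0.5, 2.0], "cba": [0.0, 1.0]}
row, matches = naive_emission_row("cba", model, 2)
```

In this toy model all three suffixes of `"cba"` are feature strings, so a single observation string costs three lookups that hit and three rounds of weight addition.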

Embodiment Construction

[0011] The present invention proposes a new technique for generating a label sequence for observation character strings. Weight pre-addition is performed on a pre-trained second feature labeling model to obtain a first feature labeling model, which contains multiple feature strings and the pre-added weights of their labels. For example, the pre-addition may proceed as follows: for each feature string in the second feature labeling model that is a suffix of a longer feature string in the same model, the weights of its labels are added to the weights of the corresponding labels of the longer feature string. The result is the pre-added weights of the labels of the longer feature string, and these weights make up the first feature labeling model. After the user inputs at least one observation string, for any observation string input by the user, find out the longest feature strin...
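
The pre-addition step described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `pre_add_weights` and the dictionary layout (feature string mapped to a list of per-label weights) are hypothetical.

```python
def pre_add_weights(second_model):
    """Build a pre-added model from a trained one (illustrative sketch).

    `second_model` maps a feature string to a list of per-label weights
    (an assumed layout). For every feature string, the weights of all
    feature strings that are proper suffixes of it are added in, label
    by label.
    """
    first_model = {}
    for feat, weights in second_model.items():
        total = list(weights)
        # Proper suffixes only: feat[1:], feat[2:], ...
        for start in range(1, len(feat)):
            suffix = feat[start:]
            if suffix in second_model:
                total = [a + b for a, b in zip(total, second_model[suffix])]
        first_model[feat] = total
    return first_model


second_model = {"a": [1.0, 0.0], "ba": [0.5, 2.0], "cba": [0.0, 1.0]}
first_model = pre_add_weights(second_model)
```

Because the sums are computed once, ahead of time, a later lookup only needs the single longest matching feature string rather than every matching suffix.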

Abstract

The invention provides a method and device for generating a label sequence of observation character strings. The method comprises the following steps: at least one observation character string input by a user is received; an emission matrix is generated according to the number of observation character strings and the number of labels, and the value of every row and column of the emission matrix is initialized to zero; for any observation character string, the longest feature string located at the end of that observation character string is looked up in a pre-trained first feature labeling model, and the pre-added weights corresponding to that longest feature string are added to the values in the row of the emission matrix corresponding to the observation character string, where the first feature labeling model comprises a plurality of feature strings and the pre-added weights of their labels; the label sequence of the at least one observation character string is generated according to the emission matrix with the weights added and a pre-trained transition matrix. The speed of generating the label sequence of observation character strings is thereby improved.
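
The steps of the abstract can be sketched end to end as below. This is an illustrative sketch under several assumptions: the model is a plain dictionary from feature string to per-label pre-added weights, scores are additive, and the final decoding uses a standard Viterbi (max-sum) search. The abstract only says that the label sequence is generated from the emission matrix and the transition matrix, so the choice of Viterbi and all function names here are hypothetical.

```python
def longest_feature_suffix(obs, first_model):
    """Return the longest suffix of `obs` that is a feature string, or None."""
    for start in range(len(obs)):  # longest suffix first
        if obs[start:] in first_model:
            return obs[start:]
    return None


def build_emission_matrix(observations, first_model, num_labels):
    """One zero-initialized row per observation string, one column per label."""
    emission = [[0.0] * num_labels for _ in observations]
    for i, obs in enumerate(observations):
        feat = longest_feature_suffix(obs, first_model)
        if feat is not None:
            # A single addition of the pre-added weights per observation string.
            for k in range(num_labels):
                emission[i][k] += first_model[feat][k]
    return emission


def viterbi(emission, transition):
    """Max-sum decoding; transition[i][j] scores moving from label i to label j."""
    if not emission:
        return []
    num_labels = len(emission[0])
    score = list(emission[0])
    back = []
    for t in range(1, len(emission)):
        new_score, pointers = [], []
        for j in range(num_labels):
            best = max(range(num_labels), key=lambda i: score[i] + transition[i][j])
            new_score.append(score[best] + transition[best][j] + emission[t][j])
            pointers.append(best)
        score = new_score
        back.append(pointers)
    path = [max(range(num_labels), key=lambda j: score[j])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


first_model = {"a": [1.0, 0.0], "ba": [0.0, 2.0]}  # toy pre-added weights
transition = [[0.0, 0.0], [0.0, 0.5]]              # toy transition scores
emission = build_emission_matrix(["xa", "ba", "zz"], first_model, 2)
labels = viterbi(emission, transition)
```

In this toy run `"xa"` matches only `"a"`, `"ba"` matches the longer feature string `"ba"`, and `"zz"` matches nothing, so its row stays at zero.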

Description

Technical field

[0001] The invention relates to natural language processing technology, in particular to a method and device for generating a label sequence of observation character strings.

Background

[0002] Generating a label sequence from a given observation sequence is the main task when a sequence labeling model is used in actual products. Sequence labeling is a type of machine learning model: given an observation sequence, it automatically labels each element of the sequence to obtain a label sequence. Sequence labeling can be used for natural language processing tasks such as word segmentation, part-of-speech tagging, and named entity recognition. These tasks are basic components of products such as search engines and machine translation, and have extensive practical value. However, the above-mentioned speed of generating label sequences is usually relatively slow, which is one of the main obstacles that limit the...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F19/00
Inventors: 张开旭, 石磊, 詹金波
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD