Tag mining method, device, equipment, storage medium

A label mining technique based on vector representations, applied in data mining, unstructured text data retrieval, and special data processing applications. It addresses the problems of a small number of labels, low accuracy, and single-corpus models, with the effect of increasing the number of labels and improving accuracy.

Active Publication Date: 2022-03-15
BEIJING BYTEDANCE NETWORK TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] At present, there are two main problems: (1) the candidate corpus for label mining is generally a single corpus, so different corpora require different models to be trained; (2) named entity recognition generally recognizes labels of a single category, so accuracy is low and the number of labels is small.



Embodiment Construction

[0028] Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the art to which the present disclosure belongs. The terms used herein are for the purpose of describing specific embodiments only and are not intended to limit the disclosure. The terms "including" and "having", and any variations thereof, in the specification, the claims, and the above description of the drawings are intended to cover non-exclusive inclusion. The terms "first", "second", and the like in the specification, the claims, or the above drawings are used to distinguish different objects, not to describe a particular order.

[0029] Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present disclosure. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments. Those skilled in the art ...


Abstract

The disclosure provides a label (tag) mining method, apparatus, device, and storage medium. The label mining method includes: generating a first embedding vector representation of the corpus category of a text; generating a second embedding vector representation of the content information of the text; concatenating the first and second embedding vector representations and feeding the result into a deep neural network model to generate a third embedding vector representation; cutting the third embedding vector representation into a plurality of sub-segments by segment cutting; and performing multi-class classification on the sub-segments to mine the category labels of the text. With the present disclosure, a single model suffices for texts from multiple corpora, and label mining moves from single-category judgment to multi-category judgment, which improves label accuracy, increases the number of labels, and improves the user experience.
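The five steps in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: it assumes the third representation is a sequence of per-token vectors, that "segment cutting" means enumerating contiguous token spans up to a maximum length, and that a single tanh layer with random, untrained weights stands in for the deep neural network. The names `CORPORA`, `LABELS`, `encode`, `cut_segments`, and `mine_labels` are hypothetical, and the predicted labels are meaningless until the weights are trained.

```python
import math
import random

random.seed(0)

DIM = 4  # embedding width (hypothetical)
CORPORA = ["title", "comment"]  # hypothetical corpus categories
LABELS = ["none", "goods", "name", "location"]  # hypothetical label set


def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Untrained stand-in parameters (assumption: real weights would be learned).
corpus_emb = {c: rand_vec(DIM) for c in CORPORA}
token_emb = {}
W_enc = [rand_vec(2 * DIM) for _ in range(DIM)]
W_cls = [rand_vec(DIM) for _ in range(len(LABELS))]


def encode(tokens, corpus):
    """Steps 1-3: embed corpus category and content, concatenate, encode."""
    first = corpus_emb[corpus]  # first embedding: corpus category
    third = []
    for tok in tokens:
        second = token_emb.setdefault(tok, rand_vec(DIM))  # second embedding
        spliced = first + second  # list concatenation, length 2 * DIM
        third.append([math.tanh(x) for x in matvec(W_enc, spliced)])
    return third


def cut_segments(n_tokens, max_len=3):
    """Step 4: enumerate contiguous (start, end) spans of up to max_len tokens."""
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i + 1, min(i + max_len, n_tokens) + 1)
    ]


def mine_labels(tokens, corpus, max_len=3):
    """Step 5: classify every sub-segment into one of several label classes."""
    third = encode(tokens, corpus)
    results = []
    for start, end in cut_segments(len(tokens), max_len):
        span = third[start:end]
        pooled = [sum(col) / len(span) for col in zip(*span)]  # mean pooling
        probs = softmax(matvec(W_cls, pooled))
        best = probs.index(max(probs))
        results.append((tokens[start:end], LABELS[best], probs[best]))
    return results


if __name__ == "__main__":
    for segment, label, prob in mine_labels(["new", "york", "pizza"], "title"):
        print(segment, label, round(prob, 3))
```

Because the corpus-category embedding is concatenated onto every token, the same encoder and classifier weights serve all corpora, which is the sense in which one model covers multiple corpora; classifying each span over several classes is what turns single-category judgment into multi-category judgment.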

Description

Technical field

[0001] The present disclosure relates to the technical field of computer software, and more particularly to a label mining method, apparatus, device, and storage medium.

Background technique

[0002] Text often contains entities with specific meanings, such as goods, names, and locations. Label mining is one method of extracting such entities from text. Accurate and effective labels play an important role in text intent understanding, recommender systems, and similar applications. Commonly used label mining methods include: (1) dictionary-based methods; (2) rule-based methods; (3) machine-learning-based named entity extraction models.

[0003] Machine-learning-based named entity extraction models are currently the mainstream label mining method. They generally use sequence labeling to determine which part of the text is a key entity and to judge the category of that entity.

[0004] The main existing problems are as follows: ...
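The sequence-labeling approach described in [0003] can be illustrated with a small decoding sketch. It assumes the common BIO tagging scheme, which the source does not name, and the function `decode_bio` and the tag names `PER`, `GOODS`, and `LOC` are hypothetical. Each token carries exactly one tag, so each recognized entity receives a single category.

```python
def decode_bio(tokens, tags):
    """Collect (entity text, category) pairs from BIO-tagged tokens."""
    entities = []
    current_tokens, current_type = [], None

    def flush():
        # Emit the entity collected so far, if any.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()  # a new entity begins; close the previous one
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)  # continue the open entity
        else:
            flush()  # "O" or an inconsistent I- tag ends the entity
            current_tokens, current_type = [], None
    flush()
    return entities


tokens = ["Alice", "bought", "a", "phone", "in", "New", "York"]
tags = ["B-PER", "O", "O", "B-GOODS", "O", "B-LOC", "I-LOC"]
print(decode_bio(tokens, tags))
# [('Alice', 'PER'), ('phone', 'GOODS'), ('New York', 'LOC')]
```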


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35
CPC: G06F16/353; G06F2216/03
Inventor 刘乾超杨建东王竞豪周旻平兰枫郝卓琳
Owner BEIJING BYTEDANCE NETWORK TECH CO LTD