Subject heading classification model creation method and device and storage medium

A technology for creating subject heading classification models, applied in the field of data processing. It addresses the poor accuracy, complex creation process, and high creation cost of existing subject heading classification models, and thereby reduces creation cost, improves accuracy, and simplifies the creation process.

Active Publication Date: 2017-11-07
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a method for creating a subject heading classification model, a creation device, and a storage medium that create the model accurately, with a simple creation process and a low creation cost. They address the technical problem that, in existing creation methods, creation devices, and storage media, the subject heading classification model has poor accuracy, or its creation process is complicated and its creation cost is high.

Examples

Embodiment Construction

[0031] Referring to the drawings, wherein like reference numerals represent like components, the principles of the present invention are exemplified when implemented in a suitable computing environment. The following description is based on illustrated specific embodiments of the invention, which should not be construed as limiting other specific embodiments of the invention not described in detail herein.

[0032] In the following description, specific embodiments of the present invention are described with reference to steps and symbols for operations performed by one or more computers, unless otherwise stated. Accordingly, it will be understood that the steps and operations, which at times are referred to as being performed by a computer, include manipulation by a computer processing unit of electronic signals representing data in a structured form. This manipulation transforms the data or maintains it at a location in the computer's memory system that can reconfigure or ot...

Abstract

The invention provides a method for creating a subject heading classification model. The method comprises: obtaining multiple model training documents and extracting their label terms; obtaining, based on a similarity algorithm, the core theme phrases corresponding to the label terms; obtaining, based on a map content library, a first model training document collection corresponding to the core theme phrases; performing, based on a machine learning algorithm, a classification operation on the model training documents; obtaining, based on the map content library, the subject type identifiers of all model training documents corresponding to the label terms, and determining from those identifiers a second model training document collection corresponding to the label terms; and creating a subject heading classification model for the label terms by taking the model training documents that appear in both the first and the second collection as positive samples and the other model training documents in the map content library as negative samples. The invention further provides a subject heading classification model creation device and a storage medium.
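The final sample-selection step of the abstract can be sketched in Python as plain set operations. This is only an illustration under stated assumptions: the function name `build_training_samples`, the use of string document IDs, and the example IDs are hypothetical, and the abstract does not say how the collections are stored or which classifier is trained on the resulting samples.

```python
from typing import Set, Tuple


def build_training_samples(
    first_collection: Set[str],
    second_collection: Set[str],
    library_doc_ids: Set[str],
) -> Tuple[Set[str], Set[str]]:
    """Select positive and negative samples for one label term.

    All names are hypothetical. Documents are represented by string IDs.
    - first_collection: documents matched through the core theme phrases
    - second_collection: documents matched through subject type identifiers
    - library_doc_ids: every model training document in the content library
    """
    # Positive samples: documents that appear in both collections.
    positives = first_collection & second_collection
    # Negative samples: every other document in the library.
    negatives = library_doc_ids - positives
    return positives, negatives


if __name__ == "__main__":
    # Hypothetical document IDs, for illustration only.
    first = {"d1", "d2", "d3"}
    second = {"d2", "d3", "d4"}
    library = {"d1", "d2", "d3", "d4", "d5"}
    pos, neg = build_training_samples(first, second, library)
    print(sorted(pos))
    print(sorted(neg))
```

Requiring a document to appear in both collections makes the positive set stricter than either collection alone, which is one plausible reading of why the abstract combines the two retrieval paths before training.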

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to a creation method, a creation device, and a storage medium for a subject heading classification model.

Background

[0002] In an Internet content distribution system, articles need to be classified by keywords. Keywords are words that represent the main content characteristics of an article, so that users can quickly and conveniently understand its content through them.

[0003] Existing article subject headings are generally tag words that appear in the article: the tag word extraction algorithm requires that an article's tag words actually occur in its text, which greatly limits the abstraction and generalization ability of the article's subject headings. For example, the tag word "black technology" may not appear in an article describing a specific black technology, so that the subject word of the article cann...
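The limitation described in [0003] can be illustrated with a minimal Python sketch. The function `extract_tags`, the sample article, and the candidate vocabulary are all hypothetical and are not taken from the patent. The sketch only shows that a tagger restricted to words occurring in the text cannot emit an abstract tag such as "black technology".

```python
from typing import List


def extract_tags(article: str, vocabulary: List[str]) -> List[str]:
    """Hypothetical extraction-based tagger.

    Returns only those candidate tags that literally occur in the article
    text (case-insensitive substring match).
    """
    text = article.lower()
    return [tag for tag in vocabulary if tag.lower() in text]


if __name__ == "__main__":
    # Hypothetical article about a specific piece of "black technology".
    article = (
        "The new foldable phone uses a graphene battery "
        "that charges in five minutes."
    )
    vocabulary = ["graphene battery", "foldable phone", "black technology"]
    tags = extract_tags(article, vocabulary)
    print(tags)
    # The abstract tag never occurs in the text, so it is never returned.
    print("black technology" in tags)
```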

Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/355, G06F18/214
Inventor: 孙子荀
Owner: TENCENT TECH (SHENZHEN) CO LTD