
A Supervised Online Topic Model Learning Method Based on Sparse Implicit Feature Expression

A topic model and feature expression technology, applied in the field of supervised online topic model learning, that addresses the inability of existing methods to handle large-scale and streaming document input effectively, and that improves classification accuracy and model training speed.

Active Publication Date: 2016-06-15
BEIJING REALAI TECH CO LTD

AI Technical Summary

Problems solved by technology

However, existing implicit topic model techniques cannot effectively handle large-scale and streaming document input while also controlling the sparsity of the implicit features in the topic model.




Embodiment Construction

[0030] Specific implementations of the present invention are described in further detail below with reference to the accompanying drawings and embodiments. The following examples illustrate the present invention but are not intended to limit its scope. The online topic model learning method based on sparse implicit feature expression proposed by the present invention is described in detail below with reference to the embodiments.

[0031] As shown in Figure 2, this embodiment includes the following steps:

[0032] Step 1. The training set contains a total of D documents. Using the online learning method, select M documents from the D documents in the training set, and perform sparse-representation-based implicit feature extraction on these documents and on each word in them, obtaining an M×K feature matrix T, where each row of T is the feature vector of one document and K is the dimension...
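
The mini-batch selection in Step 1 can be sketched as follows. This is only an illustration of the shapes involved, under stated assumptions: the text does not say how the M documents are chosen (sampling without replacement is assumed here), and the actual sparse feature extraction is not given in this excerpt, so `extract_sparse_features` is a hypothetical stand-in that applies a single soft-thresholded projection. The names `sample_minibatch`, `soft_threshold`, `dictionary` and `lam` are likewise illustrative and do not come from the patent.

```python
import numpy as np


def sample_minibatch(num_docs, batch_size, rng):
    """Pick batch_size distinct document indices out of num_docs (assumed scheme)."""
    return rng.choice(num_docs, size=batch_size, replace=False)


def soft_threshold(x, lam):
    """Shrink values toward zero; entries with magnitude below lam become exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def extract_sparse_features(counts, dictionary, lam):
    """Hypothetical stand-in for the patent's sparse implicit feature extraction.

    counts:     (M, V) word-count matrix for the mini-batch
    dictionary: (K, V) topic/basis matrix
    returns:    (M, K) feature matrix T, one row per document
    """
    return soft_threshold(counts @ dictionary.T, lam)


rng = np.random.default_rng(0)
D, V, K, M = 100, 50, 8, 10          # corpus size, vocabulary, feature dim, batch size
corpus = rng.poisson(1.0, size=(D, V)).astype(float)   # synthetic word counts
dictionary = rng.normal(size=(K, V))                    # synthetic basis

batch_ids = sample_minibatch(D, M, rng)
T = extract_sparse_features(corpus[batch_ids], dictionary, lam=5.0)
print(T.shape)
```

Each online update touches only the M sampled rows, which is what lets the method process large or streaming collections without holding all D documents in a single pass.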



Abstract

The invention discloses a supervised online topic model learning method based on sparse implicit feature expression, and relates to the fields of data mining and machine learning. The method comprises the following steps: performing sparse-representation-based implicit feature extraction on the documents in a training set and on each word in those documents by an online learning method, so as to obtain several groups of feature vectors; training a classifier on the feature vectors of the training set and the class information of the training documents, so as to obtain the classifier's feature vectors, where each class of classifier feature vector corresponds to a document class in the training set; extracting the feature vectors of all documents to be recognized; and computing the inner product of each such document's feature vector with the classifier feature vector of each class, where the training-set class that yields the largest inner product is taken as the recognition result for that document. By adopting online learning, the method greatly increases model training speed, and by using the supervision information it increases classification accuracy.
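
The recognition step described in the abstract (inner products against one classifier feature vector per class, then taking the maximum) can be illustrated with a short sketch. The function name `classify` and the toy vectors are assumptions made for this example; how the classifier vectors are trained is not shown here.

```python
import numpy as np


def classify(doc_features, class_vectors):
    """Assign each document the class whose vector gives the largest inner product.

    doc_features:  (N, K) feature vectors of the documents to be recognized
    class_vectors: (C, K) one classifier feature vector per class
    returns:       (N,) predicted class indices
    """
    scores = doc_features @ class_vectors.T   # (N, C) matrix of inner products
    return scores.argmax(axis=1)


# Toy data: two classes in a three-dimensional feature space.
class_vectors = np.array([[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]])
doc_features = np.array([[0.9, 0.1, 0.0],
                         [0.0, 0.7, 0.3]])

predicted = classify(doc_features, class_vectors)
print(predicted)
```

In this toy case the first document scores 0.9 against class 0 and 0.1 against class 1, so it is assigned class 0; the second scores 0.0 and 0.7, so it is assigned class 1.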

Description

Technical field

[0001] The invention relates to the technical fields of data mining and machine learning, in particular to a supervised online topic model learning method based on sparse implicit feature expression.

Background

[0002] Implicit topic models have shown clear advantages in mining the semantic information of documents and processing complex document structures. In recent years, using implicit topic models to efficiently mine the structure of large-scale and streaming input documents has become a research hotspot in this field.

[0003] At present, existing methods for mining the semantic structure of documents with implicit topic models are mainly based on probabilistic models. Among the many models, the representative ones are Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Indexing (PLSI) and Latent Dirichlet Allocation (LDA). The main problems to be solved by using the implicit topic model to mine the semant...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30, G06F17/27
Inventors: 朱军, 张傲南, 张钹
Owner: BEIJING REALAI TECH CO LTD