Parameter inference method, computing device and system based on latent Dirichlet model

A latent Dirichlet model and computing device technology, applied in the field of information retrieval, that addresses the poor solution accuracy of the LDA model and thereby improves accuracy.

Active Publication Date: 2012-05-02
HONOR DEVICE CO LTD
Cites: 3 · Cited by: 20

AI Technical Summary

Problems solved by technology

[0013] Embodiments of the present invention provide a parameter inference method, computing device and system based on a latent Dirichlet model, to solve the problem of poor solution accuracy of the LDA model caused by an inaccurate, manually entered number of topics.



Examples


Embodiment Construction

[0034] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0035] In the following embodiments, the "first hyperparameter" refers to the hyperparameter of the "text-topic" distribution, and the "second hyperparameter" refers to the hyperparameter of the "topic-word" distribution over the given number of topics. By studying the "text-topic" and "topic-word" distributions in the LDA results, we can learn which topics the author of a text is interested in and the proportion of each topic covered by each text. ...
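
To make the role of the two hyperparameters concrete, the following is a minimal sketch, not taken from the patent, of how they shape the two distributions in the standard LDA generative view. It assumes symmetric Dirichlet priors; the names alpha (first hyperparameter), beta (second hyperparameter), sample_dirichlet and sample_lda_priors are hypothetical and chosen only for illustration.

```python
import random

def sample_dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet of dimension `size`."""
    # Normalizing independent Gamma(concentration, 1) draws gives a Dirichlet sample.
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def sample_lda_priors(num_texts, num_topics, vocab_size, alpha, beta, seed=0):
    """Sample a "text-topic" and a "topic-word" distribution (assumed symmetric priors)."""
    rng = random.Random(seed)
    # alpha: hypothetical name for the first hyperparameter ("text-topic").
    text_topic = [sample_dirichlet(alpha, num_topics, rng) for _ in range(num_texts)]
    # beta: hypothetical name for the second hyperparameter ("topic-word").
    topic_word = [sample_dirichlet(beta, vocab_size, rng) for _ in range(num_topics)]
    return text_topic, topic_word

text_topic, topic_word = sample_lda_priors(
    num_texts=3, num_topics=4, vocab_size=6, alpha=0.5, beta=0.5
)
```

Each row of text_topic gives the proportion of topics covered by one text, and each row of topic_word gives the word probabilities of one topic. Under these assumptions, smaller hyperparameter values tend to concentrate each row on fewer entries.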


Abstract

The embodiment of the invention provides a parameter inference method, a computing device and a system based on a latent Dirichlet model, relating to the field of information retrieval, and mainly solving the problem that the solution accuracy of an LDA model is poor because the manually entered number of topics is inaccurate. The method comprises the steps of: computing the LDA model according to the set initial first hyperparameter, initial second hyperparameter, initial number of topics, initial global text-topic count array and topic-word count array, to obtain a probability distribution; using an expectation maximization algorithm to obtain the number of topics, the first hyperparameter and the second hyperparameter that maximize the log-likelihood function of the probability distribution; determining whether the number of topics, the first hyperparameter and the second hyperparameter have converged; and, if not, substituting the number of topics, the first hyperparameter and the second hyperparameter into the LDA model and recomputing, until the optimal number of topics, optimal first hyperparameter and optimal second hyperparameter that maximize the log-likelihood function of the probability distribution have converged. The embodiment of the invention is used for text parameter inference.
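
The iteration described in the abstract can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the patented implementation: log_likelihood uses the standard collapsed LDA likelihood with symmetric priors, which may differ from the likelihood in the patent, and fit_lda and maximize are hypothetical callbacks standing in for the LDA computation and the expectation maximization step, neither of which is specified in this excerpt.

```python
import math

def log_likelihood(text_topic, topic_word, alpha, beta):
    """Collapsed log p(words, topic assignments | alpha, beta), symmetric priors assumed."""
    num_topics = len(topic_word)
    vocab_size = len(topic_word[0])
    total = 0.0
    for row in text_topic:  # one row of topic counts per text
        total += math.lgamma(num_topics * alpha) - num_topics * math.lgamma(alpha)
        total += sum(math.lgamma(n + alpha) for n in row)
        total -= math.lgamma(sum(row) + num_topics * alpha)
    for row in topic_word:  # one row of word counts per topic
        total += math.lgamma(vocab_size * beta) - vocab_size * math.lgamma(beta)
        total += sum(math.lgamma(n + beta) for n in row)
        total -= math.lgamma(sum(row) + vocab_size * beta)
    return total

def infer_parameters(fit_lda, maximize, k, alpha, beta, tol=1e-4, max_iter=50):
    """Repeat 'compute LDA, then maximize' until k, alpha and beta stop changing.

    fit_lda(k, alpha, beta) -> counts        (hypothetical callback)
    maximize(counts, k, alpha, beta) -> (k, alpha, beta)  (hypothetical callback)
    """
    for _ in range(max_iter):
        counts = fit_lda(k, alpha, beta)
        new_k, new_alpha, new_beta = maximize(counts, k, alpha, beta)
        converged = (
            new_k == k
            and abs(new_alpha - alpha) < tol
            and abs(new_beta - beta) < tol
        )
        k, alpha, beta = new_k, new_alpha, new_beta
        if converged:
            break
    return k, alpha, beta
```

In this sketch a maximize callback could use log_likelihood to score candidate values. The convergence test (an unchanged number of topics and hyperparameter changes below tol) is an assumption; the excerpt does not state the criterion actually used.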

Description

Technical Field

[0001] The invention relates to the field of information retrieval, in particular to a parameter inference method, computing device and system based on a latent Dirichlet model.

Background Art

[0002] With the rapid development of the Internet, the information on the Internet is increasing exponentially. Faced with such a massive amount of information resources, how to obtain the information they need efficiently and quickly is becoming more and more important to people. To improve the quality and efficiency of users' information retrieval, many powerful information retrieval tools, namely search engines, have appeared one after another. While search engines bring great convenience to people, they also expose many deficiencies in search technology that uses keywords as the basic index unit: on the one hand, no matter what keywords users submit, too many results are returned, among which the information that the user really needs often only accounts...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06K9/6218, G06F17/30, G06N5/048, G06F17/30011, G06F16/93, G06F18/23
Inventor: 科比洛夫·维拉迪斯拉维, 文刘飞, 施广宇
Owner: HONOR DEVICE CO LTD