76 results about "Topic mining" patented technology

System and method for retrieving and presenting concept centric information in social media networks

The embodiments herein provide a system and method for retrieving and presenting concept-centric information in social media networks that allow a user to search, read, express, and debate opinions on a particular concept. The system comprises an input module for receiving an input query, a visualization module for visualizing the retrieved concept-centric opinions, a topic mining module for mining semantically related topics, a tweets tracking module for tracking influential tweets, an interesting comments searching module for searching comments posted by interested users from social media networks, public forums, blogs and other community portals, a posted comments counting module for calculating the total number of posts processed, a comments posting module for allowing users to post comments, a BUZZ words display module for displaying context words, and a new posts display module for displaying new posts related to the user input query.
Owner:VEOOZ LABS

Personalized webpage recommendation method based on topic and relative entropy

The present invention discloses a personalized webpage recommendation method based on topics and relative entropy. First, an LDA (latent Dirichlet allocation) model is used to carry out topic mining on webpage content and user reading behaviors and to calculate a topic-based webpage semantic feature vector and a topic-based user interest feature vector. Then a similarity measure based on the concept of relative entropy is used to calculate the similarity between the semantic feature vector of a webpage to be recommended and the user interest feature vector, and the obtained similarity is used as the decision basis for personalized webpage recommendation. The method avoids the heavy computing cost of collaborative filtering methods. Because topics, instead of keywords, are used to represent webpage content, the recommendation process and the recommendation results can more comprehensively and accurately reflect the hidden information and deep semantic features of the webpage content.
Owner:SOUTHEAST UNIV
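
The relative-entropy comparison described in the abstract above can be illustrated with a minimal Python sketch. The abstract does not give the exact similarity formula, so the symmetrised form and the 1 / (1 + d) mapping below are assumptions, and the names `kl_divergence`, `symmetric_kl_similarity` and `rank_pages` are hypothetical. The topic vectors are made-up stand-ins for LDA output.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) of two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl_similarity(p, q):
    """Assumed mapping: symmetrised relative entropy turned into a score in (0, 1]."""
    d = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 / (1.0 + d)

def rank_pages(user_vec, pages):
    """Order candidate pages by similarity to the user interest vector, best first."""
    return sorted(pages, key=lambda name: symmetric_kl_similarity(user_vec, pages[name]), reverse=True)

# Toy topic distributions over three topics (illustrative values only).
user_interest = [0.7, 0.2, 0.1]
candidate_pages = {"page_a": [0.6, 0.3, 0.1], "page_b": [0.1, 0.2, 0.7]}
ranking = rank_pages(user_interest, candidate_pages)
```

Plain relative entropy is asymmetric, which is why this sketch averages both directions before converting the divergence to a similarity; a real system may choose differently.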

Topic mining using natural language processing techniques

The disclosed embodiments provide a method, system and apparatus for processing data. During operation, the system obtains a set of content items containing unstructured data. Next, the system obtains a set of part-of-speech (POS) tags for lexical items in the set of content items. The system then uses a computer to match the POS tags to one or more POS tagging patterns to obtain a set of candidate topics for the set of content items and extract a set of topics for the set of content items from the set of candidate topics.
Owner:MICROSOFT TECH LICENSING LLC
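
As a rough illustration of matching POS tags against tagging patterns, the Python sketch below works on tokens that are already tagged. The two patterns, the Penn Treebank style tag names, the frequency cutoff used for the final extraction step, and the function names are all assumptions made for this example, not details taken from the abstract.

```python
from collections import Counter

# Hypothetical POS tagging patterns: adjective + noun, noun + noun.
PATTERNS = [("JJ", "NN"), ("NN", "NN")]

def candidate_topics(tagged_tokens, patterns=PATTERNS):
    """Return every token span whose tag sequence equals one of the patterns."""
    tags = [tag for _, tag in tagged_tokens]
    candidates = []
    for pattern in patterns:
        n = len(pattern)
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == pattern:
                words = [tok.lower() for tok, _ in tagged_tokens[i:i + n]]
                candidates.append(" ".join(words))
    return candidates

def extract_topics(documents, min_count=2):
    """Keep candidates seen at least min_count times (an assumed selection rule)."""
    counts = Counter()
    for doc in documents:
        counts.update(candidate_topics(doc))
    return [topic for topic, count in counts.most_common() if count >= min_count]

docs = [
    [("Great", "JJ"), ("salary", "NN"), ("and", "CC"), ("remote", "JJ"), ("work", "NN")],
    [("Remote", "JJ"), ("work", "NN"), ("is", "VBZ"), ("nice", "JJ")],
]
topics = extract_topics(docs)
```

In practice the tags would come from a POS tagger; they are hard-coded here so the example has no external dependencies.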

Short text topic model mining method based on word network to extend characteristics

A short text topic model mining method based on a word network to extend characteristics comprises a weighted word network construction step, a short text characteristics extension step, and a topic mining step. The weighted word network construction step comprises preprocessing the text, performing Chinese word segmentation on the text in a short text corpus, and deleting stop words; then establishing a weighted word network from the segmented documents, wherein the nodes in the weighted word network are words, each edge between nodes is the co-occurrence relation of two words in the same document, and the weight of the edge is the number of co-occurrences of the two words in the whole corpus. The short text characteristics extension step comprises using the word nodes included in each segmented short text as a community of the established weighted word network. This approach, which addresses the sparsity of short text characteristics based on word network community modularity, solves the problem that an LDA topic model performs poorly on short text and increases the accuracy of the short text topic model.
Owner:NANJING UNIV
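
The weighted word network construction step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: documents are taken to be already segmented into tokens, each document contributes at most one co-occurrence per word pair, and `build_word_network` is a hypothetical name.

```python
from collections import Counter
from itertools import combinations

def build_word_network(segmented_docs, stop_words=frozenset()):
    """Map each unordered word pair to the number of documents in which both words occur."""
    edges = Counter()
    for tokens in segmented_docs:
        # Deduplicate and sort so every pair has one canonical (a, b) key.
        vocab = sorted({t for t in tokens if t not in stop_words})
        for a, b in combinations(vocab, 2):
            edges[(a, b)] += 1
    return edges

docs = [
    ["topic", "model", "short"],
    ["topic", "model"],
    ["short", "text", "the"],
]
network = build_word_network(docs, stop_words={"the"})
```

The community-based extension step and the LDA step are not shown; they would consume `network` as their input graph.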

Subtopic mining method

The invention provides a subtopic mining method. The method comprises the steps of: (1) initializing a subject value for each term of each document in a corpus; (2) based on the current subject values of all terms of all documents, calculating the probability of each term in each article coming from each subtopic and the probability of each term coming from a background model, and then reassigning a subject value to each term in each article through a Gibbs sampling algorithm based on the calculated probabilities, wherein the probability of each term coming from the background model is calculated from the term distribution vectors of the background model, which are computed statistically in advance and remain constant throughout the iteration process; and (3) if the iteration stop conditions are met, obtaining the LDA subtopics according to the current subject value information, and otherwise returning to step (2). Through the method, the topic mining effect targeting a feature article set can be remarkably improved.
Owner:INST OF COMPUTING TECH CHINESE ACAD OF SCI

Low-rank decomposition based delicate topic mining method

The invention discloses a low-rank decomposition based delicate topic mining method. The method comprises the following steps: performing word segmentation and stop-word removal on an original corpus text; generating a topic matrix on the basis of the word frequency matrix obtained through pre-processing; and decomposing the original corpus text into topic background and keywords by means of the topic matrix. The method proposes a fine-grained model for expressing text content without introducing a new latent variable. The model uses an LDA (Latent Dirichlet Allocation) model as the basis to extract the topic distribution of a text collection and, because text topics are constituted by different aspects, applies robust principal component analysis, an improved form of principal component analysis, to decompose each topic into a low-rank part and a sparse part. The low-rank part represents the common words under the topic, and the sparse part captures the fine-grained descriptions from different angles under the topic. Fine-grained expression of a text is thereby achieved, which effectively solves the problem that conventional topic models can only mine the topic background of a text and cannot describe its points of emphasis in detail.
Owner:INST OF ELECTRONICS CHINESE ACAD OF SCI

Topic mining sentiment analysis method based on user feature optimization

Active; CN109933657A; Effective topic mining; Effective emotion prediction; Data processing applications; Special data processing applications; Multiple perspective; Affective forecasting
The invention belongs to sentiment analysis and topic mining tasks in the field of natural language processing, and particularly relates to a topic mining sentiment analysis method based on user feature optimization. The method comprises: S1, establishing a multi-dimensional topic and emotion joint model MTSM based on an LDA topic model, in which text information, time, user characteristics and emotion tags are fused; S2, training the model on the training corpus and solving for the model parameters; and S3, using the trained model to carry out topic mining and emotion prediction on the test corpus. The invention targets the characteristics of social network text. Its advantages are that four dimensions of information (text information, time, user characteristics and emotion tags) are effectively integrated, the generation process of social network text is redefined, a multi-dimensional topic-emotion joint model is established, topic information is observed and compared from multiple perspectives, and the emotion prediction accuracy for social network text is improved.
Owner:SUN YAT SEN UNIV

Online internet topic mining method based on improved LDA model

The invention discloses an online internet topic mining method based on an improved LDA model. The method is a continuous, streaming topic mining process conducted in segments: n web pages are processed each time, the web pages are usually acquired by web crawlers from the internet online and in real time, and mining the contents of the web pages generates k topics. After the current n web pages are processed, the next n newly acquired web pages are processed through the same procedure. The process mainly includes initialization of the On-LDA model hyper-parameters, dynamic updating of the On-LDA model hyper-parameters, and internet topic mining based on the On-LDA model. The method fundamentally changes how the hyper-parameters of a traditional LDA model are assigned and used in the topic mining process: the category information to which the web page contents belong is fully utilized to assign initial values to the model hyper-parameters, so that the initial values depend entirely on the web page contents to be mined, and the computing process is simplified while remaining reasonable.
Owner:SOUTHEAST UNIV

Cross-domain knowledge discovery-oriented topic mining method

The invention discloses a cross-domain knowledge discovery-oriented topic mining method, comprising: constructing a source domain text set and a target domain set; extracting potential class feature information and potential semantic information from the source domain text set; extracting potential feature information and potential semantic information from the target domain set; automatically aggregating a text in the target domain set into a potential style component; modeling the semantic information of the target domain set in a potential topic component; and modeling the potential topic component of the semantic information of the target domain set. The method has the following advantages: features of texts in a source domain are automatically mined for identifying and classifying texts in a target domain; text feature information of the source domain is accurately transferred in a text cluster of the target domain; and text content, different from the source domain, in the target domain is automatically found out.
Owner:TSINGHUA UNIV

Text theme mining method based on intra-sentence association graph

The invention provides a text theme mining method based on an intra-sentence association graph and relates to the technical field of data mining. The technical problems that an existing mining method is low in quality and poor in universality can be solved by the text theme mining method. The method includes the steps that a target text is firstly divided according to sentences, a sentence sequence table of the text is acquired, then, a sentence association matrix of the target text is established, the weight of each element in the sentence sequence table is calculated, theme sentences are selected according to the calculated weights, the weights of all the non-theme sentences are adjusted each time the theme sentences are selected, theme sentences are selected again according to the adjusted weights, the operation is conducted repeatedly until the sum of character sizes of all the theme sentences reaches a preset character number threshold value, and finally, all the theme sentences serve as the theme content mined from the target text. The method is suitable for text documents of various forms of literature, styles and types.
Owner:SHANGHAI CHUWA SOFTWARE

Document topic mining method and apparatus

The present application proposes a document topic mining method and apparatus. The method comprises: according to a preset topic mining number, performing loop iteration processing on information in at least one received document based on a probabilistic latent semantic analysis model, and acquiring a posteriori estimate of each topic implied by each sentence in each document; according to the posteriori estimate of each topic, acquiring a membership weight of each word in each topic in each sentence; and generating a topic set corresponding to the topic mining number, wherein each topic set comprises a word related to each topic and screened out according to the membership weight of each word in each topic in the sentence. According to the document topic mining method and apparatus provided by the present application, the document topic is more comprehensively and accurately mined based on a PLSA (Probabilistic Latent Semantic Analysis) algorithm, and the correlation of document topic content is improved, thereby enabling a result of a search engine to be closer to semantic information of the document.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Microblog cell division method based on user comprehensive similarities

The invention designs a microblog cell division method based on user comprehensive similarities. The specific process of the method is as follows: 1, microblog data is acquired, LDA topic model training is performed on a blog article set, and a user topic similarity matrix is obtained through topic mining based on feature extension; 2, a network topological graph with users as nodes and user relations as edges is constructed, and a user comprehensive similarity matrix is obtained according to node link relevancy and topic similarities; and 3, a unique tag is first allocated to each node, the potential influence of each node is evaluated, the descending order of potential influence then serves as the node selection order, the descending order of node comprehensive similarity serves as the tag update order of the nodes, and finally the tags are updated iteratively. In this way, cell division can be performed on microblog users through an improved tag propagation algorithm that takes the user comprehensive similarities into account, and the method has high application value for online public opinion monitoring, commercial user mining and the like.
Owner:JIANGSU UNIV +1

Real-time news content-oriented stream topic evolution tracking method

The invention discloses a real-time news content-oriented stream topic evolution tracking method. The method comprises the steps of: firstly, batching news contents collected in real time according to time periods, and mining a preliminary topic result for each batch of news contents using an LDA method; secondly, performing named entity identification on the batch of news contents and calculating the correlation between topics and entities, thereby updating the entity link relationships in an entity library; thirdly, clustering the lexical items within each topic to obtain the correspondence between topics and their inner class clusters, and storing the topic results in a topic library; and finally, calculating popularity information for the topics and their inner class clusters and, according to the popularity information, dynamically updating the LDA topic mining parameters for topic evolution tracking of the next batch of news contents. With this method, the topic features and the class cluster features of the lexical items within each topic can be mined from real-time news contents; the differences among topics and among the class clusters within a topic are fully utilized; and the LDA topic mining parameters are dynamically updated.
Owner:SOUTHEAST UNIV

Method and device for generating information

The embodiments of the invention disclose a method and device for generating information. In one specific embodiment, the method includes: obtaining an article to be mined; using at least two topic mining methods, mining at least two types of topics of the article to be mined, and determining a relevance of the mined topic and the article to be mined; and based on the mined topic and the determined relevance, determining the topic of the article to be mined and the relevance of the article to be mined and the topic. According to the method and device for generating information, the topic of the article to be mined is mined from different dimensions to obtain more comprehensive and more accurate topics.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Short text topic mining method based on semantic word network

The invention discloses a short text topic mining method based on a semantic word network. The method comprises the following stages: 1) a model initialization stage: collecting external corpora in related fields, preprocessing the corpora, setting parameters and the like; 2) a topic unit construction stage: constructing a semantic word network, searching for specific word triangle structures, calculating model prior parameters and the like; 3) a model training stage: sampling the model variables using a Gibbs sampling method, and judging whether the model has reached a convergence condition; and 4) a result output stage: obtaining the topic distribution of each word triangle from the sampling result of each variable after model training is finished, and calculating the topic distribution of the original document. The method combines semantic information learned from an external corpus with the word triangle topic structure and applies it to short text topic mining. Compared with a traditional word pair topic model, it provides a way to integrate external prior knowledge into the traditional topic model, and the quality of the mined topics is remarkably improved.
Owner:NANJING UNIV

Abstract generation method based on TMPP (Topic Model based on Phrase Parameter)

The invention discloses an abstract generation method based on a TMPP (Topic Model based on Phrase Parameter). The abstract generation method is characterized in that a parameter [Theta] which shows document-topic in standard LDA (Latent Dirichlet Allocation) is expanded into an (aspect, rating) set, the TMPP is used for simultaneously modeling of the aspect and the rating, and a potential clustering variable c is introduced to show domain priori knowledge to guide the model to generate an aspect with better quality. The TMPP is adopted to generate an (aspect, rating) abstract, topic mining quality is guaranteed, an unsupervised learning way of the LDA is effectively overcome, and the phenomenon of the generation of meaningless topics is avoided.
Owner:SHANGHAI DIANJI UNIV

Biterm topic model (BTM) sampling acceleration method

Active; CN106776579A; Optimizing sample time complexity; Optimize mining time; Natural language data processing; Special data processing applications; Biterm topic model; Algorithm
The invention provides a Biterm topic model (BTM) sampling acceleration method. The method includes: establishing an alias table for each term, and selecting one Biterm topic model; sampling one new topic for the Biterm from a corpus proposal and calculating probability of acceptance; judging whether the probability of acceptance is greater than r or not, if yes, updating the Biterm, or otherwise, performing no updating; sampling another new topic for the Biterm topic model from a word proposal and calculating probability of acceptance; judging whether the probability of acceptance is greater than r or not, if yes, updating the Biterm topic model, or otherwise, performing no updating. With the method, complexity of sampling time of BTM can be optimized, convergence rate of the BTM can be greatly increased, quality of final topic clustering is unaffected, time for essay topic mining can be optimized, and meanwhile, time for text topic mining can be optimized as well.
Owner:TSINGHUA UNIV
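
The alias table mentioned in the abstract above is a standard structure for drawing from a discrete distribution in constant time. The Python sketch below shows a generic alias table (Vose's construction), not the patented sampler itself: the proposal distributions and the acceptance test against r are omitted, and the function names and the toy distribution are assumptions.

```python
import random

def build_alias_table(probs):
    """Build (prob, alias) arrays for a discrete distribution that sums to 1."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]   # chance of keeping column s
        alias[s] = l          # otherwise fall through to l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:   # leftovers are full columns
        prob[i] = 1.0
        alias[i] = i
    return prob, alias

def alias_sample(prob, alias, rng=random):
    """Draw one index in O(1): pick a column, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]

topic_probs = [0.5, 0.3, 0.2]  # toy distribution over three topics
prob, alias = build_alias_table(topic_probs)
# Sanity check: the table encodes the original probabilities.
recovered = [
    (prob[i] + sum(1.0 - prob[j] for j in range(len(prob)) if alias[j] == i and j != i)) / len(prob)
    for i in range(len(prob))
]
```

Building the table costs O(n) once, after which each draw is O(1); that trade-off is what makes alias tables attractive inside a sampler that draws many times from slowly changing distributions.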

Parallel topic mining method and device

An embodiment of the invention provides a parallel topic mining method and device. The method comprises steps as follows: a first node of the parallel topic mining device receives a second word-topic submatrix sent by a second node and a second remainder submatrix, wherein the second remainder submatrix comprises a row whose row accumulated value is the largest in a remainder matrix as well as a column whose column accumulated value is the largest; the second word-topic submatrix comprises a row, corresponding to the row number of the row with the largest row accumulated value in the remainder matrix, in a word-topic matrix as well as a column, corresponding to the column number of the column with the largest column accumulated value in the remainder matrix, in the word-topic matrix; a first word-topic submatrix is updated according to the second word-topic submatrix, a first remainder submatrix is updated according to the second remainder submatrix, and the updated first word-topic submatrix and the updated first remainder submatrix are sent to the second node. Accordingly, the communication capacity in the topic mining process is reduced, and the topic mining speed is increased.
Owner:HONOR DEVICE CO LTD

User text information analysis method and device

The invention provides a user text information analysis method. The method includes: preprocessing the to-be-analyzed text information; carrying out latent topic mining on the preprocessed text information to obtain the topic probability distribution of the text; calculating the similarity of the text according to the topic probability distribution, and clustering user characteristic values according to the similarity; digitally labeling the clustered text information to obtain to-be-analyzed sample data; and inputting the sample data into a pre-established user preference analysis model to obtain a user preference analysis result. In this scheme, the text similarity between users is calculated by deeply mining the text features of the users, and clustering analysis is performed according to the similarity distance, so that the hidden layer structure of the deep neural network is simplified and its learning efficiency is improved.
Owner:BEIJING INFORMATION SCI & TECH UNIV

Multi-source network public opinion theme mining method based on improved hierarchical clustering

The invention discloses a multi-source network public opinion theme mining method based on improved hierarchical clustering, and relates to the field of theme mining. The method specifically comprises the following steps: 1, obtaining word vectors; 2, preprocessing all the data; 3, vectorizing the sample data sentences preprocessed in step 2; 4, carrying out semi-supervised hierarchical topic mining on the sentence vectors; and 5, outputting a tree diagram (dendrogram). The method exploits the fact that a hierarchical clustering algorithm retains hierarchical information and, on that basis, optimizes the use of prior knowledge, the vectorization of model input, and the screening of high-quality topics. As a result, it can be effectively applied to topic mining of short texts from multi-source network platforms, which cover wide-ranging topics, contain heavy text noise and lack grammatical conventions.
Owner:BEIJING UNIV OF POSTS & TELECOMM

Topic mining method and apparatus

The invention provides a topic mining method and apparatus. When an iterative process is performed each time, a target message vector is determined from a message vector according to a residual error of the message vector; a current document-topic matrix and a current word-topic matrix are updated only according to the target message vector; and a target element in a word-document matrix corresponding to the target message vector is only calculated according to the current document-topic matrix and the current word-topic matrix, so that the calculation of all non-zero elements in the word-document matrix in each iterative process is avoided, the update of the current document-topic matrix and the current word-topic matrix according to all message vectors is avoided, the calculation amount is greatly reduced, the topic mining speed is increased, and the topic mining efficiency is improved.
Owner:XFUSION DIGITAL TECH CO LTD

Topic mining based event cluster acquisition method

The present invention discloses a topic mining based event cluster acquisition method. The method comprises the following steps: S1, collecting a text data set C; S2, preprocessing the text data set C, and removing meaningless words from the text data set; S3: setting the number n of topics and a parameter, running CTM to obtain a CTM model; S4, in a covariance matrix sigma indicating an association degree between topics of the CTM model, searching for all maximum clusters by using a backtracking algorithm, wherein the maximum clusters are topic clusters; and S5, for each topic comprised in each topic cluster, selecting a most corresponding article from the text data set C, and clustering events corresponding to the most corresponding article to form an event cluster. The method provided by the present invention has the following advantages: association degree information at the topic level is used during association degree analysis, and compared with the traditional technology in which association degree information is mined and used at the word level, the method provided by the present invention can better improve rationality in calculating an event association degree.
Owner:TSINGHUA UNIV
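
Step S4 above searches the topic covariance matrix for clusters with a backtracking algorithm. One plausible reading, assumed here and not confirmed by the abstract, is that topics whose covariance exceeds a threshold are linked and the "maximum clusters" are the maximal cliques of that graph. The Python sketch below uses the classic Bron-Kerbosch backtracking search; the threshold and the toy covariance values are made up for illustration.

```python
def topic_graph(covariance, threshold):
    """Link two topics when their covariance reaches the (assumed) threshold."""
    n = len(covariance)
    return {
        i: {j for j in range(n) if j != i and covariance[i][j] >= threshold}
        for i in range(n)
    }

def maximal_cliques(adjacency):
    """Enumerate all maximal cliques with Bron-Kerbosch backtracking (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))  # nothing can extend this clique
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adjacency[v], excluded & adjacency[v])
            candidates = candidates - {v}    # backtrack: v is fully explored
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return sorted(cliques)

# Toy 4-topic covariance matrix (illustrative values only).
sigma = [
    [1.0, 0.8, 0.7, 0.0],
    [0.8, 1.0, 0.6, 0.1],
    [0.7, 0.6, 1.0, 0.9],
    [0.0, 0.1, 0.9, 1.0],
]
topic_clusters = maximal_cliques(topic_graph(sigma, threshold=0.5))
```

With these toy values, topics 0, 1 and 2 are mutually linked and topic 3 is linked only to topic 2, so a topic may belong to more than one cluster.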

Method for topic mining and application recommendation of user

The invention provides a method for user topic mining and application recommendation. The method comprises the following steps: first, quantitatively calculating a weight that measures the contribution of each application installed by the user to the user's topics, and building a model of the user's application selection according to the weight; and finally, estimating the parameters of the built model so that the user's application selection process can be simulated and the expected application software is recommended to the user. Based on the user's list of installed applications, the method mines the latent characteristics of users and applications and recommends applications in which the user is interested.
Owner:广州展翼信息科技有限公司

Veterinary drug residue knowledge graph construction method based on weighted LDA

The invention discloses a veterinary drug residue knowledge graph construction method based on weighted LDA (Latent Dirichlet Allocation). The method comprises the following steps: firstly, constructing a veterinary drug knowledge framework, and performing deep search and downloading literature by using a web crawler in combination with the knowledge framework; and aiming at topic noise existing in the LDA topic model and a feature word bias problem, performing topic mining by using a weighted LDA method, and downloading veterinary drug related literatures again; completing named entity identification and relationship extraction by using a dictionary-based model; and finally, utilizing the Neo4j graph database to construct a veterinary drug knowledge graph. The veterinary drug residue knowledge graph can be constructed, veterinary drug residue characteristic rules and causes of damage of veterinary drug residues to human bodies can be found out, the quality safety of meat, eggs and milk is guaranteed, and therefore the body health and life safety of people are protected.
Owner:CHINA AGRI UNIV

Scenic spot evaluation knowledge base construction method based on metaphor topic mining

The invention discloses a scenic spot evaluation knowledge base construction method based on metaphor topic mining. The method comprises the steps of: S1, using a scenic spot latent topic mining algorithm to construct a latent multi-topic knowledge base of the scenic spot; S2, constructing a metaphor multi-topic knowledge base of the scenic spot using a scenic spot metaphor topic feature mining algorithm; and S3, constructing a scenic spot evaluation knowledge base based on a semantic matching calculation model of the scenic spot corpus, and identifying the topic to which tourist comment data belongs and the emotional tendency corresponding to that topic based on the scenic spot evaluation knowledge base. The invention constructs a scenic spot evaluation knowledge base that takes metaphor information into account. With this technical scheme, the fine-grained topic of each comment on internet tourism websites and the emotional tendency toward that topic can be accurately judged. This provides data support for tourists and helps them make decisions that match their preferences, and it helps scenic spot managers improve scenic spot services and the online reputation of the scenic spots.
Owner:INST OF REMOTE SENSING & DIGITAL EARTH CHINESE ACADEMY OF SCI

Hybrid theme model construction method for deep learning

The invention relates to the technical field of computer deep learning, and provides a hybrid theme model construction method for deep learning. The method comprises the following steps of S1, preprocessing; S2, representing the text information; S3, supplementing a background information sub-network; and S4, dividing the theme of a full connection layer network, and outputting a label classification probability. According to the invention, the theme of the data of a Huawei cloud platform and an intelligent learning platform is mined, a hybrid theme model HTM based on deep learning is discovered, the required data volume in the field of theme classification is smaller, and the texts of different lengths can be converted effectively via a Bi-LSTM framework to obtain the better migration capability, so that the migration capability of the model is high, the classification error rate is low, and the overall classification effect of the model is good, and the beneficial attempts are made for the theme classification model of deep learning in small sample learning and transfer learning in future.
Owner:ANHUI POLYTECHNIC UNIV MECHANICAL & ELECTRICAL COLLEGE

Text topic mining method and device, computer equipment and storage medium

The invention relates to the technical field of artificial intelligence, and provides a text topic mining method and device, computer equipment and a storage medium. The method comprises the steps: employing a Gaussian kernel function to calculate and obtain a similar matrix based on a plurality of texts; performing spectral clustering on the plurality of texts based on the similar matrix to obtain a plurality of text clusters; extracting theme keywords of each text cluster; calculating the reading frequency of each text in each text cluster, and calculating the reading frequency of the theme keyword of the corresponding text cluster based on the reading frequency of each text; and mining according to the reading times of the topic keyword of each text cluster to obtain a text topic. According to the method, the problem of theme dispersion can be solved, and the mined themes better meet the actual requirements of users.
Owner:PING AN TECH (SHENZHEN) CO LTD
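
The first step of the method above, computing a similarity matrix with a Gaussian kernel, can be sketched as follows. This is a minimal illustration that assumes each text has already been turned into a numeric vector; the bandwidth `sigma`, the toy vectors and the function name are assumptions, and the spectral clustering and reading-frequency steps are not shown.

```python
import math

def gaussian_similarity_matrix(vectors, sigma=1.0):
    """Pairwise similarity exp(-||a - b||^2 / (2 * sigma^2)) for all vectors."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [math.exp(-squared_distance(a, b) / (2.0 * sigma ** 2)) for b in vectors]
        for a in vectors
    ]

# Toy text vectors: the first two are close, the third is far away.
text_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
similarity = gaussian_similarity_matrix(text_vectors, sigma=0.5)
```

The resulting matrix is symmetric with ones on the diagonal, which is the form a spectral clustering routine typically expects as its affinity input.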

Microblog topic mining method based on dynamic behaviors of heterogeneous social media users

The invention discloses a microblog topic mining method based on heterogeneous social media user dynamic behaviors, which comprises the following steps: constructing an attributed multivariate heterogeneous dialogue network and mining the heterogeneous social context for topic detection; introducing a neighbor-level attention mechanism and an interaction-level attention mechanism to model the different influences of different neighbors and different types of interaction on topic inference, and learning view-specific embeddings; and using the representations of the multiple views as inputs to multi-view neural variational inference, which captures the complex associations among the topic semantics carried by different views, so that topics with better consistency are mined.
Owner:TIANJIN UNIV

Urban functional area identification process based on space-time semantic mining

The invention discloses an urban functional area identification process based on space-time semantic mining, which involves documents, words, basic functional units, space-time data, a topic model, document topic distribution and unit function distribution. The process uses the topic model to discover the hidden functions of an area. By analogy with text topic mining, the basic functional units correspond to the documents in a corpus, the space-time data in a basic functional unit corresponds to the words in a document, and the unit function distribution produced by the topic model corresponds to the document topic distribution. The city space-time data used is representative Sina microblog location check-in data. Each check-in record comprises user information, the spatial coordinates of the check-in position, the publishing time, the published text and the like, so the dynamic activity patterns of people can be reflected from different angles. Meanwhile, POIs in the research area are obtained from Baidu Maps, and function recognition of the area is achieved.
Owner:武汉市中城事大数据有限责任公司