Method and system for mining topic context based on massive search logs

A topic and search-log technology, applied to mining the development context of a given topic, that addresses the problems of incompatibility, long development cycles, and high labor cost.

Active Publication Date: 2016-08-10
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The manual editing and labeling method requires each news document to be labeled by hand, after which a machine summarizes and displays the edited and labeled documents. The topic information mined this way has narrow coverage and a high labor cost, so the method does not meet the demand for mining the context of massive news events.

The event tracking method associates the hot topics of the current stage with the hot topics of the previous stage: if a historical topic can be associated with the current topic, the current topic is treated as a progression of that historical topic. However, topic association often causes topic drift, and this method tracks the latest progress of a topic rather than its key progress, so the topic progress data it mines does not give a clear topic context. In addition, because all current topics must be correlated and matched against all historical topics, the later development cost of this method is relatively high and its cycle is long.

Examples

Embodiment 1

[0086] For each key time node, the device segments the search words in the first statistical record of that key time node into words, weights the segmented words according to their relevance to the topic, and selects the words whose weight exceeds a predetermined threshold as the description information of the key time node.
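The segment, weight, and threshold steps of this embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `describe_node`, the whitespace segmentation, and the weighting rule (a word's weight is the share of the node's search volume that comes from queries containing it) are all assumptions, since the excerpt does not say how relevance to the topic is computed. Real Chinese queries would need a proper word segmenter.

```python
from collections import defaultdict


def describe_node(records, threshold=0.5, stopwords=frozenset()):
    """Pick description words for one key time node.

    records: iterable of (search_query, search_count) pairs taken from the
             node's statistical record (hypothetical input shape).
    threshold: predetermined weight threshold; words must exceed it.
    Returns a list of (word, weight) sorted by descending weight.
    """
    records = list(records)
    total = sum(count for _, count in records)
    if total == 0:
        return []

    weights = defaultdict(float)
    for query, count in records:
        # Assumed segmentation: split on whitespace, one vote per query.
        for word in set(query.lower().split()):
            if word not in stopwords:
                # Assumed relevance weight: share of total search volume.
                weights[word] += count / total

    selected = [(w, v) for w, v in weights.items() if v > threshold]
    return sorted(selected, key=lambda wv: (-wv[1], wv[0]))
```

Calling `describe_node([("marathon route closed", 60), ("marathon winner", 30)])` returns the words carried by most of the node's search volume; the threshold trades description length against noise.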

Embodiment 2

[0088] For each key time node, the device segments the search words in the first statistical record of that key time node into words, weights the segmented words according to their relevance to the topic, selects the words whose weight exceeds a predetermined threshold, uses the selected words to query matching articles from the included news database or library, and selects at least one of the returned articles as the event article of the key time node.
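The article lookup in this embodiment might look like the sketch below. The `Article` type, the in-memory list standing in for the news library, and the ranking rule (number of selected words an article contains) are assumptions made for illustration; the excerpt does not describe how matching articles are ranked or how many are kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Hypothetical record from the news library."""
    title: str
    body: str


def pick_event_articles(selected_words, articles, top_n=1):
    """Return up to top_n articles that match the selected words best.

    Articles matching no selected word are never returned, so the result
    may be empty when the library has nothing relevant.
    """
    wanted = {w.lower() for w in selected_words}
    scored = []
    for position, article in enumerate(articles):
        tokens = set(f"{article.title} {article.body}".lower().split())
        hits = len(wanted & tokens)
        if hits:
            scored.append((-hits, position, article))
    # Most hits first; ties keep the library's original order.
    scored.sort(key=lambda item: item[:2])
    return [article for _, _, article in scored[:top_n]]
```

A production system would query a search index rather than scan a list, but the selection logic (match, rank, keep at least one) stays the same.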

[0089] Full mining at cold start consumes a huge amount of resources. To solve this problem, according to another preferred embodiment of the present invention, the device further stores the first search word statistical data and the second search word statistical data. In this way, apart from the large amount of time needed to fully mine the historical log data when the system starts for the first time, repeated mining and calculation of the historical log data at each subsequent startup is effectively avoided, reducing the ...
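The idea of storing statistics so that later startups skip already mined logs can be sketched as below. The JSON file, the per-day layout, and the names `load_stats` and `update_stats` are assumptions; per-day query counts stand in here for the first and second search word statistical data, whose exact format the excerpt does not give.

```python
import json
from collections import Counter
from pathlib import Path


def load_stats(path):
    """Load stored statistics, or an empty dict on a cold start."""
    file = Path(path)
    if not file.exists():
        return {}
    return json.loads(file.read_text(encoding="utf-8"))


def update_stats(path, logs_by_day):
    """Mine only the days that are not yet in the store.

    logs_by_day: dict mapping a day string to a list of search queries.
    Returns the list of days that were newly mined on this call.
    """
    stats = load_stats(path)
    new_days = [day for day in sorted(logs_by_day) if day not in stats]
    for day in new_days:
        stats[day] = dict(Counter(logs_by_day[day]))
    Path(path).write_text(json.dumps(stats, ensure_ascii=False), encoding="utf-8")
    return new_days
```

On the first call every day is mined; on later calls only days missing from the store are processed, which is the saving the paragraph describes.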

Abstract

The invention provides a method and system for mining topic context based on massive search logs. The method includes: compiling statistics on web search logs to produce first search word statistical data; aggregating the first search word statistical data into second search word statistical data; extracting keywords from a first statistical record; counting the total number of searches for the keywords to obtain total search statistical data for the keywords; calculating a search heat value of the keywords per unit time; determining a comprehensive search heat value of the topic per unit time; and determining the key time nodes of the topic. The method and system effectively avoid the topic drift caused by topic association and mine a clear and complete topic context.
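The later steps of the abstract (keyword heat per unit time, comprehensive topic heat, key time nodes) can be illustrated with the sketch below. The abstract gives no formulas, so everything here is an assumption: the unit time is one day, a keyword's heat on a day is that day's share of the keyword's total searches, the comprehensive heat is the sum over keywords, and a key time node is a day whose heat is a local peak above the mean. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def daily_keyword_counts(log, keywords):
    """log: iterable of (day, query). Returns {keyword: Counter(day -> searches)}."""
    counts = {k: Counter() for k in keywords}
    for day, query in log:
        terms = set(query.lower().split())
        for k in keywords:
            if k in terms:
                counts[k][day] += 1
    return counts


def topic_heat(counts):
    """Comprehensive heat per day: sum of each keyword's daily share of its total."""
    heat = defaultdict(float)
    for per_day in counts.values():
        total = sum(per_day.values())
        for day, n in per_day.items():
            heat[day] += n / total
    return dict(heat)


def key_time_nodes(heat, days):
    """Days whose heat is above the mean and is a local peak (assumed rule)."""
    if not days:
        return []
    series = [heat.get(d, 0.0) for d in days]
    mean = sum(series) / len(series)
    nodes = []
    for i, h in enumerate(series):
        left = series[i - 1] if i > 0 else 0.0
        right = series[i + 1] if i + 1 < len(series) else 0.0
        if h > mean and h >= left and h > right:
            nodes.append(days[i])
    return nodes
```

Because each node is found from the topic's own search heat rather than by chaining one topic to the next, this style of detection does not depend on topic association, which is the source of drift the abstract refers to.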

Description

Technical field

[0001] The present application relates to a method and system for mining topic context based on massive search logs, and in particular to a technology for mining the development context of a given topic by analyzing massive web search logs.

Background

[0002] With the spread of the Internet and mobile terminals, browsing news online has become the most common leisure activity for Internet users. According to statistics from Tencent Technology, 61.67% of mobile phone users go online mainly to browse news. When these users browse the news, they often click through to hot topics, and these hot topics are usually composed of several topics. Any topic goes through a process of generation, development, climax, and end. The topics at important moments in this process, connected together, form a topic context. Therefore, how to mine topic context from massive historical topic information has become an important part of understandin...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8) G06F17/30
Inventor: 沈剑平, 彭学政, 罗嵘, 吴波
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD