Hot event aggregating method and apparatus

A hot event aggregation method and technology, applied in the field of information processing, that can solve problems such as high cost, inaccurate similarity discrimination, and degraded text similarity calculation.

Active Publication Date: 2018-11-16
BEIJING QIYI CENTURY SCI & TECH CO LTD
10 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, because TF-IDF does not take the context of the text into account, it is deficient as an expression of similarity and brings the following disadvantages when applied to aggregating hot events:
[0005] 1. The calculation of TF-IDF depends heavily on the size and quality of the corpus: the larger and higher-quality the corpus, the more accurate the calculated TF-IDF, but preparing such a corpus is costly.
[0006] 2. TF-IDF is calculated under the assumption that words are independent of one another, so the resulting word weights are also independent. In actual text, however, words are closely related, which directly affects the subsequent calculation of text similarity.
[0008] In addition, because TF-IDF treats words independently, evaluating the similarity between reports and events ignores the emphasis of the text itself, resulting in inaccurate similarity discrimination.
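
The two TF-IDF limitations listed above can be illustrated with a small sketch. It is not taken from the patent: the whitespace tokenizer, the smoothed IDF variant, and the helper names `tfidf_vectors` and `cosine` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: token -> weight) per document.

    Assumptions: whitespace tokenization and a smoothed IDF,
    log((1 + N) / (1 + df)) + 1. The patent does not specify a variant.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (cnt / len(toks)) * idf[t] for t, cnt in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


small_corpus = ["dog bites man", "man bites dog", "council approves new budget"]
large_corpus = small_corpus + ["rain expected on friday", "team wins final match"]

# Word independence: the same words in a different order give identical vectors,
# even though the two sentences describe different events.
vecs = tfidf_vectors(small_corpus)
same_words = cosine(vecs[0], vecs[1])

# Corpus dependence: the weight of "dog" in the first document changes
# when unrelated documents are added to the corpus.
w_small = tfidf_vectors(small_corpus)[0]["dog"]
w_large = tfidf_vectors(large_corpus)[0]["dog"]
```

Here `same_words` comes out as (numerically) 1 because the bag-of-words representation discards word order, and `w_large` differs from `w_small` because the IDF term is a property of the whole corpus rather than of the document.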

Embodiment Construction

[0072] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0073] Referring to Figure 1, which shows a flow chart of the steps of an embodiment of a hot event aggregation method of the present invention, the method may specifically include the following steps:

[0074] Step 101, obtaining an original report based on the title of the hot event;

[0075] In practical applications, a search engine can obtain hot events from the hot search list, and can also mine hot events from data showing a sharp increase in query hits. Hot events can of course also be determined in other ways; this is not limited here.

[0076] In this embodiment of the present invention, there may be one hot event; accordingly, there may also be one hot event title.

[0077] In a preferred embodiment of the present invent...

Abstract

The embodiments of the invention provide a hot event aggregation method and apparatus. The method comprises the following steps: acquiring original reports based on the headline of a hot event; determining a seed report and multiple non-seed reports based on the headline of the hot event and the original reports; generating a hot event cluster from the seed report; calculating the similarity between each non-seed report and both the headline of the hot event and each report in the hot event cluster; acquiring the non-seed report with the highest similarity; judging whether the similarity of that non-seed report is greater than a similarity threshold; and, if so, storing that non-seed report in the hot event cluster. By introducing the similarities between the seed report and the hot event, and between the reports themselves, into the aggregation process, the method centers the aggregation more closely on the event itself, measures text similarity more accurately, and achieves a better aggregation effect.
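
The aggregation loop summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the Jaccard token overlap used as the similarity measure, the equal-weight averaging of headline and cluster similarity, the repetition of the selection step until no candidate passes the threshold, and the names `jaccard` and `aggregate` are all assumptions, since the visible text does not specify them.

```python
from typing import Callable, List


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity: token-set overlap (an assumption, not the patent's measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def aggregate(
    headline: str,
    seed: str,
    non_seeds: List[str],
    threshold: float,
    sim: Callable[[str, str], float] = jaccard,
) -> List[str]:
    """Grow a hot event cluster from a seed report."""
    cluster = [seed]
    remaining = list(non_seeds)

    def score(report: str) -> float:
        # Assumption: combine headline similarity and mean similarity to the
        # current cluster members with equal weight.
        to_cluster = sum(sim(report, r) for r in cluster) / len(cluster)
        return 0.5 * (sim(report, headline) + to_cluster)

    while remaining:
        best = max(remaining, key=score)  # non-seed report with the highest similarity
        if score(best) <= threshold:      # not above the threshold: stop aggregating
            break
        cluster.append(best)              # store it in the hot event cluster
        remaining.remove(best)
    return cluster


headline = "city marathon record broken"
seed = "runner breaks city marathon record"
candidates = [
    "city marathon record broken by local runner",
    "stock market falls sharply",
]
cluster = aggregate(headline, seed, candidates, threshold=0.2)
```

With these inputs the marathon follow-up report joins the cluster and the unrelated stock market report is left out. Scoring each candidate against the headline as well as the cluster is what keeps the cluster anchored to the event rather than drifting toward whatever was added most recently.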

Description

Technical field

[0001] The present invention relates to the technical field of information processing, and in particular to a hot event aggregation method and a hot event aggregation device.

Background

[0002] Hot event aggregation is an important basic technology of NLP (natural language processing), which plays an important role in recommendation, search, bubble and other businesses.

[0003] For aggregating reports related to hot events, most current approaches use TF-IDF word-weight clustering, which achieves a certain effect in measuring the similarity between related reports. After the text is segmented into words, TF-IDF is calculated as the weight of each word. After the word vectors are generated, similarity is calculated using the cosine distance, and the reports are then aggregated by a clustering algorithm according to the similarity between texts.

[0004] However, since ...
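
For reference, the TF-IDF weighting and cosine similarity mentioned in paragraph [0003] are commonly written as below. This is the textbook form, given here as an assumption; the source does not state which TF-IDF variant is used.

```latex
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log\frac{N}{\mathrm{df}(t)},
\qquad
\mathrm{sim}(d_1, d_2) =
  \frac{\mathbf{v}_{d_1} \cdot \mathbf{v}_{d_2}}
       {\lVert \mathbf{v}_{d_1} \rVert \, \lVert \mathbf{v}_{d_2} \rVert}
```

Here $\mathrm{tf}(t, d)$ is the frequency of word $t$ in report $d$, $N$ is the number of documents in the corpus, $\mathrm{df}(t)$ is the number of documents containing $t$, and $\mathbf{v}_d$ is the vector of TF-IDF weights for report $d$.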

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F40/258, G06F40/289, G06F40/30
Inventor: 张轩玮
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD