A method of extracting hot topics based on keywords

A keyword-based hot topic extraction technology, applied in fields such as instruments, text database query, and calculation. It addresses the problems of obvious delay, poor user experience, and high time complexity, and improves detection efficiency with good results.

Active Publication Date: 2020-05-08
成都云数未来信息科学有限公司
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, this detection and tracking technology was designed for a small number of documents. Faced with massive Internet information, traditional topic detection technology struggles to meet the practical need of detecting hot topics in such a large and continuous information flow. Even when detection is possible, the time complexity is very high and the delay is obvious. Users have limited energy and cannot obtain useful knowledge on related topics by reading every document, so the user experience is very poor. Users want to understand promptly and quickly which events or topics netizens are currently discussing, so hot topic detection needs to be further improved, in terms of both time and quantity.

Examples

Embodiment

[0045] Figure 1 is a flow chart of the method for extracting hot topics based on keywords in the present invention.

[0046] In this example, as shown in Figure 1, the method for extracting hot topics based on keywords of the present invention comprises the following steps:

[0047] S1. Use crawlers to crawl major news websites, such as Sina, Baidu, Tencent..., collecting 100 news texts from the current day as data set A; then convert these texts to a unified txt format and store them in the database;

[0048] A=['The driving school service and fees are not disclosed and opaque;... For those whose driver's license has been revoked without submitting a medical certificate, their driving qualifications can be restored after passing the medical examination according to the regulations. ','The police went to the hospital to take pictures of the old man and change his ID card. "Elderly people... We also took the equipment to the house to reissue their documents. For this special situation, th...
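The storage half of step S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not name the database or the normalization rules, so the use of SQLite, the `news` table, and the helper names `normalize_text`, `store_texts`, and `load_texts` are all assumptions. The crawling itself is left out, and the input is taken to be a list of already-fetched article strings.

```python
import sqlite3
import unicodedata


def normalize_text(raw: str) -> str:
    """Unify one article into plain text (assumed rule: NFKC + collapsed whitespace)."""
    text = unicodedata.normalize("NFKC", raw)
    return " ".join(text.split())


def store_texts(conn: sqlite3.Connection, texts: list[str]) -> int:
    """Store normalized, non-empty texts in a hypothetical `news` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    rows = [(t,) for t in map(normalize_text, texts) if t]
    conn.executemany("INSERT INTO news (body) VALUES (?)", rows)
    conn.commit()
    return len(rows)


def load_texts(conn: sqlite3.Connection) -> list[str]:
    """Read the stored data set back in insertion order."""
    return [row[0] for row in conn.execute("SELECT body FROM news ORDER BY id")]
```

Keeping normalization separate from storage means the later steps can rely on every stored text having the same plain-text shape, whichever site it came from.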

Abstract

The invention discloses a method for extracting hot topics based on keywords. The method unifies the format of mass data and applies word segmentation to form a corpus. The corpus is split into blocks that are processed in parallel to obtain a candidate word set for each block. The candidate word set of each block undergoes TF-IDF weighting and deduplication to obtain a reference text. Cosine similarity is then computed between the reference text and the other texts in the block, and the texts similar to the reference text are extracted. The candidate keywords in these similar texts are sorted by word frequency in descending order to find several hot subjects. Finally, the hot topics are extracted from the hot subjects; these hot topics better represent the main viewpoints of the mass data.
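The weighting, similarity, and frequency-ranking steps named in the abstract can be sketched for a single block as follows. This is a rough illustration under stated assumptions rather than the patented procedure: the texts are assumed to be word-segmented already, the smoothed IDF formula is one common variant chosen here, and the function names and the parameters `ref_index`, `threshold`, and `top_k` are hypothetical. In particular, the visible text does not say how the reference text is selected, so the caller supplies its index.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document; docs is a list of token lists."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1
        vectors.append({
            # Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def hot_keywords(docs, ref_index=0, threshold=0.2, top_k=5):
    """Find texts similar to the reference text, then rank their words by frequency."""
    vectors = tfidf_vectors(docs)
    ref = vectors[ref_index]
    similar = [i for i, vec in enumerate(vectors) if cosine(ref, vec) >= threshold]
    freq = Counter(term for i in similar for term in docs[i])
    return similar, [term for term, _ in freq.most_common(top_k)]


# Usage on a toy block of three segmented texts.
block = [
    ["driving", "license", "medical", "exam"],
    ["driving", "license", "school", "fees"],
    ["weather", "rain", "forecast"],
]
similar_ids, keywords = hot_keywords(block, ref_index=0, threshold=0.2, top_k=2)
```

Because each block is handled independently, a function like `hot_keywords` could be applied to the blocks in parallel, which is how the abstract describes the corpus being processed.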

Description

Technical field

[0001] The invention belongs to the technical field of network public opinion monitoring, and more specifically relates to a method for extracting hot topics based on keywords.

Background technique

[0002] With the vigorous development of Internet technology and the rapid popularization of related applications, everyone is no longer just a consumer of information, but also a producer of information. Netizens can use computers, mobile phones, and other network terminals to obtain or publish information anytime, anywhere on Weibo, social media, news sites, blogs, and other websites. Many existing commercial portals, such as Sina and Netease, also collect and provide rich news reports for users. However, the reports are generally written manually by news editors, which introduces a certain degree of subjectivity, and the amount of news is very large. If you refer to the reports of multiple portal websites, it is difficult to have a clear ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/335, G06F16/33
CPC: G06F16/3334, G06F16/3346, G06F16/335
Inventor: 陆川, 孙健, 杨伟
Owner: 成都云数未来信息科学有限公司