Network public opinion topic feature extraction method and system

A feature extraction technology for network public opinion, applied in the field of data analysis. It addresses problems such as the loss of weakly related topic relationships, inaccurate description of keyword weights, and reduced accuracy, and achieves the effect of a simple method.

Inactive Publication Date: 2021-06-08
SOUTH CHINA NORMAL UNIVERSITY
Cites: 2; Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] First, existing public opinion topic feature extraction methods do not distinguish between general public opinion and large-scale public opinion, although the two differ greatly in time span, secondary events, and corpus size.
Second, existing methods for topic identification and extraction have the following problems. Clustering-based methods are random and introduce interference information that reduces accuracy. Topic model-based methods require the number of topics to be determined in advance, but large-scale network public opinion covers a long time span and many topics, so topic models tend to miss topics. Methods based on the co-word network can present a scientific cognitive structure, but their specific steps still leave room for improvement: for example, most of them extract keywords by word frequency or subjective judgment, which is subjective and does not describe keyword weights accurately enough.
Third, current spatio-temporal feature discovery methods for topics are not suitable for data without address labels, and they reflect the hot spots of posting or commenting rather than the concerns of netizens.
Finally, existing topic evolution detection methods use a co-word network with a threshold. Setting the threshold filters out some topics, so topic relationships with low relevance cannot be retained.
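
The effect of a threshold on a co-word network can be illustrated with a minimal sketch. This is not the patented method: the function name `build_coword_network`, the `min_weight` parameter, and the sample keywords are assumptions made for illustration. The sketch counts how often two keywords appear in the same document and then drops edges below the threshold, which shows how weakly related pairs are lost.

```python
from collections import Counter
from itertools import combinations


def build_coword_network(docs_keywords, min_weight=1):
    """Build a keyword co-occurrence network (hypothetical helper).

    docs_keywords: list of keyword lists, one per document.
    min_weight: edges with fewer co-occurrences are dropped;
                min_weight=1 keeps every edge (no filtering).
    """
    edges = Counter()
    for keywords in docs_keywords:
        # sorted(set(...)) gives each unordered pair one canonical key
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return {pair: w for pair, w in edges.items() if w >= min_weight}


# Illustrative data only, not taken from the patent
docs = [
    ["flood", "rescue", "donation"],
    ["flood", "rescue"],
    ["flood", "rumor"],
]

full = build_coword_network(docs)                 # no filtering
pruned = build_coword_network(docs, min_weight=2)  # thresholded
```

With the threshold set to 2, only the `("flood", "rescue")` edge survives, while weaker links such as `("flood", "rumor")` disappear; this is the kind of loss the passage above describes.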



Examples


Embodiment Construction

[0053] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0054] Referring to Figure 1, an embodiment of the present invention provides a method for extracting network public opinion topic features, comprising the following steps:

[0055] S1. Extract keywords from the text corpus to be tested using word frequency combined with the ITF / PDF method.

[0056] Further, step S1 specifically includes:

[0057] S110. Preprocessing the text corpus to be tested; wherein, the preprocessing includes word segmentation processing, part-of-speech taggi...


Abstract

The invention provides a method and system for extracting network public opinion topic features. First, it takes into account the difference between general and large-scale network public opinion, and is suited to topic feature extraction for large-scale network public opinion. Second, it improves the co-word network-based method: keywords are extracted using word frequency combined with ITF / PDF, so that keyword weights are described more accurately. Third, the non-threshold inter-stage evolution network method it adopts is simple, retains subtle associations between topics, and accords with topic evolution logic. Finally, it provides an event-driven method for discovering the spatio-temporal features of topics, in which the address tags of events replace the address tags of the text corpus, so that text corpora without address tags can also be handled.
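
The idea of substituting event address tags for missing text address tags can be sketched as follows. This is only an illustration under stated assumptions: the event table `EVENTS`, the keyword-overlap matching rule in `assign_event`, and the sample data are all hypothetical, since the abstract does not say how texts are linked to events.

```python
from collections import Counter

# Hypothetical event table: each event carries trigger keywords and an address tag
EVENTS = {
    "dam_breach": {"keywords": {"dam", "breach"}, "address": "City A"},
    "relief_concert": {"keywords": {"concert", "relief"}, "address": "City B"},
}


def assign_event(tokens, events=EVENTS):
    """Return the event sharing the most keywords with the text, or None.

    Keyword overlap is an assumed matching rule, not the patent's.
    """
    best, best_overlap = None, 0
    for name, info in events.items():
        overlap = len(info["keywords"] & set(tokens))
        if overlap > best_overlap:
            best, best_overlap = name, overlap
    return best


def address_distribution(docs, events=EVENTS):
    """Count texts per address, using the matched event's address tag."""
    counts = Counter()
    for tokens in docs:
        event = assign_event(tokens, events)
        if event is not None:
            counts[events[event]["address"]] += 1
    return counts


# None of these texts carries an address tag of its own
docs = [
    ["dam", "breach", "water"],
    ["relief", "concert"],
    ["dam", "photos"],
    ["weather"],
]
dist = address_distribution(docs)
```

Texts that match no event (here, the last one) are simply left out of the distribution; a real system would need a more careful linking step.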

Description

Technical field

[0001] The invention relates to the technical field of data analysis, in particular to a method and system for extracting network public opinion topic features.

Background technique

[0002] At present, public opinion topic feature extraction is mainly divided into two steps: the first is topic identification and extraction, and the second is topic feature discovery. For topic identification and extraction, commonly used methods include clustering-based methods, topic model-based methods, and co-word network-based methods. Topic feature discovery covers two kinds of features: spatio-temporal features and evolutionary features. Current topic spatio-temporal feature discovery methods mainly rely on text corpora with address tags, counting the address tags and mapping them in space and time; the method usually used in the topic evolution feature detection is based on...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/216, G06F40/284, G06Q50/00
CPC: G06Q50/01, G06F40/216, G06F40/284
Inventor 李卫红刘国庆刘熠孟杨孝锐郭云健张可文
Owner SOUTH CHINA NORMAL UNIVERSITY