Forum post feature identifying method and device

A feature recognition technology for forum posts, applied in special data processing applications, instruments, and electrical digital data processing. It addresses problems such as poor robustness and heavy manual effort, and achieves relatively high-accuracy feature recognition.

Active Publication Date: 2015-05-27
XIAMEN MEET YOU INFORMATION TECH
4 Cites, 1 Cited by
AI Technical Summary

Problems solved by technology

The clustering effect is limited, and the subsequent classification and extraction consume a lot of manpower.
Even when a classification label set already exists, using IDF to cluster and identify new or rare words is not robust, and the extraction of post feature vectors is greatly affected.

Embodiment Construction

[0023] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.

[0024] Referring to Figure 1, which is a flow chart of a first embodiment of the forum post feature recognition method of the present invention, the method comprises:

[0025] Step S10: the server obtains the title and content of the post.

[0026] Users log in to server-run information publishing platforms, such as forums, to publish posts. A published post usually includes a title and content. It also includes the publisher's identity information (ID), for example the user name or the user's network address.

[0027] Further, the posts obtained by the server may be multiple posts published by one user or posts published by multiple users; that is, the server can obtain a large number of posts.

[0028] Step S11: perform word segmentation on the title and content of the post and calculate the word frequency of each word...
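Step S11 can be illustrated with a minimal sketch. The source text is truncated here, so the details are assumptions: the function names `segment` and `word_frequencies` are hypothetical, a simple regular-expression tokenizer stands in for a real word segmenter (Chinese text would need a dedicated one), and "word frequency" is taken to mean a raw occurrence count over the title and content together.

```python
import re
from collections import Counter


def segment(text):
    # Assumption: lowercase alphanumeric runs stand in for real word segmentation.
    return re.findall(r"[a-z0-9]+", text.lower())


def word_frequencies(title, content):
    """Count how often each word occurs across the title and the content."""
    return Counter(segment(title) + segment(content))


# Hypothetical post used only for illustration.
freqs = word_frequencies("Baby sleep tips", "My baby will not sleep. Any sleep tips?")
```

Whether the title should be weighted differently from the content is not stated in the visible text; this sketch treats both equally.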

Abstract

The invention discloses a method and a device for identifying the features of forum posts. The method comprises the following steps: a server acquires the title and content of a post; the title and content are segmented into words, and the word frequency of each word is calculated; the calculated word frequencies are sorted in descending order, and the words corresponding to the first N word frequencies are taken as the feature words of the post, where N is a natural number greater than 0; correlation coefficients between the feature words and the label words in a label library are calculated, and the greatest correlation coefficient is determined, where the label library pre-stores a plurality of label words used to represent post features; and the label word corresponding to the greatest correlation coefficient is used as the label of the post. With the method and device provided by the invention, the features of a post can be identified, and relatively high-accuracy feature identification can be achieved across a large number of posts.
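The selection steps in the abstract can be sketched as follows. The abstract does not say how the correlation coefficient is computed, so the co-occurrence (Jaccard) measure below is purely an assumption, as are the function names, the reference corpus, and the sample data.

```python
from collections import Counter


def top_n_feature_words(freqs, n):
    """Return the n most frequent words, in descending order of frequency."""
    if n <= 0:
        raise ValueError("N must be a natural number greater than 0")
    return [word for word, _ in Counter(freqs).most_common(n)]


def cooccurrence_correlation(word, label, corpus):
    # Assumption: Jaccard overlap of the documents containing each word.
    # The patent's actual correlation coefficient is not given in this text.
    with_word = {i for i, doc in enumerate(corpus) if word in doc}
    with_label = {i for i, doc in enumerate(corpus) if label in doc}
    union = with_word | with_label
    return len(with_word & with_label) / len(union) if union else 0.0


def pick_label(feature_words, label_library, corpus):
    """Return the label word with the greatest correlation to any feature word."""
    best_label, best_score = None, -1.0
    for label in label_library:
        for word in feature_words:
            score = cooccurrence_correlation(word, label, corpus)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Hypothetical data: word counts for one post, a tiny reference corpus
# (each document is a set of words), and a two-word label library.
freqs = {"sleep": 3, "baby": 2, "tips": 1}
corpus = [
    {"baby", "sleep", "parenting"},
    {"sleep", "parenting", "night"},
    {"recipe", "cooking", "dinner"},
    {"baby", "parenting"},
]
feature_words = top_n_feature_words(freqs, 2)
label, score = pick_label(feature_words, ["parenting", "cooking"], corpus)
```

Any other correlation measure could be dropped in for `cooccurrence_correlation` without changing the surrounding steps: rank by frequency, keep the first N words, score them against every label word, and keep the label with the greatest score.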

Description

Technical Field

[0001] The invention relates to the technical field of network information analysis and data mining, and in particular to a method and device for recognizing the features of forum posts.

Background

[0002] With the continuous development of computer networks, network information has become an important part of daily life, and the Internet has become an important place for people to obtain information and communicate. A large amount of real-time information floods the Internet, and these massive Web information resources contain huge potential value.

[0003] Faced with the exponential growth of information, how to manage massive data effectively, extract hot topics, or find the information one wants has become a long-standing problem for Internet users. Currently, post content recognition is mainly based on the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm, which calculates the TF value and IDF value of the vocabulary, and then perf...
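For reference, the TF-IDF weighting named in the background can be sketched as below. This is a generic illustration, not the prior-art procedure the patent describes (that text is truncated). The smoothed IDF form, the function name `tf_idf`, and the sample corpus are all assumptions.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus_tokens):
    """Score each word in doc_tokens by term frequency times inverse document frequency."""
    n_docs = len(corpus_tokens)
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    scores = {}
    for word, count in counts.items():
        tf = count / total
        df = sum(1 for doc in corpus_tokens if word in doc)
        # Assumption: one common smoothed IDF variant, which avoids division by zero.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[word] = tf * idf
    return scores


# Hypothetical corpus: "baby" appears in every post, "zzzquil" in only one.
corpus = [["baby", "sleep"], ["baby", "food"], ["baby", "zzzquil"]]
scores = tf_idf(corpus[2], corpus)
```

In this toy corpus the rare token receives the larger weight, which shows how strongly new or rare words can influence an IDF-based feature vector.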

Application Information

IPC(8): G06F17/27, G06F17/30
Inventor: 陈方毅, 高家栋, 苏利祥
Owner: XIAMEN MEET YOU INFORMATION TECH