People interest label extraction method based on social network

A social network and tag extraction technology, applied in the field of tag extraction. It addresses two shortcomings of existing methods, namely ignoring how representative candidate words are and ignoring the structure and influence of the document text, and it achieves more accurate identification of people's hobbies.

Active Publication Date: 2018-08-21
SUZHOU UNIV

AI Technical Summary

Problems solved by technology

[0005] Most existing interest tag extraction algorithms use single words as interest tags and ignore phrases and the topic tags that are unique to social networks.
In addition, the TF-IDF algorithm considers only the frequency of words in the document and in the document library; it does not consider the text structure of the document.
Conversely, the TextRank algorithm considers only the role of candidate words in the document structure and ignores how representative the candidate words are across the entire corpus, so it is easily affected by meaningless words (such as stop words).
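
The limitation of TF-IDF described above can be illustrated with a minimal sketch. It uses the textbook formulation (relative term frequency multiplied by the logarithm of the inverse document frequency), which is an assumption; the source does not state which TF-IDF variant it refers to. The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Textbook variant (an assumption, not taken from the patent):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["music", "concert", "music", "tickets"],
    ["football", "match", "tickets"],
    ["music", "festival", "football"],
]
scores = tf_idf(docs)
```

In the first document, "concert" scores higher than "music" even though "music" occurs twice, because "music" also appears in another document. Reversing the token order of a document leaves its scores unchanged, which is the sense in which TF-IDF ignores text structure.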



Examples


Embodiment Construction

[0030] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments, so that those skilled in the art can better understand the present invention and implement it, but the examples given are not intended to limit the present invention.

[0031] As shown in Figures 1 to 3, the present invention discloses a method for extracting people's interest tags based on social networks, comprising the following steps: step A, data preprocessing; step B, derivation of candidate tags; step C, extraction of interest tags.

[0032] Step A, data preprocessing, cleans, filters and replaces the person's social network data to form a set of words. The data preprocessing comprises, in sequence: case conversion, word segmentation, part-of-speech tagging, stop-word removal, slang removal, link removal, emoticon removal, and retweet removal. The case conversion includes: unif...
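
Several of the preprocessing operations listed in step A can be sketched with the standard library alone. This is an illustration under stated assumptions, not the patent's implementation: the word lists, the regular expressions and the function name `preprocess` are hypothetical; "retweet removal" is interpreted here as stripping a leading "RT @user:" marker; part-of-speech tagging is omitted because it needs an external tagger; and the operations are reordered slightly so that links and emoticons are stripped before tokenization.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "in", "at"}
SLANG = {"lol", "omg", "idk"}

URL_RE = re.compile(r"https?://\S+")
# Assumption: a retweet is marked by a leading "rt @user:" prefix.
RETWEET_RE = re.compile(r"^rt\s+@\w+:?\s*")
# Simple ASCII emoticons such as :) ;-( :d, not glued to a word.
EMOTICON_RE = re.compile(r"(?<!\w)[:;=][-']?[)(dp/\\|](?!\w)")
# Words, optionally prefixed with '#' so that hashtags survive as topic tags.
TOKEN_RE = re.compile(r"#?\w+")

def preprocess(post):
    """Turn one social network post into a list of cleaned tokens."""
    text = post.lower()                  # case conversion
    text = RETWEET_RE.sub("", text)      # retweet marker removal
    text = URL_RE.sub(" ", text)         # link removal
    text = EMOTICON_RE.sub(" ", text)    # emoticon removal
    tokens = TOKEN_RE.findall(text)      # word segmentation
    # stop-word and slang removal
    return [t for t in tokens if t not in STOP_WORDS and t not in SLANG]

# Hypothetical example post.
tokens = preprocess("RT @alice: LOL the #WorldCup final is amazing :) https://t.co/abc")
# tokens == ["#worldcup", "final", "amazing"]
```

Keeping the leading "#" on hashtags is a design choice made for this sketch so that a later step can tell topic tags apart from ordinary words.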



Abstract

The invention discloses a people interest label extraction method based on a social network. The method includes the following steps. A, data preprocessing: cleaning, screening and replacing the social network data of a person to form a set comprising a plurality of words. B, candidate label derivation: sequentially reading in and judging the words in the set to form a candidate label set comprising topic labels, word candidate labels and phrase candidate labels. C, interest label extraction: determining the TF value of each candidate label, calculating the IDF value of each candidate label, ordering the candidate labels by TF-IDF value, importing some of the topic labels into the candidate label set, calculating the weight values between candidate labels, calculating the scores of the candidate labels, and obtaining the interest label set. The method has at least the following advantages: it considers the frequency of an interest label both in the document library and in the document, it further considers the influence of document structure on the interest label, and it can therefore produce more accurate results.
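
The last part of step C (weights between candidate labels, then a score per label) can be sketched as a weighted TextRank-style ranking. This excerpt does not give the patent's weight or score formulas, so the sketch substitutes common choices and labels them as assumptions: edge weights are co-occurrence counts within a sliding window, and scores follow the usual weighted PageRank update with a damping factor of 0.85. The names `cooccurrence_weights` and `rank_labels` and the toy data are hypothetical.

```python
from collections import defaultdict

def cooccurrence_weights(tokens, candidates, window=2):
    """Count how often two candidate labels occur within `window` tokens.

    Assumption: co-occurrence counts stand in for the patent's weight values.
    """
    weights = defaultdict(float)
    for i, left in enumerate(tokens):
        if left not in candidates:
            continue
        for right in tokens[i + 1 : i + 1 + window]:
            if right in candidates and right != left:
                weights[(left, right)] += 1.0
                weights[(right, left)] += 1.0
    return weights

def rank_labels(candidates, weights, damping=0.85, iterations=50):
    """Weighted TextRank-style scoring over the candidate label graph."""
    # Total outgoing weight per label, used to normalize each edge.
    out_sum = defaultdict(float)
    for (src, _dst), w in weights.items():
        out_sum[src] += w
    scores = {c: 1.0 for c in candidates}
    for _ in range(iterations):
        new_scores = {}
        for c in candidates:
            incoming = sum(
                weights[(other, c)] / out_sum[other] * scores[other]
                for other in candidates
                if (other, c) in weights
            )
            new_scores[c] = (1 - damping) + damping * incoming
        scores = new_scores
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical token stream and candidate set (for example, the labels
# that ranked highest by TF-IDF plus imported topic labels).
tokens = ["#worldcup", "final", "football", "music", "football", "#worldcup"]
candidates = {"#worldcup", "final", "football", "music"}
weights = cooccurrence_weights(tokens, candidates)
ranked = rank_labels(candidates, weights)
```

On this toy input "football" receives the highest score and "final" the lowest, because "football" has the heaviest connections to the other candidates. Selecting the candidate set by a frequency-based ranking first is what lets corpus-level representativeness and document structure both influence the result.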

Description

Technical field

[0001] The invention relates to the technical field of tag extraction, in particular to a method for extracting people's interest tags based on social networks.

Background

[0002] With the rapid development of Internet applications, social networks have an increasing influence on users. People rely more and more on social networks to exchange and share information, which has brought about explosive growth in Internet data. At the same time, users' demand for personalization is growing stronger, for example for recommendations of their favorite products, games, music, movies or news. People interest tags are usually used to describe a person's identity attributes and interest attributes. They are very helpful for person retrieval and recommendation, for behavior analysis, for discovering a person's hobbies, and for building person portrait models.

[0003] Commonly used interest tag extraction technologies include TF...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/9535, G06F40/289
Inventors: 韩月辉, 赵雷
Owner: SUZHOU UNIV