Emotion classification method based on part-of-speech combination and feature selection

A sentiment classification and feature selection technique in the field of computer science. It addresses the waste of computer resources and time and the inability to distinguish the semantics of the same word under different parts of speech, with the aim of speeding up and simplifying data processing.

Active Publication Date: 2018-11-23
NANTONG UNIVERSITY

AI Technical Summary

Problems solved by technology

[0003] Although a word vector model trained with traditional Word2vec can reflect the latent semantic relationships between words, training the model often runs into problems. First, the Word2vec tool cannot directly extract the phrase structures that better reflect the emotional tendency of a text. For example, "unhappy" is segmented into "not" and "happy". During training, Word2vec learns the contextual semantics of the separate words "not" and "happy" and cannot directly learn a vector for the phrase "unhappy".
Second, Word2vec cannot distinguish the semantics of the same word under different parts of speech. Compare "Xiao Ming bought a bundle of incense and used it for sacrifices, but the incense bought this time is rubbish" with "The rice cooked by Xiao Ming is really fragrant". In the first sentence, "xiang" (香) is a noun referring to the thin sticks made of sawdust mixed with spices that are burned when worshiping ancestors or gods; it has no emotional color and is a neutral word. In the second sentence, "xiang" is an adjective describing a pleasant smell and is a commendatory word.
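The two problems can be illustrated with a minimal Python sketch. It is not the patent's implementation: the token lists, the tag names (`n`, `a`, `d`), the negation set, and the helper `merge_negation` are all hypothetical. They only show why merging a negation with the word that follows it, and keying the vocabulary on word plus part of speech, produces distinct entries that a model could learn separately.

```python
# Illustrative only: tokens, POS tags ("n" noun, "a" adjective, "d" adverb)
# and the negation list are assumptions, not taken from the patent.
NEGATIONS = {"not"}

def merge_negation(tagged):
    """Merge a negation token with the token that follows it into one phrase."""
    merged, i = [], 0
    while i < len(tagged):
        word, pos = tagged[i]
        if word in NEGATIONS and i + 1 < len(tagged):
            next_word, next_pos = tagged[i + 1]
            merged.append((word + "_" + next_word, next_pos))
            i += 2
        else:
            merged.append((word, pos))
            i += 1
    return merged

# Problem 1: "not" + "happy" become a single unit with its own vocabulary slot.
phrase_tokens = merge_negation([("not", "d"), ("happy", "a")])

# Problem 2: the same surface word under two parts of speech.
occurrences = [("xiang", "n"), ("xiang", "a")]
plain_vocab = {word for word, _ in occurrences}
word_pos_vocab = {f"{word}/{pos}" for word, pos in occurrences}

print(phrase_tokens)           # [('not_happy', 'a')]
print(len(plain_vocab))        # 1
print(sorted(word_pos_vocab))  # ['xiang/a', 'xiang/n']
```

With plain words as keys, both senses of "xiang" share one vector; with word and part-of-speech keys, each sense gets its own.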
[0004] Traditional data storage and processing methods waste a great deal of computer resources and time.
Moreover, the step-by-step processing mechanism of a traditional Hadoop cluster limits its performance, and its disk I/O overhead is extremely high.

Examples

Embodiment Construction

[0038] The technical solution of the present invention will be described in detail below in conjunction with the accompanying drawings. In this embodiment, the microblog comment text is used as input text data.

[0039] As shown in Figure 1, the sentiment classification method based on part-of-speech combination and feature selection of this embodiment performs binary (positive or negative) classification of text sentiment and comprises the following steps:

[0040] Step 1) Initialize the word-part-of-speech Word2vec model.

[0041] Step 2) Preprocess the text, and select feature words carrying emotional information from the preprocessed text data based on the sentiment dictionary. The sentiment dictionary of this embodiment is composed of a basic sentiment dictionary, an extended sentiment dictionary and a multi-collocation sentiment dictionary.

[0042] Step 3) Combine each feature word with its part of speech to convert the text into a sequence of "word part of speech...
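Steps 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the text is assumed to be already segmented and POS-tagged, the dictionary entries and the `/` separator are hypothetical, and the three dictionaries named in step 2 are simply merged into one set.

```python
# Hypothetical stand-ins for the basic, extended and multi-collocation
# sentiment dictionaries; the entries are invented for illustration.
BASIC = {"happy", "terrible"}
EXTENDED = {"awesome"}
MULTI_COLLOCATION = {"not_happy"}
SENTIMENT_DICT = BASIC | EXTENDED | MULTI_COLLOCATION

def select_feature_words(tagged_tokens, dictionary=SENTIMENT_DICT):
    """Step 2: keep only tokens that appear in the sentiment dictionary."""
    return [(w, p) for w, p in tagged_tokens if w in dictionary]

def to_word_pos_sequence(feature_tokens, sep="/"):
    """Step 3: join each feature word with its part of speech."""
    return [f"{w}{sep}{p}" for w, p in feature_tokens]

# A tagged comment; the tags ("u", "n", "v", "a") are assumed labels.
tagged = [("the", "u"), ("movie", "n"), ("is", "v"), ("awesome", "a")]
sequence = to_word_pos_sequence(select_feature_words(tagged))
print(sequence)  # ['awesome/a']
```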

Abstract

The invention discloses an emotion classification method based on part-of-speech combination and feature selection. The method comprises the following steps: first, initializing a word-part-of-speech Word2vec model; second, preprocessing the data and selecting feature words with emotion information from the preprocessed data based on an emotion dictionary; third, combining the feature words of each text with their parts of speech and converting the texts into sequences of word and part-of-speech pairs; fourth, obtaining the vectors of the feature words in those sequences through the word-part-of-speech Word2vec model, then summing and averaging the word vectors dimension by dimension to represent each text, thereby obtaining the feature vector of the text; and finally, obtaining an emotion classification model using an SVM classifier. The method has two beneficial effects: on the one hand, the emotion dictionary is used to extract feature words, which highlights the feature words with emotion information; on the other hand, phrase structures expressing emotional tendency are extracted through phrase-structure optimization and word segmentation, and words are combined with their parts of speech to solve the problem of one word having multiple meanings.
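The fourth step, summing the feature-word vectors and averaging them dimension by dimension, can be sketched in plain Python. The three-dimensional embedding table below is a made-up stand-in for a trained word-part-of-speech Word2vec model, and returning a zero vector for a text with no known feature word is an assumption, not something the abstract specifies.

```python
# Hypothetical embeddings keyed by "word/pos"; values chosen for easy arithmetic.
EMBEDDINGS = {
    "awesome/a": [1.0, 0.0, 2.0],
    "not_happy/a": [-1.0, 2.0, 0.0],
}
DIM = 3

def text_vector(sequence, embeddings=EMBEDDINGS, dim=DIM):
    """Average the vectors of known feature words dimension by dimension."""
    vectors = [embeddings[key] for key in sequence if key in embeddings]
    if not vectors:
        return [0.0] * dim  # assumption: fall back to a zero vector
    return [sum(column) / len(vectors) for column in zip(*vectors)]

print(text_vector(["awesome/a", "not_happy/a"]))  # [0.0, 1.0, 1.0]
```

The resulting vectors, one per text, would then serve as the inputs on which the SVM classifier is trained.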

Description

Technical field

[0001] The invention relates to the field of computer science, in particular to an emotion classification method based on part-of-speech combination and feature selection.

Background art

[0002] With the rapid development of social networking platforms, especially Weibo, large numbers of netizens can express their opinions and emotions on social events more conveniently. This has produced a large amount of Weibo comment data, which contains rich opinions and emotional information. How to deeply analyze and mine the emotional tendency of massive Weibo text data has become a hot research direction. Traditional sentiment classification methods focus only on lexical and syntactic features and ignore the semantic features between words.

[0003] Although the word vector model trained by traditional Word2vec can reflect the potential semantic relationship between words, there are often some problems when training the model. First, the Wor...

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F40/289
Inventor: 施佺, 郑亚平, 邵叶秦, 王晗, 周晨璨
Owner: NANTONG UNIVERSITY