Movie comment sentiment analysis method based on document vector

A document vector and sentiment analysis technology, applied in semantic analysis, text database clustering/classification, electronic digital data processing, etc. It addresses problems such as degraded sentiment classification performance and document vectors that cannot take the order of words in comments into account.

Active Publication Date: 2020-04-28
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

However, a document vector computed as the weighted average of word vectors cannot take the order of words in a comment into account, which degrades sentiment classification performance.
In addition, this type of method trains word vectors in an unsupervised way, so the trained word vectors represent only the semantic and grammatical information of words and not sentiment-related information, which also degrades sentiment classification performance.
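
The first limitation above can be illustrated with a minimal Python sketch. It is not the patented method: the 2-dimensional embeddings in `TOY_EMBEDDINGS` are hypothetical values chosen for illustration, and an equal-weight average stands in for the weighted average. Two comments that contain the same words in a different order, and that read with opposite sentiment, receive the same averaged vector.

```python
def average_vectors(vectors):
    """Element-wise mean of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

# Hypothetical 2-dimensional word embeddings, for illustration only.
TOY_EMBEDDINGS = {
    "not": [1.0, 0.0],
    "good": [0.0, 2.0],
    "bad": [0.0, -2.0],
    "really": [3.0, 1.0],
}

def averaged_document_vector(comment):
    """Average the embeddings of the whitespace-separated words in a comment."""
    words = comment.split()
    return average_vectors([TOY_EMBEDDINGS[word] for word in words])

# Same bag of words, different order, different meaning.
vec_a = averaged_document_vector("not good really bad")
vec_b = averaged_document_vector("not bad really good")

# The averaged representation cannot tell the two comments apart.
print(vec_a == vec_b)
```

Because addition is commutative, any reordering of the same words yields the same average, so a classifier trained on such vectors has no access to word order.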

Examples

Embodiment Construction

[0079] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments, and the purpose and effects of the present invention will become more apparent.

[0080] Figure 1 shows the flow chart of the method of the present invention. The method for sentiment analysis of movie reviews based on document vectors is divided into four steps: data preprocessing, training the improved document vector model, predicting the feature vectors of movie reviews, and predicting the sentiment category of movie reviews.

[0081] (1) In step 101, the specific steps of data preprocessing are as follows:

[0082] (1.1) Delete non-text information such as special symbols from comments. Comments may include symbols that carry no meaning for sentiment, so regular expressions are used to remove special symbols from comments.
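
Step (1.1) can be sketched in Python as follows. The text does not say which characters count as special symbols, so the pattern below is an assumption: it treats everything other than Unicode word characters and whitespace as a special symbol. The function name `strip_special_symbols` is hypothetical.

```python
import re

# Assumption: anything that is not a word character or whitespace is "special".
# In Python 3, \w matches Unicode letters and digits, so Chinese text is kept.
_NON_TEXT = re.compile(r"[^\w\s]")
_WHITESPACE_RUN = re.compile(r"\s+")

def strip_special_symbols(comment: str) -> str:
    """Replace special symbols with spaces, then collapse repeated whitespace."""
    without_symbols = _NON_TEXT.sub(" ", comment)
    return _WHITESPACE_RUN.sub(" ", without_symbols).strip()

cleaned = strip_special_symbols("Great movie!!! <3 :)")
print(cleaned)
```

Replacing symbols with a space rather than deleting them keeps words apart when a symbol was the only separator between them. Note that `\w` also keeps digits and underscores; a stricter pattern would be needed if those should be removed as well.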

[0083] (1.2) Remove reviews with fewer words than the minimum word count (set to 5) and reviews with missing an...
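
The minimum word count filter in step (1.2) can be sketched as below. This covers only the length condition. It assumes reviews are already split into whitespace-separated words (Chinese text would first need a word segmenter, which is not shown), and the function names are hypothetical.

```python
MIN_WORDS = 5  # minimum word count stated in step (1.2)

def has_enough_words(review: str, min_words: int = MIN_WORDS) -> bool:
    """True when the review has at least `min_words` whitespace-separated words."""
    return len(review.split()) >= min_words

def filter_short_reviews(reviews, min_words: int = MIN_WORDS):
    """Keep only reviews that meet the minimum word count."""
    return [review for review in reviews if has_enough_words(review, min_words)]

kept = filter_short_reviews([
    "too short",
    "this film was really very good",
])
print(kept)
```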

Abstract

The invention discloses a movie comment sentiment analysis method based on document vectors. The method comprises the following steps: first, cleaning and preprocessing movie comments to construct a sentiment analysis data set; then training an improved document vector model using the movie comments and their corresponding scores; next, inputting the movie comments into the improved document vector model to obtain the word vectors and document vector of each comment, averaging the word vectors of the comment, and concatenating the averaged word vector with the document vector to generate a feature vector; and finally, training a classification model on the feature vectors and the corresponding scores and using it to classify the sentiment of movie comments. The improved document vector generation method increases the accuracy of movie comment sentiment classification.
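
The feature construction described in the abstract (average the word vectors, then concatenate the result with the document vector) can be sketched in plain Python. The function names, the toy vectors, and the concatenation order (averaged word vector first) are assumptions; the abstract does not specify them.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length word vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def build_feature_vector(word_vectors, doc_vector):
    """Concatenate the averaged word vector with the document vector."""
    return mean_vector(word_vectors) + list(doc_vector)

# Toy values for illustration: two 2-dimensional word vectors
# and one 3-dimensional document vector.
word_vectors = [[1.0, 3.0], [3.0, 5.0]]
doc_vector = [0.5, 0.25, 0.125]

features = build_feature_vector(word_vectors, doc_vector)
print(len(features))  # word dimension plus document dimension
```

The resulting feature vector has a length equal to the word vector dimension plus the document vector dimension, and it would serve as the input to the classification model.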

Description

Technical field

[0001] The invention belongs to the field of text classification, and in particular relates to a method for sentiment analysis of movie reviews based on document vectors.

Background technique

[0002] Movie reviews are the comments and opinions that users publish after watching movies. On the one hand, movie reviews are an important basis on which users choose movies: users usually learn about the characteristics, strengths and weaknesses of a movie by reading its reviews. On the other hand, producers hope to understand the possible problems of a film and the needs of users through user reviews, and can improve the quality of the film by analyzing them. Sentiment analysis, as an important part of user review analysis, can classify user comments according to emotional polarity, and can count the ratio of positive and negative emotions in the film to have a more intuitive understand...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35, G06F40/242, G06F40/279, G06F40/30
CPC: G06F16/35, G06F16/3344
Inventor: 夏言, 杜歆
Owner: ZHEJIANG UNIV