The invention relates to a multi-class emotion analysis method and system for bilingual microblog text, and belongs to the technical field of microblog text emotion analysis. The method comprises the following steps: (1)
bilingual emotion dictionary construction: a corpus of a certain size with emotional tendencies is first collected, high-frequency words with emotional tendencies are extracted from the corpus, the emotion dictionary is then expanded by using an existing knowledge base and a word similarity calculation model, and finally Internet slang and emoticons are added to the emotion dictionary; (2) text preprocessing: the to-be-identified text is segmented into words, stop words are removed, and
English word forms are normalized; (3) text feature space representation: the text is vectorized by using the bilingual emotion dictionary; (4) emotion classification: the emotion recognition task on the corpus text is carried out by a multi-emotion
classification model. The accuracy and F1 value of the method are higher than those of traditional classification methods; in particular, with a small-scale training set, the classification performance of the semi-supervised Gaussian mixture model classification algorithm is clearly better than that of the other methods.
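
Steps (2) and (3) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the dictionary entries, emotion classes, stop-word list, lemma table, and the function names `tokenize`, `normalize_english`, and `vectorize` are all hypothetical, forward maximum matching stands in for whatever word segmenter is actually used, and per-class hit counts are only one plausible form of the feature vector. The dictionary construction of step (1) and the classifier of step (4) are not shown.

```python
import re
from collections import Counter

# Hypothetical toy bilingual emotion dictionary: term -> emotion class.
# It mixes English words, Chinese words, and emoticons.
EMOTION_DICT = {
    "happy": "joy", "开心": "joy", ":)": "joy",
    "angry": "anger", "生气": "anger",
    "sad": "sadness", "难过": "sadness", ":(": "sadness",
}
EMOTIONS = ["joy", "anger", "sadness"]          # assumed class order
STOP_WORDS = {"i", "am", "so", "the", "a", "的", "了"}  # assumed list
LEMMAS = {"happier": "happy", "angrier": "angry"}      # assumed lemma table

ENGLISH_WORD = re.compile(r"[A-Za-z]+")


def tokenize(text, vocab, max_len=4):
    """Segment mixed text: English words by regex, everything else by
    forward maximum matching against the dictionary vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        match = ENGLISH_WORD.match(text, i)
        if match:
            tokens.append(match.group(0))
            i = match.end()
            continue
        # Try the longest candidate first; fall back to a single character.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if n == 1 or text[i:i + n] in vocab:
                tokens.append(text[i:i + n])
                i += n
                break
    return tokens


def normalize_english(token):
    """Lowercase a token and map it to its base form when one is known."""
    token = token.lower()
    return LEMMAS.get(token, token)


def vectorize(text):
    """Return one count per emotion class, based on dictionary hits."""
    tokens = [normalize_english(t) for t in tokenize(text, EMOTION_DICT)]
    tokens = [t for t in tokens if t not in STOP_WORDS]
    counts = Counter(EMOTION_DICT[t] for t in tokens if t in EMOTION_DICT)
    return [counts[e] for e in EMOTIONS]


print(vectorize("I am so Happy today 开心 :)"))
print(vectorize("我生气了 :("))
```

Vectors of this kind would then be passed to the multi-emotion classification model of step (4).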