
A method and apparatus for cross-language emotion analysis based on emoji

Sentiment analysis and cross-language technology, applied in the software field

Active Publication Date: 2019-02-12
PEKING UNIV
Cites: 7; Cited by: 6

AI Technical Summary

Problems solved by technology

[0005] In view of the existing problems in cross-language sentiment analysis, the purpose of the present invention is to provide a cross-language sentiment analysis method and device built on a semi-supervised representation learning framework that exploits the widespread use of emoji.



Examples


Embodiment Construction

[0057] The classic Amazon review cross-language sentiment analysis task (https://www.uni-weimar.de/en/media/chairs/computer-science-department/webis/data/corpus-webis-cls-10/) is used below to further illustrate and verify the method of the present invention. The task uses English as the source language and Japanese, French, and German as the target languages. For each language, it includes sentiment analysis tasks in three domains: books, DVD, and music. Because it is representative, this task has been used in academia as a benchmark dataset for cross-lingual sentiment analysis. To verify the method of the present invention on this dataset, the model is trained as follows.
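As a rough illustration of the benchmark layout described in this paragraph, the sketch below enumerates one evaluation task per target language and domain. The language codes, dictionary keys, and variable names are illustrative assumptions; the domain names follow the categories of the Webis-CLS-10 corpus (books, DVD, music).

```python
from itertools import product

# Assumed short labels; the corpus itself may use different identifiers.
SOURCE_LANGUAGE = "en"
TARGET_LANGUAGES = ["ja", "fr", "de"]
DOMAINS = ["books", "dvd", "music"]

# One cross-language task per (target language, domain) pair.
tasks = [
    {"source": SOURCE_LANGUAGE, "target": target, "domain": domain}
    for target, domain in product(TARGET_LANGUAGES, DOMAINS)
]

for task in tasks:
    print(f'{task["source"]} -> {task["target"]} ({task["domain"]})')
```

Three target languages and three domains give nine evaluation tasks, all sharing the same English source data.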

[0058] First, tweets in English, Japanese, French, and German were crawled from Twitter, and preprocessed as follows:

[0059] 1) Remove retweets to ensure that each sentence appears in its original context;

[0060] 2) Remove tweets containing URLs to en...
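The two filtering steps visible above could be sketched as follows. This is a minimal illustration, not the patent's implementation: the tweet record layout (`text`, `is_retweet`), the `RT @` prefix heuristic, and the URL pattern are all assumptions made for the example.

```python
import re

# Assumed URL pattern: anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def is_retweet(tweet):
    """Treat a tweet as a retweet if it is flagged or starts with 'RT @' (heuristic)."""
    return bool(tweet.get("is_retweet")) or tweet["text"].lstrip().startswith("RT @")


def contains_url(tweet):
    """Return True if the tweet text contains something that looks like a URL."""
    return URL_PATTERN.search(tweet["text"]) is not None


def preprocess(tweets):
    """Apply step 1) and step 2): drop retweets, then drop tweets containing URLs."""
    return [t for t in tweets if not is_retweet(t) and not contains_url(t)]


# Hypothetical sample records for illustration only.
sample = [
    {"text": "Loved this album", "is_retweet": False},
    {"text": "RT @someone: great movie", "is_retweet": False},
    {"text": "Read my review at https://example.com/r/1", "is_retweet": False},
    {"text": "Not worth the price", "is_retweet": True},
]
kept = preprocess(sample)
```

Only the first sample record survives both filters. Any later preprocessing steps in the original description are not covered by this sketch.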


Abstract

The invention relates to an emoji-based cross-language sentiment analysis method and device. The method includes: 1) building word vectors from a large amount of collected unlabeled text in the source language and the target language; 2) based on the word vectors, selecting the texts that contain emoji from the unlabeled text and using them to set up an emoji prediction task, thereby obtaining a sentence representation model; 3) translating the source-language corpus labeled with sentiment polarity into the target language, obtaining document representations of the original text and the translated text with the sentence representation model, and then training a sentiment classification model on these document representations; 4) using the trained sentiment classification model to classify new text in the target language and obtain its sentiment polarity. The invention uses emoji-bearing text, which is easy to crawl from social platforms, to realize cross-language sentiment analysis, and can alleviate the scarcity of labeled resources and their imbalance across languages.
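To make step 2) more concrete, the sketch below shows one possible way to turn emoji-bearing text into training pairs for an emoji prediction task: the emoji are stripped from the text and used as labels. The Unicode ranges, the function name `make_emoji_examples`, and the choice to emit one pair per distinct emoji are assumptions for illustration; the abstract does not specify these details.

```python
import re

# Assumption: a coarse set of Unicode ranges covering many common emoji.
# A real system would rely on a complete emoji list instead.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def make_emoji_examples(texts):
    """Build (text_without_emoji, emoji_label) pairs from emoji-bearing texts."""
    examples = []
    for text in texts:
        emojis = EMOJI_PATTERN.findall(text)
        if not emojis:
            continue  # texts without emoji carry no label for this task
        # Strip the emoji and normalize whitespace.
        plain = " ".join(EMOJI_PATTERN.sub(" ", text).split())
        if not plain:
            continue  # nothing left to represent once the emoji are removed
        # One example per distinct emoji, keeping first-seen order.
        for emoji in dict.fromkeys(emojis):
            examples.append((plain, emoji))
    return examples


# Hypothetical sample texts for illustration only.
texts = ["great film \U0001F60D\U0001F60D", "so sad \U0001F62D today", "no emoji here"]
examples = make_emoji_examples(texts)
```

A sentence encoder trained to predict the label from the stripped text would then serve as the sentence representation model; the encoder itself is outside the scope of this sketch.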

Description

Technical field

[0001] The invention relates to an emoji-based cross-language sentiment analysis method and device, and belongs to the field of software technology.

Background art

[0002] In recent years, with the development of the Internet, a large number of user-generated texts have emerged online, such as blogs, microblogs, forum discussions, and comments. This volume of user-generated text has aroused researchers' interest in automatic sentiment analysis. Since the early 2000s, sentiment analysis has become one of the hottest research topics in natural language processing, and has been widely used in research fields such as Web mining, data mining, information retrieval, ubiquitous computing, and human-computer interaction. Researchers' enthusiasm for sentiment analysis is largely due to its high practical value. Sentiment analysis technology has been applied to many real scenarios such as customer feedback tracking, sales forecasting,...


Application Information

IPC(8): G06F16/35, G06N3/04
CPC: G06N3/045
Inventor: 刘譞哲, 陈震鹏, 沈晟, 陆璇, 马郓, 黄罡
Owner: PEKING UNIV