A method for building a large-scale cross-domain text sentiment analysis framework

A text sentiment orientation analysis technology, applied in text database clustering/classification, unstructured text data retrieval, data mining, etc. It addresses problems such as the limited classification ability of the SCL algorithm, the inability to align related words perfectly, and the difficulty of determining the number of tasks, and it achieves strong anti-interference ability and enhanced robustness.

Active Publication Date: 2019-08-09
河北广潮科技有限公司
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, the existing algorithms each have limitations in cross-domain text sentiment classification. The SCL algorithm treats the cross-domain problem as a multi-task learning problem and tries to construct a series of related tasks between central features and non-central features. It is difficult to determine a reasonable number of tasks, however, which limits the classification ability of SCL on cross-domain problems. The SFA algorithm uses domain-independent words as a bridge and uses a co-occurrence matrix to align the domain-related words of the source domain and the target domain; each element of the matrix is the number of times a domain-independent word and a domain-related word co-occur.
When domain-independent words are infrequent and co-occur only rarely with domain-related words, some words that are related or similar to a certain extent cannot be aligned perfectly. The FRM algorithm constructs a new vector space model through a feature mapping function under a common feature subspace in order to analyze cross-domain text sentiment orientation, but every time it is applied to a new domain, a new mapping function must be rebuilt to form a new space model, which is relatively cumbersome.
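
The co-occurrence matrix that SFA relies on can be illustrated with a small sketch. It is not the SFA algorithm itself, only the counting step described above, and it assumes that co-occurrence is counted once per document; the function name `cooccurrence_matrix`, the toy reviews, and the two word sets are hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_matrix(documents, independent_words, specific_words):
    """Count, for each (domain-related, domain-independent) word pair,
    how many documents contain both words.

    Assumption: co-occurrence is counted at document level, once per document.
    """
    counts = Counter()
    for tokens in documents:
        present = set(tokens)
        pairs = product(present & specific_words, present & independent_words)
        for specific, independent in pairs:
            counts[(specific, independent)] += 1
    rows = sorted(specific_words)      # domain-related words
    cols = sorted(independent_words)   # domain-independent words
    matrix = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, matrix


# Hypothetical, already segmented reviews from two domains.
documents = [
    ["good", "sharp", "screen"],     # electronics
    ["good", "thrilling", "plot"],   # movies
    ["bad", "blurry", "screen"],     # electronics
    ["bad", "boring", "plot"],       # movies
    ["good", "sharp"],               # electronics
]
independent_words = {"good", "bad"}
specific_words = {"sharp", "blurry", "thrilling", "boring"}

rows, cols, matrix = cooccurrence_matrix(documents, independent_words, specific_words)
for word, row in zip(rows, matrix):
    print(word, dict(zip(cols, row)))
```

In this toy data "sharp" and "thrilling" both co-occur only with "good", which is the signal used to align them across domains. The sparsity problem is also visible: "thrilling" has a single co-occurrence, so its alignment rests on very little evidence.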

Examples

Embodiment Construction

[0034] The present invention will be described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0035] The hardware used by the present invention is one PC; the auxiliary tools are the NLPIR word segmentation system and Google's Word2vec tool.

[0036] As shown in Figure 1, the present invention provides a method for establishing a large-scale cross-domain text sentiment analysis framework, which specifically includes the following steps:

[0037] Step 1. Precisely segment the samples in the source domain and the target domain.

[0038] Step 1.1: Obtain the input sample files of the source domain and the target domain.

[0039] The sample files all come from publicly available online datasets for sentiment analysis, which consist of reviews of movies, products, news, etc. that carry the users' sentiment orientation.

[0040] Step 1.2, using the NLPIR word segmentation system to perform word segmentation on the sample...

Abstract

The invention discloses a method for establishing a large-scale cross-domain text sentiment orientation analysis framework. The method comprises the following steps: precise word segmentation is performed on the sample documents of the source domain and the target domain to form two word vector tables; the word vectors are clustered and the domains are aligned; preliminary sentence modeling is performed on the labeled samples of the source domain using the word vectors, and the result serves as the input of the DCELM, where intermediate-layer abstract features of the text vectors are extracted by convolution; the convolution-layer parameters recorded when classification on the validation set is best are used as the parameters of the DCELM network's convolution layers; finally, the intermediate-layer abstract features that the DCNN extracts from a small number of labeled target-domain samples are used to train the hidden-layer parameters of the ELM classifier, establishing the large-scale cross-domain text sentiment orientation analysis framework. This technical scheme eliminates, at the sample level, the differences between domains in the words used to express sentiment polarity, effectively overcomes the tendency of a fully connected layer to fall into local optima and generalize poorly, and improves the anti-interference performance of the model.
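
The last step of the abstract, where an ELM classifier is trained on extracted features, can be illustrated with a minimal sketch. It assumes the standard extreme learning machine formulation (random, fixed input weights and output weights solved in closed form by ridge-regularized least squares), which may differ from the patent's exact training procedure. The DCNN feature extractor is not reproduced; the 2-D toy feature vectors, the class name `TinyELM`, the helper `solve_linear`, and all hyperparameters are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b for a square matrix a by Gaussian elimination
    with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class TinyELM:
    """Single-hidden-layer ELM for binary labels in {-1, +1} (illustrative only)."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-2, seed=0):
        rng = random.Random(seed)
        # Input weights and biases are drawn once at random and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = None  # output weights, set by fit()

    def _hidden(self, x):
        return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys):
        h = [self._hidden(x) for x in xs]
        n = len(self.b)
        # Normal equations: (H^T H + ridge * I) beta = H^T y
        hth = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                for j in range(n)] for i in range(n)]
        hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n)]
        self.beta = solve_linear(hth, hty)
        return self

    def predict(self, x):
        score = sum(bk * hk for bk, hk in zip(self.beta, self._hidden(x)))
        return 1 if score >= 0 else -1


# Hypothetical stand-ins for abstract features of labeled target-domain samples.
features = [(2.0, 2.0), (2.5, 1.5), (1.5, 2.5),
            (-2.0, -2.0), (-2.5, -1.5), (-1.5, -2.5)]
labels = [1, 1, 1, -1, -1, -1]

model = TinyELM(n_inputs=2, n_hidden=8).fit(features, labels)
train_predictions = [model.predict(x) for x in features]
print(train_predictions)
```

Because only the output weights are solved, and in closed form, this style of classifier has no iterative gradient descent over the final layer, which is consistent with the abstract's point about avoiding local optima in a fully connected layer.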

Description

Technical field

[0001] The invention belongs to the technical field of data mining, and in particular relates to a deep-learning-based method for establishing a large-scale cross-domain text sentiment orientation analysis framework.

Background

[0002] As an important tool for human communication, natural language undeniably carries the emotions of the people communicating. Moreover, with the rapid development of the Internet, a large number of user comments expressing subjective opinions on products, movies, news and other things have appeared online. Analyzing this subjective text can not only give consumers a decision-making reference when purchasing products, but also help merchants sell products and identify new market demands. First of all, however, these comments carrying user opinions and emotions are increasing at an exponential rate every day, so analyzing them purely by hand is a very challenging task; s...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35
CPC: G06F16/35, G06F2216/03
Inventor: 贾熹滨, 靳亚, 李宁
Owner: 河北广潮科技有限公司