Text classification method based on deep multi-task learning

A multi-task learning and text classification technology, applied in the field of natural language processing. It addresses the problems of insufficient training data and reduced generalization ability on the test set, and achieves the effects of overcoming insufficient training data and improving classification performance.

Inactive Publication Date: 2017-05-31
SUN YAT SEN UNIV


Problems solved by technology

The neural network has a large number of parameters while the training data is limited, so the model easily overfits and its generalization ability on the test set declines.
Th...


Examples


Embodiment 1

[0038] As shown in Figures 1-2, a text classification method based on deep multi-task learning includes the following steps, sketched in code after the list:

[0039] S1: Use word vectors and bidirectional recurrent networks to learn the document representation of the current task;

[0040] S2: Extract features from document representations of other tasks using convolutional neural networks;

[0041] S3: Learn a classifier using the document representation of the current task and the features of other tasks.
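The three steps can be read as one forward pass. Below is a minimal PyTorch sketch of such a pipeline; the LSTM encoders, the shared 1-D convolution over the auxiliary representations, the pooling choices, and every layer size are illustrative assumptions, not the configuration claimed in the patent.

```python
# Hypothetical sketch of steps S1-S3 in PyTorch. Layer types and sizes are
# assumptions for illustration only.
import torch
import torch.nn as nn


class MultiTaskTextClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim,
                 num_aux_tasks, cnn_channels, num_classes):
        super().__init__()
        # Word-vector matrix shared by all tasks.
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # S1: bidirectional recurrent network for the current task.
        self.current_rnn = nn.LSTM(emb_dim, hidden_dim,
                                   batch_first=True, bidirectional=True)
        # Bidirectional recurrent networks producing the other tasks'
        # document representations (one per auxiliary task).
        self.aux_rnns = nn.ModuleList(
            nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
            for _ in range(num_aux_tasks))
        # S2: convolutional feature extractor over the auxiliary representations.
        self.aux_cnn = nn.Conv1d(2 * hidden_dim, cnn_channels,
                                 kernel_size=3, padding=1)
        # S3: classifier over the current representation plus auxiliary features.
        self.classifier = nn.Linear(2 * hidden_dim + cnn_channels, num_classes)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)                   # (B, T, K)
        cur_out, _ = self.current_rnn(emb)                # (B, T, 2H)
        cur_repr = cur_out.mean(dim=1)                    # pooled document representation

        aux_features = []
        for rnn in self.aux_rnns:
            aux_out, _ = rnn(emb)                         # (B, T, 2H)
            conv_out = torch.relu(self.aux_cnn(aux_out.transpose(1, 2)))  # (B, C, T)
            aux_features.append(conv_out.max(dim=2).values)               # (B, C)
        aux_repr = torch.stack(aux_features).mean(dim=0)  # average over auxiliary tasks

        return self.classifier(torch.cat([cur_repr, aux_repr], dim=1))
```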

[0042] The specific process of step S1 is:

[0043] Perform word segmentation on all Chinese documents across all tasks. Assuming there are N distinct words in total, assign each word a unique index and represent it as a K-dimensional vector, so that all word vectors form an N*K matrix; this matrix is randomly initialized from a normal distribution and is shared by all tasks;

[0044] The document representation of the current task is learned with word vectors and a bidirectional recurrent network ...
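A minimal sketch of this S1 setup, assuming PyTorch: a shared N*K word-vector matrix initialized from a normal distribution (the standard deviation used here is an arbitrary assumption) and a bidirectional recurrent network (a GRU, chosen only for illustration) that turns a segmented document into its representation.

```python
# Sketch of S1: shared word-vector matrix plus a bidirectional recurrent network.
import torch
import torch.nn as nn

N_WORDS, K_DIM, HIDDEN = 50000, 128, 100        # illustrative sizes

# Shared N x K word-vector matrix, randomly initialized from a normal
# distribution and reused by every task.
shared_embedding = nn.Embedding(N_WORDS, K_DIM)
nn.init.normal_(shared_embedding.weight, mean=0.0, std=0.1)

# Bidirectional recurrent network for the current task.
bi_rnn = nn.GRU(K_DIM, HIDDEN, batch_first=True, bidirectional=True)

token_ids = torch.randint(0, N_WORDS, (4, 60))  # a batch of 4 segmented documents
word_vectors = shared_embedding(token_ids)      # (4, 60, K_DIM)
outputs, _ = bi_rnn(word_vectors)               # (4, 60, 2 * HIDDEN)
doc_repr = outputs.mean(dim=1)                  # pooled document representation
```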



Abstract

The invention provides a text classification method based on deep multi-task learning. By means of recurrent neural networks trained on other tasks, combined with the learning ability of a convolutional neural network, additional document representations are obtained; that is, a large amount of external information is introduced and the semantic representation of a document is extended, which effectively solves the problem of insufficient training data. Accordingly, compared with traditional multi-task learning methods, the convolutional neural network performs feature extraction on the bottom-layer features of the auxiliary tasks, so that features from other tasks can be effectively transferred to the current task and the performance of text classification is improved.
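The abstract does not describe the training procedure. A common scheme for deep multi-task models of this kind, sketched below purely as an assumption, is to interleave mini-batches from the different tasks so that the shared word vectors and encoders are updated by all of them.

```python
# Hypothetical alternating-batch training loop for a multi-task text classifier.
import random
import torch.nn.functional as F


def train_multitask(model, optimizer, task_batches, epochs=5):
    """task_batches maps a task name to a list of (token_ids, labels) mini-batches."""
    for epoch in range(epochs):
        batches = [(name, b) for name, bs in task_batches.items() for b in bs]
        random.shuffle(batches)                 # interleave tasks within the epoch
        epoch_loss = {name: 0.0 for name in task_batches}
        for name, (token_ids, labels) in batches:
            optimizer.zero_grad()
            logits = model(token_ids)           # forward pass of the shared model
            loss = F.cross_entropy(logits, labels)
            loss.backward()
            optimizer.step()
            epoch_loss[name] += loss.item()
        print(f"epoch {epoch}: per-task loss {epoch_loss}")
```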

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, and more particularly to a text classification method based on deep multi-task learning.

Background technique

[0002] With the development of the Internet, there is growing demand for tasks such as topic identification, spam identification, and sentiment analysis, all of which are based on text classification. The goal of text classification is, given some documents and their corresponding class labels as a training set, to learn a classifier through an algorithm that can predict the class labels of unlabeled documents in the test set.

[0003] There are many text classification algorithms based on deep neural networks, including recurrent neural networks, convolutional neural networks, recurrent convolutional neural networks, and combinations of these networks with attention mechanisms, memory modules, and so on. These neural networks have achieved good re...


Application Information

IPC(8): G06F17/30, G06N3/08
CPC: G06F16/35, G06N3/08
Inventor: 张梓滨, 潘嵘
Owner: SUN YAT SEN UNIV