Text classification method based on semi-supervised transfer learning

A text classification and transfer learning technology, applied in the fields of machine learning, instruments, and character and pattern recognition, that addresses the need for large amounts of data and the shortage of labeled data, and improves model performance.

Pending Publication Date: 2021-12-17
CHINA THREE GORGES UNIV

AI Technical Summary

Problems solved by technology

[0002] Deep learning is an important method in the field of natural language processing. It can achieve good performance in tasks such as text classification, entity recognition, machine translation, and sentiment analysis. However, deep learning has a fundamental weakness: it requires a large amount of labeled data to work properly.
In many real-world fields, though, labeled data is scarce, and labeling a large amount of unlabeled data is very expensive.

Embodiment Construction

[0021] In the description of the present invention, it should be noted that the orientation or positional relationship indicated by terms such as "upper", "lower", "inner", "outer", "front end", "rear end", "both ends", "one end", and "another end" is based on the orientation or positional relationship shown in the drawings. These terms are used only for the convenience of describing the present invention and simplifying the description. They do not indicate or imply that the referred device or element must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as limiting the invention. In addition, the terms "first" and "second" are used for descriptive purposes only, and should not be understood as indicating or implying relative importance.

[0022] In the description of the present invention, it should be noted that, unless otherwise specified and limited, the terms "installed", "set with", "connected", etc. should be under...

Abstract

The invention discloses a text classification method based on semi-supervised transfer learning. The method comprises the following steps: (1) data set and data preprocessing: acquire a small labeled data set and a large unlabeled data set, clean and denoise them, and vectorize the samples with word2vec using a vector dimension of 100; (2) data augmentation: apply text augmentation K times to each sample in the unlabeled data by back-translation; (3) pseudo-label prediction: input the labeled samples into the pre-trained model BERT and transfer the model by fine-tuning; (4) sample mixing; and (5) text classification: use the best trained model to predict the classes of the data in the test set. By combining semi-supervised learning with transfer learning, the method addresses the difficulty of obtaining annotated data in text classification while also improving the performance of the text classification model.
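The abstract names the pseudo-label and sample-mixing steps but does not say how they are computed. The following is a minimal sketch under stated assumptions, not the patented procedure: it assumes a MixMatch-style scheme in which the class probabilities predicted for the K back-translated copies of an unlabeled sample are averaged and sharpened into a pseudo-label, and in which two samples are mixed by mixup-style linear interpolation. The function names, the temperature parameter, and the mixing weight `lam` are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def guess_pseudo_label(aug_probs: Sequence[Sequence[float]],
                       temperature: float = 0.5) -> Vector:
    """Average class probabilities over K augmented copies, then sharpen.

    aug_probs holds one probability vector per back-translated copy of the
    same unlabeled sample (assumed to come from the fine-tuned classifier).
    """
    k = len(aug_probs)
    n_classes = len(aug_probs[0])
    mean = [sum(p[c] for p in aug_probs) / k for c in range(n_classes)]
    # Sharpening (an assumption, not stated in the abstract): raise each
    # probability to 1/T and renormalize, which lowers the entropy.
    powered = [m ** (1.0 / temperature) for m in mean]
    total = sum(powered)
    return [v / total for v in powered]


def mix_samples(x1: Sequence[float], y1: Sequence[float],
                x2: Sequence[float], y2: Sequence[float],
                lam: float) -> Tuple[Vector, Vector]:
    """Mixup-style convex combination of two (features, label) pairs."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


# Hypothetical predictions for K = 2 back-translated copies of one sample.
pseudo = guess_pseudo_label([[0.6, 0.4], [0.8, 0.2]], temperature=0.5)

# Mix a labeled sample (one-hot label) with the pseudo-labeled sample.
mixed_x, mixed_y = mix_samples([1.0, 0.0], [1.0, 0.0],
                               [0.0, 1.0], pseudo, lam=0.75)
```

In this sketch the feature vectors are short toy lists; in the described method they would be the 100-dimensional word2vec representations or the model's hidden representations, which the abstract does not specify.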

Description

Technical field

[0001] The invention relates to the technical field of text classification, in particular to a text classification method based on semi-supervised transfer learning.

Background art

[0002] Deep learning is an important method in the field of natural language processing. It can achieve good performance in tasks such as text classification, entity recognition, machine translation, and sentiment analysis. However, deep learning has a fundamental weakness: it requires a large amount of labeled data to work properly. In many real-world fields, labeled data is often insufficient, and labeling a large amount of unlabeled data is a very expensive task. Therefore, how to reduce the amount of labeled sample data that deep learning models require while improving their predictive performance is an important research topic.

Contents of the invention

[0003] The purpose of the present invention is to p...

Application Information

IPC(8): G06K9/00, G06K9/62, G06N20/00
CPC: G06N20/00, G06F18/2155, G06F18/24
Inventor: 余肖生, 张合欢, 沈胜
Owner: CHINA THREE GORGES UNIV