Text classification method and device, medium and electronic equipment

A technique for classifying target text, applicable to text database clustering/classification, neural learning methods, unstructured text data retrieval, etc.

Active Publication Date: 2021-09-24
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

However, pre-trained models are very large in size and parameter count. When the target task has only a small amount of sample data, directly applying such a model often causes it to form spurious connections between large amounts of task-irrelevant information and the target label; this overfitting makes it difficult to learn effective information through fine-tuning alone.
At the same time, because manual labeling of data is very expensive and time-consuming in real application scenarios, the amount of sample data for many downstream tasks is very limited, which restricts the popularization and application of pre-trained models.




Embodiment Construction

[0032] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with aspects of the invention as recited in the appended claims.

[0033] Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus repeated descriptions thereof will be omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities.



Abstract

The invention relates to the field of natural language processing, and discloses a text classification method and device, a medium and electronic equipment. The method comprises:
  • obtaining target text data;
  • inputting the target text data into a pre-trained text classification model;
  • outputting, through a variational information bottleneck processing layer, compressed sentence representation information corresponding to the target text data and an expected value corresponding to the compressed sentence representation information;
  • outputting classification prediction information through a classification module according to the compressed sentence representation information received from the variational information bottleneck processing layer; and
  • generating and outputting, through a classification label generation layer, a classification label corresponding to the target text data according to the classification prediction information received from the classification module and the expected value received from the variational information bottleneck processing layer.
According to the method, the occurrence of overfitting is reduced, and the range of popularization and application of pre-trained models is expanded.
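The abstract's pipeline (sentence representation → variational information bottleneck → classifier → label) can be sketched as follows. This is a minimal NumPy illustration of the general variational-information-bottleneck technique, not the patented implementation: all function names, weight matrices, and dimensions are illustrative assumptions, and the "expected value" is rendered here as the standard KL-divergence regularizer of a VIB layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def vib_layer(h, W_mu, W_logvar, rng):
    """Variational information bottleneck (hypothetical sketch).

    Compresses a batch of sentence representations h into stochastic
    codes z via the reparameterization trick, and returns the KL term
    KL( N(mu, sigma^2) || N(0, I) ) -- the regularizing 'expected
    value' associated with the compressed representation.
    """
    mu = h @ W_mu
    logvar = h @ W_logvar
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps  # reparameterized sample
    # Per-dimension KL to a standard normal, summed over code dims,
    # averaged over the batch; always non-negative.
    kl = 0.5 * np.mean(np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1))
    return z, kl

def classify(z, W_cls):
    """Softmax classifier over the compressed codes."""
    logits = z @ W_cls
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Toy shapes: 4 sentence vectors of dim 8 -> code dim 3 -> 2 classes.
h = rng.standard_normal((4, 8))
W_mu = 0.1 * rng.standard_normal((8, 3))
W_logvar = 0.1 * rng.standard_normal((8, 3))
W_cls = 0.1 * rng.standard_normal((3, 2))

z, kl = vib_layer(h, W_mu, W_logvar, rng)
probs = classify(z, W_cls)
labels = probs.argmax(axis=1)  # predicted classification labels
```

In training, the classifier's cross-entropy loss would be combined with the KL term (weighted by a bottleneck coefficient), which is what discourages the model from passing task-irrelevant information from the large pre-trained representation to the label.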

Description

Technical field

[0001] The present disclosure relates to the technical field of natural language processing, and in particular to a text classification method, device, medium and electronic equipment.

Background technique

[0002] Currently, pre-trained models are widely used in downstream tasks and have achieved good results. However, pre-trained models are very large in size and parameter count. When the target task has only a small amount of sample data, directly applying such a model often causes it to form spurious connections between large amounts of task-irrelevant information and the target label; this overfitting makes it difficult to learn effective information through fine-tuning alone. At the same time, since manual labeling of data is very expensive and time-consuming in real-world application scenarios, the amount of sample data for many downstream tasks is very limited, which restricts the popularization and application of pre-trained models.

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F16/35; G06K9/62; G06N3/04; G06N3/08
CPC: G06F16/353; G06N3/04; G06N3/084; G06F18/2414; Y02D10/00
Inventor: 司世景 (Si Shijing), 王健宗 (Wang Jianzong)
Owner: PING AN TECH (SHENZHEN) CO LTD