Compression method of neural language model based on tensor decomposition technology

A compression method for neural language models based on tensor decomposition technology, applied in the field of neural language model compression. It addresses problems such as excessive model parameters and the lack of work combining multiple tensor compression methods, and achieves the effect of reducing the number of model parameters.

Pending Publication Date: 2020-04-14
Applicant: TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0004] Tensor techniques have emerged as a model compression technology. In general, they are applied in isolation to the input layer or the fully connected layers of a model. Such compression alleviates the problem of excessive model parameters to a certain extent; however, work that combines multiple tensor compression methods to compress the internal structure of the model is still missing.
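As an illustration of the kind of single-layer compression referred to above (not the patent's own method), the following sketch factorizes a fully connected layer's weight matrix into two thin factors via truncated SVD, the simplest decomposition of this family. All sizes and names are hypothetical.

```python
# Minimal sketch, assuming a rank-r factorization of one dense layer.
import numpy as np

d_in, d_out, rank = 1024, 1024, 64          # illustrative layer sizes
W = np.random.randn(d_in, d_out)            # original dense weight

# Truncated SVD gives two thin factors with W ~= U_r @ V_r.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
U_r = U[:, :rank] * s[:rank]                 # (d_in, rank)
V_r = Vt[:rank, :]                           # (rank, d_out)

def dense_layer(x):                          # d_in * d_out parameters
    return x @ W

def factored_layer(x):                       # rank * (d_in + d_out) parameters
    return (x @ U_r) @ V_r

x = np.random.randn(8, d_in)
print(dense_layer(x).shape, factored_layer(x).shape)
print("params:", W.size, "->", U_r.size + V_r.size)
```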

Method used



Examples


Embodiment Construction

[0036] The technical scheme of the present invention is described in further detail below in conjunction with the accompanying drawings, but the protection scope of the present invention is not limited to the following description. Figure 1 shows the flow chart of the compression method proposed by the present invention; Figure 2 shows the neural network model diagram designed by the present invention; Figure 3 is a schematic diagram of the tensor representation that can reconstruct the original attention.

[0037] The invention discloses a method for compressing the parameters of a neural language model based on tensor decomposition. Owing to the outstanding performance of self-attention language models on natural language processing tasks, pre-training (Pre-training) language models with an encoder-decoder (Encoder-Decoder) structure based on the self-attention mechanism have become a research hotspot in the field of natural language processing...



Abstract

The invention discloses a compression method for a neural language model based on tensor decomposition technology. Starting from a linear representation of the original attention function, the method proves that the attention function can be linearly represented by a set of orthonormal basis vectors; the parameters are then compressed by sharing this set of basis vectors when the multi-head mechanism is constructed. At the same time, modeling in a tensor-slicing manner gives the neural network model stronger discrimination capability. The method provides a new idea for developing neural network models with few parameters and high accuracy.
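The abstract's idea of sharing one set of basis vectors across attention heads can be illustrated with a minimal sketch. The construction below is a simplified assumption, not the patent's exact formulation: every head projects the input onto a single shared orthonormal basis and differs only by a small per-head mixing matrix, so head-specific parameters shrink from 3*H*d*k to 3*H*k*k plus one shared d*k basis.

```python
# Minimal sketch, assuming shared-basis multi-head attention (illustrative only).
import numpy as np

d, k, H = 512, 64, 8                        # model dim, head dim, number of heads

# Shared orthonormal basis: columns of B span a k-dim subspace of R^d.
B, _ = np.linalg.qr(np.random.randn(d, k))   # (d, k), with B.T @ B = I

# Small per-head mixing matrices (the only head-specific parameters).
Mq = np.random.randn(H, k, k)
Mk = np.random.randn(H, k, k)
Mv = np.random.randn(H, k, k)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def shared_basis_attention(X):
    """X: (seq_len, d) -> concatenated head outputs, shape (seq_len, H*k)."""
    Xb = X @ B                               # project once onto the shared basis
    outs = []
    for h in range(H):
        Q, K, V = Xb @ Mq[h], Xb @ Mk[h], Xb @ Mv[h]
        A = softmax(Q @ K.T / np.sqrt(k))    # (seq_len, seq_len) attention map
        outs.append(A @ V)
    return np.concatenate(outs, axis=-1)

X = np.random.randn(10, d)
print(shared_basis_attention(X).shape)       # (10, 512)
```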

Description

Technical field

[0001] The present invention relates to the field of neural language model compression, and in particular to compressing the original attention function of the Transformer neural network model.

Background technique

[0002] With the development of artificial intelligence, neural language pre-training (Pre-training) models have demonstrated their effectiveness on most tasks in the field of natural language processing. The Transformer model is based on the attention mechanism, replacing the recurrent neural network and the convolutional neural network. This model has since been extensively extended and plays a key role in many other pre-trained language models, for example the BERT pre-trained model. However, many of these pre-trained models are difficult to deploy under limited resources because of their large number of parameters. Therefore, the compression of pre-trained language models is an important research problem.

[0003] Several methods ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F40/126; G06F40/58; G06N3/08
CPC: G06N3/084
Inventors: 马鑫典, 张鹏, 张帅
Owner: TIANJIN UNIV