LSTM model optimization method, accelerator, device and medium

An optimization method and model technology, applied to neural learning methods, biological neural network models, neural architectures, and similar areas. It addresses problems such as difficulty in deploying models, high power consumption of computing platforms, and the inability of such platforms to handle the LSTM workload, with the effects of a broader range of applicability, improved computing efficiency and speed, and easier hardware deployment.

Pending Publication Date: 2021-10-22

SHENZHEN ECHIEV AUTONOMOUS DRIVING TECH CO LTD

0 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

Traditional computing platforms cannot handle the large volume of computation that LSTM requires.

In embedded applications, especially in areas with extremely strict latency requirements such as autonomous driving, the LSTM model itself has a huge number of parameters, as well as a large amount of training data and inference test data. As a result, the computational complexity is high and the power consumption of the computing platform is also very large.

It is difficult to deploy models on embedded devices with extremely strict power consumption requirements.

[0003] For hardware-accelerated computation of sparse LSTM models, the industry introduced the Delta algorithm, which exploits the numerical similarity of sequence data to construct and mine sparsity in that data, then reconstructs the LSTM model and accelerates the algorithm. However, this method is limited by the time dependence of sequence data: the input data at adjacent time steps must be highly similar, so its scope of application is clearly limited. In addition, implementing hardware acceleration for Delta-based LSTM model reconstruction is complicated, which is not conducive to hardware deployment.
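
The idea behind delta-style sparsity can be illustrated with a rough sketch. This is not the referenced Delta algorithm's actual implementation; the function names `delta_encode` and `sparsity` and the `threshold` parameter are assumptions made for illustration. It only shows why similar adjacent inputs produce many zeros, and why dissimilar inputs do not.

```python
def delta_encode(sequence, threshold):
    """Return per-step delta vectors, zeroing changes smaller than threshold.

    Hypothetical helper: each element is compared with the last value
    that was actually propagated for that position.
    """
    reference = [0.0] * len(sequence[0])  # last propagated value per element
    deltas = []
    for x in sequence:
        step = []
        for i, value in enumerate(x):
            change = value - reference[i]
            if abs(change) >= threshold:
                step.append(change)
                reference[i] = value  # update only when a change is propagated
            else:
                step.append(0.0)  # small change is dropped, creating sparsity
        deltas.append(step)
    return deltas


def sparsity(vectors):
    """Fraction of elements that are exactly zero."""
    total = sum(len(v) for v in vectors)
    zeros = sum(1 for v in vectors for e in v if e == 0.0)
    return zeros / total


sequence = [[1.0, 2.0], [1.05, 2.0], [1.5, 2.01]]
deltas = delta_encode(sequence, threshold=0.1)
print(deltas, sparsity(deltas))
```

When adjacent inputs differ by more than the threshold at most positions, few deltas are zeroed, which is the applicability limit described above.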

Examples

Example 1

[0051] Referring to Figure 1, which shows the first embodiment of the LSTM model optimization method of the present application, the method includes:

[0052] Step S110: Obtain the weight matrix of the pruned LSTM network.

[0053] Specifically, the Recurrent Neural Network (RNN) based on Long Short-Term Memory (LSTM) is a neural network model for processing sequence data. It effectively addresses the vanishing and exploding gradient problems, and it is widely used in the field of intelligent cognition, such as speech recognition, behavior recognition and natural language processing.
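
For orientation, a single time step of a standard LSTM cell can be sketched as follows. This is the textbook formulation, not code from the patent; the `params` layout and the helper names are assumptions chosen for illustration. It shows where the weight matrices, which later steps prune and measure for sparsity, enter the computation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Dense matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    params (hypothetical layout) maps each gate name 'i', 'f', 'o', 'g'
    to a tuple (W, U, b): input weights, recurrent weights, bias.
    """
    pre = {}
    for gate, (W, U, b) in params.items():
        wx = matvec(W, x)        # input contribution
        uh = matvec(U, h_prev)   # recurrent contribution
        pre[gate] = [a + r + bias for a, r, bias in zip(wx, uh, b)]
    i = [sigmoid(z) for z in pre["i"]]    # input gate
    f = [sigmoid(z) for z in pre["f"]]    # forget gate
    o = [sigmoid(z) for z in pre["o"]]    # output gate
    g = [math.tanh(z) for z in pre["g"]]  # candidate cell state
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


zero = ([[0.0]], [[0.0]], [0.0])
params = {"i": zero, "f": zero, "o": zero, "g": zero}
h, c = lstm_step([1.0], [0.0], [2.0], params)
print(h, c)
```

Each step performs eight matrix-vector products, which is why zeros in the weights or in the inputs translate directly into skippable work.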

[0054] Specifically, the pruning operation compresses the LSTM network to reduce its storage and computing costs. Methods for compressing LSTM networks include, but are not limited to, parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation.
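
As one illustration of parameter pruning, a minimal magnitude-based sketch is given below. The source does not say which pruning criterion is used, so magnitude pruning, the function name `magnitude_prune`, and the `target_sparsity` parameter are all assumptions.

```python
def magnitude_prune(weights, target_sparsity):
    """Zero out roughly the smallest-magnitude fraction of weights.

    Assumed criterion: weights whose absolute value is at or below the
    k-th smallest magnitude are set to zero. Ties at the cutoff may
    prune slightly more than the requested fraction.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * target_sparsity)  # number of weights to zero
    if k == 0:
        return [list(row) for row in weights]
    cutoff = flat[k - 1]
    return [[0.0 if abs(w) <= cutoff else w for w in row] for row in weights]


weights = [[0.1, -0.5], [0.9, -0.05]]
pruned = magnitude_prune(weights, target_sparsity=0.5)
print(pruned)
```

The pruned matrix is the kind of input that step S110 obtains.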

[0055] Specifically, the weight matrix is ...

Example 2

[0090] Referring to Figure 6, which shows the second embodiment of the LSTM model optimization method of the present application, the method also includes:

[0091] Step S210: Obtain the weight matrix of the pruned LSTM network.

[0092] Step S220: Obtain the weight sparsity of the weight matrix based on the weight matrix.

[0093] Step S230: Obtain the sparsity of the input sequence.

[0094] Step S240: If the weight sparsity is greater than or equal to the weight sparsity threshold and/or the input sequence sparsity is greater than or equal to the input sequence sparsity threshold, determine the sparse operation mode.

[0095] Step S250: Calculate the input sequence according to the sparse operation mode.

[0096] Step S260: If the weight sparsity is less than the weight sparsity threshold and/or the input sequence sparsity is less than the input sequence sparsity threshold, determine the dense operation mode.

[0097] Step S270: Calculate the input sequence ...
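
Steps S220 to S260 can be sketched as follows. The text leaves "and/or" open and does not give threshold values, so this sketch makes two loud assumptions: sparsity is measured as the fraction of zero elements, and the sparse mode is chosen when either sparsity reaches its threshold, with the dense ("intensive") mode of step S260 chosen only when both fall below. All function names and the threshold values are hypothetical.

```python
def zero_fraction(values):
    """Fraction of elements equal to zero (assumed sparsity measure)."""
    values = list(values)
    return sum(1 for v in values if v == 0) / len(values)


def weight_sparsity(matrix):
    # S220: sparsity of the pruned weight matrix
    return zero_fraction(w for row in matrix for w in row)


def input_sparsity(sequence):
    # S230: sparsity of the input sequence
    return zero_fraction(x for step in sequence for x in step)


def select_mode(w_sparsity, x_sparsity, w_threshold, x_threshold):
    # S240 / S260: "or" interpretation of the source's "and/or" (assumption)
    if w_sparsity >= w_threshold or x_sparsity >= x_threshold:
        return "sparse"
    return "dense"


W = [[0, 1], [0, 0]]
X = [[1, 2], [3, 0]]
mode = select_mode(weight_sparsity(W), input_sparsity(X), 0.5, 0.5)
print(weight_sparsity(W), input_sparsity(X), mode)
```

Under the stricter "and" reading, the condition in `select_mode` would require both sparsities to reach their thresholds; the choice affects how often the sparse path is taken.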

Abstract

The invention discloses an LSTM model optimization method, an accelerator, a device and a medium. The method comprises the following steps: obtaining a weight matrix of a pruned LSTM network; obtaining the weight sparsity of the weight matrix based on the weight matrix; obtaining the sparsity of an input sequence; determining an operation mode based on the weight sparsity and the sparsity of the input sequence; and, if the operation mode is determined to be a sparse operation mode, calculating the input sequence according to the sparse operation mode. The invention aims to improve the energy efficiency ratio of the LSTM hardware accelerator, and is simple to implement and easy to deploy.
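
The abstract does not say how the sparse operation mode computes its result. One common way to exploit both kinds of sparsity is a compressed sparse row (CSR) matrix-vector product that stores only nonzero weights and skips zero inputs; the sketch below assumes that approach purely for illustration, and `to_csr` and `csr_matvec` are hypothetical names.

```python
def to_csr(matrix):
    """Convert a nested-list matrix to CSR arrays (values, col_index, row_ptr)."""
    values, col_index, row_ptr = [], [], [0]
    for row in matrix:
        for j, w in enumerate(row):
            if w != 0:
                values.append(w)
                col_index.append(j)
        row_ptr.append(len(values))  # end of this row in the values array
    return values, col_index, row_ptr


def csr_matvec(csr, x):
    """Multiply a CSR matrix by vector x, skipping zero weights and zero inputs."""
    values, col_index, row_ptr = csr
    out = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            xj = x[col_index[k]]
            if xj != 0:  # input sparsity: no multiply for zero inputs
                acc += values[k] * xj
        out.append(acc)
    return out


W = [[0, 2, 0], [1, 0, 3]]
x = [4, 0, 5]
y = csr_matvec(to_csr(W), x)
print(y)
```

In this example only two of the six possible multiplications are performed, which is the kind of saving that motivates choosing a sparse mode when sparsity is high.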

Description

Technical field

[0001] The invention relates to the field of computer hardware acceleration, in particular to an LSTM model optimization method, accelerator, device and medium.

Background art

[0002] The Recurrent Neural Network (RNN) based on Long Short-Term Memory (LSTM) is a neural network model for processing sequence data. It is widely used in cognitive fields such as speech recognition, behavior recognition and natural language processing. In actual engineering practice, however, it faces many problems. Traditional computing platforms cannot handle the large volume of computation that LSTM requires. In embedded applications, especially in areas with extremely strict latency requirements such as autonomous driving, the LSTM model itself has a huge number of parameters, as well as a large amount of training data and inference test data. As a result, the computational complexity is high and the power consumption of the computing platform is also very lar...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06N3/04, G06N3/063, G06N3/08

CPC: G06N3/08, G06N3/063, G06N3/044

Inventors: 宋朝忠, 李小莲, 连帅军

Owner: SHENZHEN ECHIEV AUTONOMOUS DRIVING TECH CO LTD
