Neural network acceleration method and device

A neural network and multiplier technology, applied to the field of neural network acceleration, that addresses problems such as the inability to process differently quantized neural networks at the same time and the inability to store index values efficiently, achieving the effect of handling mixed quantization bit widths.

Active Publication Date: 2021-02-05
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

Current sparse neural network accelerators cannot handle neural networks with different quantization modes simultaneously.
The core technical problem is that sparse network processing requires an index value for each piece of data, and storing neural networks that use different quantization modes prevents these index values from being stored efficiently.

Method used



Examples


Embodiment Construction

[0022] In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0023] In order to overcome the above-mentioned problems in the prior art, an embodiment of the present invention provides a neural network acceleration method. The inventive idea is: treat the feature map output by the neural network each time as a three-dimensional feature, divide the feature map into blocks, and obtain several block...
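The blocking step can be pictured with a minimal sketch (an illustrative assumption, not the patent's reference implementation; the function name block_feature_map and the channels_per_block parameter are hypothetical, and how the block size follows from the layer's quantization mode is not fixed here): a (C, H, W) feature map is split into channel blocks, and pixels in different channels at the same spatial position of a block share a single index.

```python
# Illustrative sketch only: split a (C, H, W) feature map into channel blocks
# so that pixels in different channels at the same (y, x) position of a block
# can share one index value. channels_per_block is a hypothetical parameter.
import numpy as np

def block_feature_map(fmap: np.ndarray, channels_per_block: int):
    c, h, w = fmap.shape
    assert c % channels_per_block == 0, "sketch assumes an even channel split"
    # Group consecutive channels into blocks: (num_blocks, channels_per_block, H, W).
    blocks = fmap.reshape(c // channels_per_block, channels_per_block, h, w)
    # One shared index per spatial position inside a block, regardless of channel.
    shared_indices = [(y, x) for y in range(h) for x in range(w)]
    return blocks, shared_indices

# Example: a 16-channel 8x8 feature map split into blocks of 4 channels each.
fmap = np.random.randn(16, 8, 8).astype(np.float32)
blocks, shared_indices = block_feature_map(fmap, channels_per_block=4)
print(blocks.shape)         # (4, 4, 8, 8)
print(len(shared_indices))  # 64 shared spatial index positions per block
```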



Abstract

Embodiments of the present invention provide a neural network acceleration method and device. The method includes: for any layer in the neural network, blocking the feature map input to that layer according to the layer's quantization mode to obtain several pieces of block data, and assigning the same index value to pixels that lie in different channels but at the same position within the block data; calculating the sparsity of the block data, discarding block data that is entirely zero, and, for the remaining block data, determining the corresponding sparsity type from its sparsity and a preset threshold, then sparse-coding the remaining block data according to that sparsity type. Implementing the invention ensures that the number of indexes at each pixel position does not increase exponentially with different quantization modes, and solves the problem of encoding neural networks in which multiple sparsities and multiple quantization bit widths are mixed.
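A minimal sketch of this flow, assuming a hypothetical threshold value and a simple value-plus-index encoding (the abstract does not spell out the exact encoding), might look like this:

```python
# Illustrative sketch of the abstract's flow: compute per-block sparsity,
# discard all-zero blocks, classify remaining blocks against a preset
# threshold, and sparse-code them. The 0.5 threshold and the encoding format
# are assumptions for illustration.
import numpy as np

def encode_blocks(blocks: np.ndarray, sparse_threshold: float = 0.5):
    """blocks has shape (num_blocks, channels_per_block, H, W)."""
    encoded = []
    for b, block in enumerate(blocks):
        sparsity = 1.0 - np.count_nonzero(block) / block.size
        if sparsity == 1.0:
            continue  # block data that is all zero is discarded outright
        if sparsity >= sparse_threshold:
            # "Sparse" type: store only the nonzero values and their indices.
            idx = np.flatnonzero(block)
            encoded.append(("sparse", b, idx, block.ravel()[idx]))
        else:
            # "Dense" type: keep the block as-is, with no per-value indices.
            encoded.append(("dense", b, block))
    return encoded
```

The point of distinguishing sparsity types in such a scheme is that dense blocks avoid index overhead entirely, while sparse blocks pay for indices only on their nonzero values.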

Description

Technical field
[0001] The present invention relates to the technical field of accelerator design, and more specifically, to a neural network acceleration method.
Background technique
[0002] Because current neural networks use the ReLU activation function, a large fraction of the feature data (feature map) is sparse, and training the neural network with methods such as pruning also makes a large fraction of the weight data sparse. Effectively exploiting these sparsities can greatly improve the processing efficiency of neural network accelerators. At the same time, when hardware processes a neural network, fixed-point arithmetic brings large energy savings compared with floating-point arithmetic, so fixed-point processing of neural networks has become common practice in energy-efficient accelerators. At present, much of the literature focuses on exploiting the sparsity of neural networks and fixed-point quantization....
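As a quick illustration of the ReLU-induced sparsity mentioned above (a minimal, self-contained example, not taken from the patent):

```python
# ReLU sets every negative pre-activation to exactly zero, so roughly half of
# a zero-mean random feature vector becomes zero after activation.
import numpy as np

pre_activation = np.random.randn(1024)
feature = np.maximum(pre_activation, 0.0)  # ReLU
sparsity = 1.0 - np.count_nonzero(feature) / feature.size
print(f"fraction of zeros after ReLU: {sparsity:.2f}")  # typically around 0.5
```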

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06N3/04
CPC: G06N3/04
Inventors: 刘勇攀 (Liu Yongpan), 袁哲 (Yuan Zhe), 王靖宇 (Wang Jingyu), 岳金山 (Yue Jinshan), 杨一雄 (Yang Yixiong), 李学清 (Li Xueqing), 杨华中 (Yang Huazhong)
Owner: TSINGHUA UNIV