
Convolution operation structure for reducing data migration and power consumption of deep neural network

A deep neural network and convolution operation technology, applied in the field of convolution operation structures, which addresses the problems of existing schemes that exploit only a single form of data reuse, do not consider weight reusability, and have PE-array storage structures with poor flexibility and adaptability. It achieves the effects of high versatility, a simple control structure, and reduced dynamic power consumption.

Active Publication Date: 2020-06-12
XIAN MICROELECTRONICS TECH INST

AI Technical Summary

Problems solved by technology

(1) The weight-stationary scheme fixes the weights in registers inside each PE, avoiding the delay and power-consumption overhead caused by repeated weight access. However, this solution requires frequent migration of partial sums between PEs, and the entire PE array can output only one convolution result at a time.
(2) The partial-sum-stationary scheme differs from the result-migration flow of the weight-stationary scheme: each convolution result is fixed to be output by a single PE unit, and feature data can be reused by horizontally and vertically adjacent PE units. This scheme mainly reduces the power consumption of the multiplication part and of partial results flowing between different PEs. However, the proposed structure does not achieve maximum reuse of feature data and does not consider the reusability of weights at the PE level.
(3) The row-stationary scheme requires each PE to read one row of the input feature map and one row of the convolution kernel. Unlike the previous two schemes, which compute the output feature map data one element at a time, here each PE generates intermediate results for multiple output elements in parallel per cycle. However, because the feature map sizes of different applications vary greatly, and the architecture requires complete feature map information to be read in by row, the storage structure in the PE array has poor flexibility and adaptability.
[0004] The purpose of the above different hardware acceleration engine schemes is to reduce repeated data access and to compress the huge power-consumption overhead caused by the multiply-accumulate operations of the theoretical calculation. However, each of them exploits reusability from only a single factor, such as weights or input feature maps, and thus fails to further improve computing performance.
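To make the reuse argument behind these dataflows concrete, the following sketch (not from the patent; sizes are hypothetical) counts how many times convolution weights would be fetched from memory under a naive schedule versus a weight-stationary schedule, where each weight is loaded into a PE register once:

```python
# Illustrative sketch: weight-fetch counts for a naive convolution schedule
# versus a weight-stationary schedule. All sizes are hypothetical.

H, W = 8, 8                     # input feature map size (ifmap)
K = 3                           # convolution kernel size
OH, OW = H - K + 1, W - K + 1   # output size, stride 1, no padding

# Naive schedule: every output pixel re-reads all K*K weights from memory.
naive_weight_fetches = OH * OW * K * K

# Weight-stationary schedule: each weight is loaded into a PE register once
# and then reused for every output pixel it contributes to.
ws_weight_fetches = K * K

print(naive_weight_fetches, ws_weight_fetches)  # 324 9
```

The gap grows with the output size, which is why fixing weights in PE registers saves so much memory traffic; the cost, as noted above, is the partial-sum migration between PEs.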

Method used




Embodiment Construction

[0030] The present invention provides a convolution operation structure that reduces the data migration and power consumption of deep neural networks. It fully analyzes the current mainstream convolution data-reuse schemes and proposes a data-reuse scheme that combines factors such as weights and input feature maps. The main purpose of this scheme is to reduce the number of times different data elements move between PE arrays in the space and time dimensions, thereby reducing the number of accesses to lower-level memory and effectively reducing the dynamic power consumption of the overall computation. The present invention gives a qualitative description of the calculation process of the proposed method, provides a quantitative evaluation formula for the feature data access compression rate after adopting the method to verify the effectiveness of the scheme, and finally provides a specific implementation structure a...
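The patent's own compression-rate formula is not reproduced in this extract, but the general idea of a "feature data access compression rate" can be illustrated with a hypothetical calculation: compare the number of input-feature-map reads a naive sliding-window convolution performs against the ideal case where each feature element is read once.

```python
# Hypothetical illustration of a feature-data access compression rate.
# The patent defines its own evaluation formula, which is NOT reproduced
# here; this only shows the general shape of such a metric.

H, W, K = 8, 8, 3               # ifmap size and kernel size (hypothetical)
OH = OW = H - K + 1             # output size, stride 1, no padding

naive_ifmap_reads = OH * OW * K * K   # every window re-reads its pixels
unique_ifmap_reads = H * W            # ideal full reuse: each pixel read once
compression_rate = unique_ifmap_reads / naive_ifmap_reads

print(round(compression_rate, 3))  # 0.198
```

A smaller ratio means more of the naive memory traffic has been eliminated by reuse, which is the quantity the dynamic-power argument in the paragraph above turns on.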



Abstract

The invention discloses a convolution operation structure for reducing the data migration and power consumption of a deep neural network. The convolution operation structure comprises a multiplier and an adder. The input end of the multiplier is connected with multiplexer MUX1 and multiplexer MUX2 respectively; the output end of the multiplier and the output end of multiplexer MUX1 are connected with the input end of the adder through multiplexer MUX3; the input end of the adder is also connected with the input end of multiplexer MUX4. The output ends of multiplexer MUX1, multiplexer MUX2, the multiplier, multiplexer MUX3, multiplexer MUX4 and the adder are respectively connected with register reg1, register reg2, the adder and register reg3; the output end of the adder is connected with register reg2, and the output end of register reg2 is connected with the input end of multiplexer MUX4, which is used for achieving the multiply-accumulate operation of the convolution operation. The convolution operation structure is suitable for all current convolutional neural network models, effectively reduces the dynamic power consumption of global calculation on the premise of meeting the data parallelism degree to the maximum extent, has a simple control structure, and has very high universality.
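The datapath described in the abstract ultimately implements one multiply-accumulate step per cycle, with a register (reg2) feeding the running partial sum back into the adder. The behavioral sketch below models only that accumulate loop; the multiplexer select logic and register naming follow the abstract loosely and are assumed for illustration, not taken from the patent's claims.

```python
# Behavioral sketch of the multiply-accumulate loop of the described PE.
# MUX control and exact wiring are simplified assumptions for illustration.

class PE:
    def __init__(self):
        self.reg2 = 0  # partial-sum register, fed back to the adder via MUX4

    def mac(self, feature, weight):
        """One multiply-accumulate cycle: reg2 += feature * weight."""
        product = feature * weight       # multiplier (operands via MUX1/MUX2)
        self.reg2 += product             # adder: MUX3 selects the product,
                                         # MUX4 feeds back reg2
        return self.reg2


# Accumulate a 1x3 dot product, one MAC per "cycle".
pe = PE()
acc = 0
for f, w in zip([1, 2, 3], [4, 5, 6]):
    acc = pe.mac(f, w)
print(acc)  # 1*4 + 2*5 + 3*6 = 32
```

This feedback-register organization is what lets a single PE produce a complete convolution output without writing intermediate partial sums back to memory.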

Description

Technical Field [0001] The invention belongs to the technical field of integrated circuit design and special-purpose hardware accelerators, and in particular relates to a convolution operation structure for reducing the data migration and power consumption of a deep neural network. Background Technique [0002] In recent years, with important breakthroughs in speech and image recognition, deep neural networks (DNNs) have become the basis of many modern artificial intelligence applications. The superior performance of DNNs benefits from their ability to extract high-level features from large amounts of data to obtain effective representations of similar data. A common form of DNN is the Convolutional Neural Network (CNN), the main part of which is composed of multiple convolutional layers, each of which is a higher-dimensional abstraction of the input feature map (ifmap). Currently, the number of convolutional layers in CNNs has evolved from the original 2-layer LeNet to the current 53-lay...
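For reference, the convolutional-layer operation that the background describes, and that the patented structure accelerates, can be sketched in a few lines of pure Python (stride 1, no padding, single channel; sizes are hypothetical):

```python
# Minimal 2D convolution over an input feature map (ifmap): the core
# multiply-accumulate workload of a convolutional layer. Single channel,
# stride 1, no padding; a sketch, not the patent's implementation.

def conv2d(ifmap, kernel):
    K = len(kernel)
    OH = len(ifmap) - K + 1
    OW = len(ifmap[0]) - K + 1
    out = [[0] * OW for _ in range(OH)]
    for i in range(OH):
        for j in range(OW):
            # Each output element is a K*K multiply-accumulate.
            out[i][j] = sum(ifmap[i + r][j + c] * kernel[r][c]
                            for r in range(K) for c in range(K))
    return out


ifmap = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]          # 2x2 kernel with a diagonal of ones
print(conv2d(ifmap, kernel))  # [[6, 8], [12, 14]]
```

Every output element costs K*K multiplications and overlapping windows re-touch the same ifmap elements, which is exactly the repeated data access that the dataflow schemes discussed earlier try to eliminate.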

Claims


Application Information

IPC(8): G06N 3/063, G06N 3/04
CPC: G06N 3/063, G06N 3/045, Y02D 10/00
Inventors: 娄冕, 苏若皓, 杨靓, 崔媛媛, 张海金, 郭娜娜, 刘思源, 黄九余, 田超
Owner XIAN MICROELECTRONICS TECH INST