Convolution operation structure for reducing data migration and power consumption of deep neural network

A convolution operation structure for deep neural networks that addresses the shortcomings of existing schemes: exploiting only a single kind of data reuse, ignoring weight reusability, and the poor flexibility and adaptability of the PE array storage structure. It offers high versatility, a simple control structure, and reduced dynamic power consumption.

Active Publication Date: 2020-06-12

XIAN MICROELECTRONICS TECH INST

Problems solved by technology

(1) A weight-fixed scheme holds the weights in registers inside each PE, avoiding the latency and power overhead of weight accesses. However, it requires frequent migration of partial sums between PEs, and the entire PE array can output only a single convolution result.

(2) A partial-sum-fixed scheme. Unlike the weight-fixed scheme, in which results flow between PEs, this scheme requires each convolution result to be produced by one fixed PE, and its feature data can be reused by horizontally and vertically adjacent PEs. It mainly reduces the power consumed by the multiplications and by results flowing between different PEs. However, it does not achieve maximum reuse of feature data and does not consider weight reusability at the PE level.

(3) A row-fixed scheme, which requires each PE to read one row of the input feature map and one row of the convolution kernel. Unlike the previous two schemes, which compute output feature map data one element at a time, here each PE produces intermediate results for multiple output feature data in parallel every cycle. However, because feature map sizes differ greatly across applications and the architecture requires the complete feature map information to be read in row by row, the storage structure in the PE array has poor flexibility and adaptability.

[0004] The purpose of these hardware acceleration schemes is to reduce repeated data accesses and to cut the large power overhead of the multiply-accumulate operations that the computation theoretically requires. However, each mines reusability from only one factor, such as the weights or the input feature maps, and so fails to improve computing performance further.
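The trade-off between the weight-fixed and partial-sum-fixed schemes described above can be illustrated with a minimal Python sketch on a 1-D convolution. The function names and the counters are hypothetical, introduced only for illustration; they model loop ordering in software and are not taken from the patent or from any particular accelerator.

```python
def conv1d_weight_fixed(x, w):
    """Weight-fixed order: each weight is loaded once and held,
    while every partial sum is revisited once per weight."""
    n_out = len(x) - len(w) + 1
    out = [0] * n_out
    weight_loads = 0
    psum_updates = 0
    for k, wk in enumerate(w):          # wk stays "in a register" for the inner loop
        weight_loads += 1
        for i in range(n_out):
            out[i] += wk * x[i + k]     # partial sum is read, updated, written back
            psum_updates += 1
    return out, weight_loads, psum_updates


def conv1d_psum_fixed(x, w):
    """Partial-sum-fixed order: each output accumulates locally and is
    written once, while the weights are re-fetched for every output."""
    n_out = len(x) - len(w) + 1
    out = []
    weight_loads = 0
    psum_writes = 0
    for i in range(n_out):
        acc = 0                         # partial sum stays local until complete
        for k, wk in enumerate(w):
            weight_loads += 1
            acc += wk * x[i + k]
        out.append(acc)
        psum_writes += 1
    return out, weight_loads, psum_writes


x = [1, 2, 3, 4, 5]
w = [1, 0, -1]
print(conv1d_weight_fixed(x, w))
print(conv1d_psum_fixed(x, w))
```

Both orderings compute the same outputs; only the traffic differs. In this toy case the weight-fixed order loads 3 weights but touches partial sums 9 times, while the partial-sum-fixed order writes 3 results but fetches weights 9 times.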

Embodiment Construction

[0030] The present invention provides a convolution operation structure that reduces the data migration and power consumption of deep neural networks. It analyzes the current mainstream data reuse schemes for convolution calculation and proposes a data reuse scheme that combines elements such as weights and input feature maps. The main purpose of this scheme is to reduce the number of times different elements move between PE arrays in the space and time dimensions, which reduces the number of accesses to low-level memory and so effectively reduces the dynamic power consumption of the overall computation. The present invention gives a qualitative description of the calculation process of the proposed method, gives a quantitative evaluation formula for the feature data access compression rate achieved by the method in order to verify the effectiveness of the scheme, and finally provides a specific implementation structure a...
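The patent's own evaluation formula is not reproduced in this excerpt. As a rough illustration of why reuse reduces feature data accesses, the following Python sketch compares two idealized extremes for a single-channel 2-D convolution: no reuse (every multiplication fetches its feature value again) and full reuse (every input element is fetched exactly once). The function names, the single-channel setting, and the no-padding assumption are all hypothetical simplifications, not the patent's formula.

```python
def feature_reads_without_reuse(h, w, k, stride=1):
    """Feature fetches if every multiply re-reads its input element
    (h x w input, k x k kernel, no padding)."""
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    return h_out * w_out * k * k


def feature_reads_with_full_reuse(h, w):
    """Feature fetches if each input element is read exactly once."""
    return h * w


def reduction_ratio(h, w, k, stride=1):
    """How many times fewer feature fetches the ideal reuse case needs."""
    return feature_reads_without_reuse(h, w, k, stride) / feature_reads_with_full_reuse(h, w)


print(feature_reads_without_reuse(5, 5, 3))   # 3 * 3 outputs, 9 reads each
print(reduction_ratio(5, 5, 3))
```

Real accelerators fall between these two extremes, depending on how much of the feature map the PE array can hold and share.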

Abstract

The invention discloses a convolution operation structure for reducing data migration and power consumption of a deep neural network. The convolution operation structure comprises a multiplier and an adder. The input end of the multiplier is connected with the multiplexer MUX1 and the multiplexer MUX2 respectively; the output end of the multiplier and the output end of the multiplexer MUX1 are connected with the input end of the adder through the multiplexer MUX3; the input end of the adder is also connected with the input end of a multiplexer MUX4; wherein the output ends of the multiplexer MUX1, the multiplexer MUX2, the multiplier, the multiplexer MUX3, the multiplexer MUX4 and the adder are respectively connected with the register reg1, the register reg2, the adder and the register reg3; the output end of the adder is connected with the register reg2, and the output end of the register reg2 is connected with the input end of the multiplexer MUX4, to implement the multiply-accumulate operation of the convolution. The convolution operation structure is suitable for all current convolutional neural network models, effectively reduces the dynamic power consumption of the overall computation while maximizing data parallelism, has a simple control structure, and is highly versatile.
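The multiply-accumulate behavior that the abstract attributes to the processing element can be sketched as a small behavioral model in Python. This is an assumption-laden simplification: the class `MacPE`, its register names, and the `clear` flag are hypothetical, and the multiplexer selections are abstracted into "keep the held operand or take a new one" rather than mirroring the exact MUX1 to MUX4 wiring.

```python
class MacPE:
    """Behavioral sketch of one multiply-accumulate processing element.

    Hypothetical model: operand registers hold their values between steps,
    so a weight or a feature value can be reused without being re-fetched.
    """

    def __init__(self):
        self.reg_feature = 0   # held input feature value
        self.reg_weight = 0    # held weight value
        self.reg_acc = 0       # running partial sum

    def step(self, feature=None, weight=None, clear=False):
        # Passing None models a multiplexer selecting the previously held operand.
        if feature is not None:
            self.reg_feature = feature
        if weight is not None:
            self.reg_weight = weight
        # clear=True starts a new accumulation instead of extending the old one.
        base = 0 if clear else self.reg_acc
        self.reg_acc = base + self.reg_feature * self.reg_weight
        return self.reg_acc


pe = MacPE()
pe.step(feature=2, weight=3, clear=True)  # starts a new sum: 2 * 3
pe.step(feature=4)                        # reuses the held weight
print(pe.step(weight=5))                  # reuses the held feature value
```

Holding operands in local registers is what lets a step proceed with only one new value, or none, fetched from outside the PE.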

Description

Technical field

[0001] The invention belongs to the technical field of integrated circuit design and special-purpose hardware accelerators, and in particular relates to a convolution operation structure for reducing data migration and power consumption of a deep neural network.

Background

[0002] In recent years, with important breakthroughs in speech and image recognition, deep neural networks (DNNs) have become the basis of many modern artificial intelligence applications. The superior performance of DNNs comes from their ability to extract high-level features from large amounts of data to obtain effective representations of similar data. A common form of DNN is the convolutional neural network (CNN), the main part of which is composed of multiple convolutional layers, each of which is a higher-dimensional abstraction of the input feature map (ifmap). The number of convolutional layers in CNNs has evolved from the original 2-layer LeNet to the current 53-lay...

Application Information

IPC(8): G06N3/063, G06N3/04

CPC: G06N3/063, G06N3/045, Y02D10/00

Inventor: 娄冕, 苏若皓, 杨靓, 崔媛媛, 张海金, 郭娜娜, 刘思源, 黄九余, 田超

Owner XIAN MICROELECTRONICS TECH INST
