
Convolutional neural network hardware accelerator for solidifying full network layer on reconfigurable platform

A convolutional neural network hardware accelerator technology, applied in the field of convolutional neural network hardware acceleration. It addresses problems such as reduced computing performance, insufficient utilization of on-chip computing power, and unbalanced utilization of off-chip memory access bandwidth, with the effect of alleviating contention for parallel computing resources.

Pending Publication Date: 2020-12-22
UNIV OF SCI & TECH OF CHINA
Cites: 0 · Cited by: 22

AI Technical Summary

Problems solved by technology

Under a single-core hardware structure, different types of layers can only be accelerated by time-division multiplexing. As a result, the overall off-chip memory access bandwidth is under-utilized during convolutional-layer computation, the on-chip computing capability cannot be fully exploited during fully-connected-layer computation because of memory access bottlenecks, and the utilization of off-chip memory access bandwidth is highly unbalanced across the acceleration process.
Taking Caffeine's deployment of VGG16 as an example again, the fully connected layers, which account for less than 1% of the total computation, reduce overall computing performance by about 20%, seriously degrading hardware acceleration efficiency.



Examples


Embodiment

[0060] The invention targets efficient hardware deployment of convolutional neural networks. It combines reconfigurable computing technology with a heterogeneous multi-core architecture and systematically proposes a heterogeneous multi-core accelerator structure and acceleration method on a reconfigurable platform, effectively alleviating the mismatch between software and hardware features in the hardware acceleration of convolutional neural networks. The invention proposes a heterogeneous multi-core accelerator deployment method that solidifies the entire network layer on chip. By realizing an end-to-end mapping between layer-wise computation and the hardware structure, the adaptability between software and hardware features is improved, the waste of large amounts of hardware resources seen in traditional convolutional neural network accelerator designs is avoided, and the utilization efficiency of computing resources is improved.
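The deployment idea above can be illustrated with a minimal software sketch. All names here (`solidify_on_chip`, the layer list, the core count) are hypothetical stand-ins, not identifiers from the patent: every network layer is assigned its own dedicated compute core, instead of time-multiplexing one core across all layer types.

```python
# Illustrative sketch only: assigns each network layer its own on-chip
# compute core (a one-to-one layer-to-core mapping), mirroring the
# "solidify the full network layer on chip" deployment described above.

def solidify_on_chip(layers, num_cores):
    """Return a one-to-one mapping from layer name to core index.

    Raises if the chip does not have enough cores to host every layer
    simultaneously (the precondition for full on-chip solidification).
    """
    if len(layers) > num_cores:
        raise ValueError("not enough cores to solidify every layer on chip")
    return {layer: core for core, layer in enumerate(layers)}

# A hypothetical small network; with 8 cores, every layer gets its own core.
vgg_like = ["conv1", "conv2", "conv3", "fc1", "fc2"]
mapping = solidify_on_chip(vgg_like, num_cores=8)
```

Because each layer owns its core, convolutional and fully connected layers run concurrently rather than alternating, which is what balances compute and memory bandwidth utilization across the whole network.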

[0061] The overall architecture of the on-chip solid...



Abstract

The invention discloses a convolutional neural network hardware accelerator for solidifying a full network layer on a reconfigurable platform. The accelerator comprises: a control module, which coordinates and controls the acceleration process, including the initialization and synchronization of the other on-chip modules, and which initiates the interaction of different types of data between each calculation core and an off-chip memory; a data transmission module, which comprises a memory controller and a plurality of DMAs and handles data interaction between each on-chip data cache and the off-chip memory; and a calculation module, which comprises a plurality of calculation cores in one-to-one correspondence with the different network layers of the convolutional neural network. Each calculation core serves as one stage of a pipeline, all the calculation cores together form a complete coarse-grained pipeline structure, and each calculation core internally contains a fine-grained computing pipeline. By implementing an end-to-end mapping between layer-wise computation and the hardware structure, the adaptability between software and hardware features is improved, and the utilization efficiency of computing resources is improved.
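The coarse-grained pipeline described in the abstract can be sketched in software. This is a hypothetical model, not the patent's implementation: each thread stands in for one per-layer compute core, queues stand in for the on-chip data caches between stages, and successive inputs overlap in different stages just as successive images would overlap in the hardware pipeline.

```python
# Illustrative sketch of the coarse-grained pipeline: each "core" is a
# thread processing one layer's computation; queues model the on-chip
# buffers between pipeline stages. Stage functions are arbitrary stand-ins.
from queue import Queue
from threading import Thread

def make_core(name, fn, q_in, q_out):
    """One pipeline stage: read from q_in, compute, write to q_out."""
    def run():
        while True:
            item = q_in.get()
            if item is None:          # poison pill: propagate shutdown
                q_out.put(None)
                break
            q_out.put(fn(item))
    return Thread(target=run, name=name)

# Hypothetical per-layer cores (stand-ins for conv / pool / fc hardware).
stages = [
    ("conv_core", lambda x: x * 2),
    ("pool_core", lambda x: x + 1),
    ("fc_core",   lambda x: x * 10),
]

queues = [Queue() for _ in range(len(stages) + 1)]
threads = [make_core(name, fn, queues[i], queues[i + 1])
           for i, (name, fn) in enumerate(stages)]
for t in threads:
    t.start()

for img in [1, 2, 3]:   # feed inputs; they overlap across pipeline stages
    queues[0].put(img)
queues[0].put(None)

results = []
while True:
    out = queues[-1].get()
    if out is None:
        break
    results.append(out)
for t in threads:
    t.join()
```

In the hardware, each stage would additionally contain its own fine-grained computing pipeline, so throughput is set by the slowest per-layer core rather than by the sum of all layer latencies.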

Description

Technical field: [0001] The invention belongs to the technical field of convolutional neural network hardware acceleration, and in particular relates to a convolutional neural network hardware accelerator and acceleration method for solidifying the entire network layer on a reconfigurable platform. Background technology: [0002] As their learning and classification capabilities have grown, the deployment scale of convolutional neural networks in the cloud and on terminals has increased year by year. To solve more abstract and complex classification and learning problems in the real world, convolutional neural networks keep growing in scale, and their computational complexity and data volume are increasing dramatically. The Google Cat network system in 2012 had about 1 billion neuron connections. The VGG19 network model that appeared in 2014 has 140 million neuron connections, and one feedforward pass requires nearly 40 billion operations. On the other hand...

Claims


Application Information

IPC(8): G06N 3/063; G06N 3/04; G06N 3/08
CPC: G06N 3/063; G06N 3/082; G06N 3/084; G06N 3/086; G06N 3/045; G06N 3/044
Inventor: 宫磊, 王超, 朱宗卫, 李曦, 陈香兰, 周学海
Owner UNIV OF SCI & TECH OF CHINA