Hardware circuit design and method of a data loading device combined with a main memory for accelerating deep convolutional neural network calculation

The invention relates to main memory and deep convolution technology, applied to biological neural network models, neural architectures, physical implementations, and so on. It addresses problems such as accelerator resources wasted by poor data reuse, overly complex data segmentation and arrangement methods, and excessive coupling with the central processing unit. The design reduces space complexity, improves the utilization of computing resources and memory bandwidth, and simplifies connection complexity.

Status: Inactive
Publication Date: 2020-10-16
Applicant: 北京芯启科技有限公司


Problems solved by technology

In the inventions described in Patent Documents 1 and 3, accelerator resources are wasted because layer sizes and data reusability differ across neural network algorithm layers, so these designs must either cooperate with other heterogeneous processors to resolve data dependency problems or rely on costly deep sub-micron advanced process technology to improve performance. The storage method described in Patent Document 3 must back up additional data, which makes the buffer size too large. The method of Patent Document 2 adopts reconfigurable computing ideas and pays close attention to avoiding wasted resources, but its data segmentation and arrangement methods are very complicated, and a compiler must be deployed together with the computing task to assist applications. The invention of Patent Document 4 is too tightly coupled with the design of the central processing unit, so its design and implementation complexity is too high.



Embodiment Construction

[0034] The present invention will be described in further detail below with reference to the accompanying drawings and examples.

[0035] Figure 1 is a structural diagram of the data loading device of the present invention combined with a main memory; the data loading device 2 includes:

[0036] A tensor-type input cache random access controller 205, which performs fusion, arrangement, and data format conversion on input data from the main memory 6 and/or other memories and then distributes the result to the partitioned areas of the input cache unit 201; its working mode can be reconfigured by software;
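As a rough software analogue of the fusion and distribution step above, the sketch below splits an input tensor along its channel axis and hands each slice, flattened into a row-major page layout, to one partition of the input cache. The CHW layout, the partition count, and the function names are illustrative assumptions; the patent describes a hardware circuit, and this is only a minimal model of its behavior.

```python
import numpy as np

# Minimal sketch (assumed behavior): split a CHW input tensor along the
# channel axis and distribute one flattened slice per cache partition.
def distribute_to_partitions(tensor, num_partitions):
    """Split a CHW tensor into channel groups, one per cache partition."""
    channels = tensor.shape[0]
    per_part = (channels + num_partitions - 1) // num_partitions
    partitions = []
    for p in range(num_partitions):
        lo, hi = p * per_part, min((p + 1) * per_part, channels)
        # "Format conversion": flatten the slice into the row-major
        # page layout the cache is assumed to expect.
        partitions.append(np.ascontiguousarray(tensor[lo:hi]).ravel())
    return partitions

x = np.arange(4 * 6 * 6, dtype=np.int16).reshape(4, 6, 6)  # toy CHW input
parts = distribute_to_partitions(x, num_partitions=2)
print([p.shape for p in parts])  # [(72,), (72,)]
```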

[0037] A divisible input cache unit 201, which is the local cache of the data loading device described in the present invention. It is composed of multiple storage pages; its design and storage method correspond to the dimensions of the input data and to the parallel input register array 202, and it supports the data format changes caused by software reconfiguration;
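The paragraph above states that the storage pages correspond to the dimensions of the input data. A minimal sketch of one such correspondence, assuming a row-major mapping and a hypothetical page size of 256 words (neither value comes from the patent):

```python
# Hypothetical page-based addressing for the divisible input cache:
# map a tensor coordinate (c, h, w) to a (page, offset) pair.
PAGE_WORDS = 256  # assumed number of words per storage page

def cache_address(c, h, w, height, width):
    """Row-major linear address of element (c, h, w), split into a
    storage-page index and an offset within that page."""
    linear = (c * height + h) * width + w
    return divmod(linear, PAGE_WORDS)  # (page, offset)

print(cache_address(c=3, h=5, w=2, height=6, width=6))  # (0, 140)
```

Because consecutive elements of one input row land in the same page, a page can be read out as a burst, which is the usual motivation for paged cache layouts.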

[0038] The tensor data loading de...



Abstract

The invention relates to a hardware circuit design and method for a data loading device combined with a main memory, used to accelerate deep convolutional neural network calculation. The device has a specifically designed cache structure comprising: an input cache and its control, in which a macro-block segmentation method is applied to input from a main memory and/or other memories to achieve regional data sharing and tensor data fusion and distribution; a parallel input register array that converts the data segments read from the cache; and a tensor-type data loading unit connected between the output of the input cache and the input of the parallel input register array. The design simplifies the address decoding circuit and saves area and power consumption without compromising data bandwidth. The hardware device and data processing method of the invention include a transformation method, a macro-block segmentation method, and an addressing method for the input data, so that limited hardware resources can meet the requirements of algorithm acceleration while the complexity of address management is reduced.
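The abstract names a macro-block segmentation method that enables regional data sharing. The sketch below shows one plausible reading of such segmentation: tiling a feature map into blocks, each extended by a halo of boundary elements shared with its neighbours. The block size, halo width, and tiling order are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

# Illustrative macro-block segmentation: tile a 2-D feature map into
# blocks, each extended by a halo so adjacent blocks share boundary data.
def macro_blocks(fmap, block=4, halo=1):
    """Yield ((y, x), tile) pairs; halo rows/columns overlap neighbours."""
    H, W = fmap.shape
    for y in range(0, H, block):
        for x in range(0, W, block):
            y0, x0 = max(y - halo, 0), max(x - halo, 0)
            y1, x1 = min(y + block + halo, H), min(x + block + halo, W)
            yield (y, x), fmap[y0:y1, x0:x1]

fmap = np.arange(64).reshape(8, 8)
tiles = list(macro_blocks(fmap))
print(len(tiles), tiles[0][1].shape)  # 4 tiles; the top-left tile is 5x5
```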

Description

Technical Field

[0001] The invention belongs to the fields of computer hardware, hardware acceleration for the deployment of artificial neural network algorithms, and digital integrated circuit design, and specifically relates to a method for designing the architecture of the input system of a deep convolutional neural network hardware acceleration chip, and a device thereof.

Background Technique

[0002] A deep convolutional neural network algorithm is composed of multiple layers of specific neuron algorithm layers and hidden layers, mainly convolutional layers, whose main operator is the convolution of matrices or vectors. The main characteristics of this computing task are that the amount of input data is large, the input data is coupled with spatial feature information, the data used by each convolution often overlaps data that has already been used, and the input data is usually in tensor format. The required calculation data is extracted ...
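The background notes that the data used by each convolution often overlaps data that has already been used. A back-of-the-envelope sketch of that reuse, assuming a K x K window sliding horizontally with stride S:

```python
# Two horizontally adjacent K x K convolution windows with stride S
# share K * (K - S) input elements, which is the reuse a data loading
# device can exploit instead of refetching from main memory.
def shared_elements(K, S):
    return K * max(K - S, 0)

for K, S in [(3, 1), (5, 1), (3, 2)]:
    print(f"K={K}, S={S}: {shared_elements(K, S)}/{K * K} elements reused")
# K=3, S=1: 6/9 elements reused
# K=5, S=1: 20/25 elements reused
# K=3, S=2: 3/9 elements reused
```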

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/04; G06N3/063
CPC: G06N3/063; G06N3/045
Inventors: 林森, 何一波, 李珏
Owner: 北京芯启科技有限公司