Regional parallel loading device and method for tensor data

A regionalization and data technology, applied in the fields of electrical digital data processing, special data processing applications, and design optimization/simulation. It addresses the problem of excessive design and implementation complexity, with the effects of improving practical application efficiency, reducing space complexity, and simplifying the software layout of data.

Pending Publication Date: 2021-07-09
苏州芯启微电子科技有限公司

AI Technical Summary

Problems solved by technology

The prior-art invention of patent 4 is too tightly coupled with the design of the central processing unit, and its design and implementation complexity is too high.



Embodiment Construction

[0034] The present invention will be described in further detail below with reference to the accompanying drawings and examples.

[0035] Figure 1 is a data flow and device structure diagram of the regional parallel data loading method in a deep convolutional neural network hardware accelerator according to the present invention. The device includes a parallel input register array (IRA) 202 and a parallel input data access engine (IDE) 203. The figure also illustrates a simplified connection design between the device of the present invention and the parallel hardware computing element array (PEA) 1. The array 1 is composed of several parallel hardware computing units (PE) 101.

[0036] The parallel input register array (IRA) 202 is composed of a specific number of registers and provides a fast register area for data rearrangement, which makes the input data easier to arrange; the parallel input register array can be accessed repeatedly, when the d...



Abstract

The invention provides a regional parallel loading device and method for tensor data. The device comprises a parallel input register array which is used for providing a rapid register area for data rearrangement for an input feature map in an input cache; and a parallel input data access engine used for carrying out regionalized parallel access on the data in the parallel input register array. After the design is adopted, the structure of a connecting circuit can be simplified, so that the area and the power consumption of a chip are optimized.
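As a rough software analogy for the two components named in the abstract, the sketch below models a register tile that is filled once from an input feature map and then read as several overlapping regions, one per computing unit. It is a minimal illustration under stated assumptions, not the patented circuit: the class `InputRegisterArray`, the function `read_regions`, the tile size, and the kernel and stride values are all hypothetical, and the reads that hardware would perform in parallel are computed sequentially here.

```python
from typing import List

Matrix = List[List[int]]


class InputRegisterArray:
    """Hypothetical software model of a register tile (not the patented circuit)."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows = rows
        self.cols = cols
        self.regs: Matrix = [[0] * cols for _ in range(rows)]

    def load(self, feature_map: Matrix, top: int, left: int) -> None:
        """Copy a rows x cols tile of the feature map into the registers."""
        for r in range(self.rows):
            for c in range(self.cols):
                self.regs[r][c] = feature_map[top + r][left + c]


def read_regions(ira: InputRegisterArray, kernel: int, stride: int = 1) -> List[List[int]]:
    """Return one flattened kernel x kernel region per computing unit.

    Assumption: each region would be read in the same cycle in hardware;
    this model simply produces them one after another.
    """
    regions = []
    for top in range(0, ira.rows - kernel + 1, stride):
        for left in range(0, ira.cols - kernel + 1, stride):
            window = [
                ira.regs[top + dr][left + dc]
                for dr in range(kernel)
                for dc in range(kernel)
            ]
            regions.append(window)
    return regions


# A 4x4 feature map holding the values 0..15, loaded once into a 4x4 tile.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
ira = InputRegisterArray(rows=4, cols=4)
ira.load(feature_map, top=0, left=0)

# Four overlapping 3x3 regions are served from the single load.
regions = read_regions(ira, kernel=3, stride=1)
```

In this toy setup the tile is loaded from the feature map once, while the four regions are all served from the registers, which is the kind of repeated access the register area is meant to make cheap.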

Description

Technical field

[0001] The invention belongs to the fields of computer hardware, hardware acceleration for the deployment of artificial neural network algorithms, and digital integrated circuit design. It specifically relates to a key processing device for the input data of a deep convolutional neural network hardware acceleration chip, and a design method thereof.

Background technique

[0002] The deep convolutional neural network algorithm is composed of multiple specific neuron algorithm layers and hidden layers, mainly convolutional layers, and its main operator is the convolution of matrices or vectors. The main characteristics of this computing task are that the amount of input data is large, the input data is coupled with spatial feature information, the data used by each convolution often overlaps with data that has already been used, and the input data is often in tensor format. The required calculation data is extracted with a certa...
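The overlap between successive convolution windows described in the background can be made concrete with a small count. The sketch below is an illustration only, assuming a single-channel 2D input with no padding; the function name `conv_read_counts` and the example sizes are hypothetical and do not come from the patent text. It compares how many element reads a naive window-by-window fetch would issue with how many distinct input elements are actually touched.

```python
from typing import Tuple


def conv_read_counts(height: int, width: int, kernel: int, stride: int = 1) -> Tuple[int, int]:
    """Return (total element reads, distinct elements touched) for a 2D convolution.

    Assumptions: square kernel, no padding, one input channel.
    """
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1

    # A naive loader fetches every window in full, so reads scale with the output size.
    total_reads = out_h * out_w * kernel * kernel

    # Track which input coordinates are covered by at least one window.
    touched = set()
    for oy in range(out_h):
        for ox in range(out_w):
            for dy in range(kernel):
                for dx in range(kernel):
                    touched.add((oy * stride + dy, ox * stride + dx))

    return total_reads, len(touched)


# 5x5 input, 3x3 kernel, stride 1: windows overlap heavily.
overlapping = conv_read_counts(5, 5, 3, 1)

# 6x6 input, 3x3 kernel, stride 3: windows do not overlap at all.
disjoint = conv_read_counts(6, 6, 3, 3)
```

With stride 1 the naive fetch reads each input element several times on average, whereas with a stride equal to the kernel size every element is read exactly once. The gap between the two counts is the redundancy that a rearrangement buffer close to the computing units can absorb.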


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F30/373, G06F30/27, G06N3/04
CPC: G06F30/373, G06F30/27, G06N3/045
Inventor: 杨旭光
Owner: 苏州芯启微电子科技有限公司