Hardware circuit design and method of a data loading device combined with a main memory for accelerating deep convolutional neural network calculation

The invention relates to main memory and deep convolution technology, applied to biological neural network models, neural architectures, physical implementations, and so on. It addresses problems such as accelerator resources wasted by poor data reuse, overly complex data segmentation and arrangement methods, and excessive coupling with the central processing unit. The design reduces space complexity, improves the utilization of computing resources and memory bandwidth, and simplifies connection complexity.

Status: Inactive
Publication Date: 2020-10-16
Applicant: 北京芯启科技有限公司


Problems solved by technology

In the inventions described in Patent Documents 1 and 3, accelerator resources are wasted because layer sizes and data reusability differ across neural network algorithm layers, so these designs must either cooperate with other heterogeneous processors to resolve data dependency problems or rely on costly deep sub-micron advanced process technology to improve performance. The storage method described in Patent Document 3 must back up additional data, which makes the buffer size too large. The method of Patent Document 2 adopts reconfigurable computing ideas and pays close attention to avoiding wasted resources, but its data segmentation and arrangement methods are very complicated, and a compiler must be deployed together with the computing task to assist applications. The invention of Patent Document 4 is too tightly coupled with the design of the central processing unit, so its design and implementation complexity is too high.



Embodiment Construction

[0034] The present invention will be described in further detail below with reference to the accompanying drawings and examples.

[0035] Figure 1 is a structural diagram of the data loading device of the present invention combined with a main memory; the data loading device 2 includes:

[0036] A tensor-type input cache random access controller 205, which performs fusion, arrangement, and data format conversion on input data from the main memory 6 and/or other memories and then distributes the result to the partitioned areas of the input cache unit 201; its working mode can be reconfigured by software;
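As a rough software analogue of the fusion and distribution step above, the sketch below splits an input tensor along its channel axis and hands each slice, flattened into a row-major page layout, to one partition of the input cache. The CHW layout, the partition count, and the function names are illustrative assumptions; the patent describes a hardware circuit, and this is only a minimal model of its behavior.

```python
import numpy as np

# Minimal sketch (assumed behavior): split a CHW input tensor along the
# channel axis and distribute one flattened slice per cache partition.
def distribute_to_partitions(tensor, num_partitions):
    """Split a CHW tensor into channel groups, one per cache partition."""
    channels = tensor.shape[0]
    per_part = (channels + num_partitions - 1) // num_partitions
    partitions = []
    for p in range(num_partitions):
        lo, hi = p * per_part, min((p + 1) * per_part, channels)
        # "Format conversion": flatten the slice into the row-major
        # page layout the cache is assumed to expect.
        partitions.append(np.ascontiguousarray(tensor[lo:hi]).ravel())
    return partitions

x = np.arange(4 * 6 * 6, dtype=np.int16).reshape(4, 6, 6)  # toy CHW input
parts = distribute_to_partitions(x, num_partitions=2)
print([p.shape for p in parts])  # [(72,), (72,)]
```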

[0037] A divisible input cache unit 201, which is the local cache of the data loading device described in the present invention. It is composed of multiple storage pages; its design and storage method correspond to the dimensions of the input data and to the parallel input register array 202, and it supports the data format changes caused by software reconfiguration;
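The paragraph above states that the storage pages correspond to the dimensions of the input data. A minimal sketch of one such correspondence, assuming a row-major mapping and a hypothetical page size of 256 words (neither value comes from the patent):

```python
# Hypothetical page-based addressing for the divisible input cache:
# map a tensor coordinate (c, h, w) to a (page, offset) pair.
PAGE_WORDS = 256  # assumed number of words per storage page

def cache_address(c, h, w, height, width):
    """Row-major linear address of element (c, h, w), split into a
    storage-page index and an offset within that page."""
    linear = (c * height + h) * width + w
    return divmod(linear, PAGE_WORDS)  # (page, offset)

print(cache_address(c=3, h=5, w=2, height=6, width=6))  # (0, 140)
```

Because consecutive elements of one input row land in the same page, a page can be read out as a burst, which is the usual motivation for paged cache layouts.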

[0038] The tensor data loading de...



Abstract

The invention relates to a hardware circuit design and method for a data loading device combined with a main memory, used to accelerate deep convolutional neural network calculation. The device has a specifically designed cache structure comprising: an input cache and its control, in which a macro-block segmentation method is applied to input from a main memory and/or other memories to achieve regional data sharing and tensor data fusion and distribution; a parallel input register array that converts the data segments read from the cache; and a tensor-type data loading unit connected between the output of the input cache and the input of the parallel input register array. The design simplifies the address decoding circuit and saves area and power consumption without compromising data bandwidth. The hardware device and data processing method of the invention include a transformation method, a macro-block segmentation method, and an addressing method for the input data, so that limited hardware resources can meet the requirements of algorithm acceleration while the complexity of address management is reduced.
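The abstract names a macro-block segmentation method that enables regional data sharing. The sketch below shows one plausible reading of such segmentation: tiling a feature map into blocks, each extended by a halo of boundary elements shared with its neighbours. The block size, halo width, and tiling order are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

# Illustrative macro-block segmentation: tile a 2-D feature map into
# blocks, each extended by a halo so adjacent blocks share boundary data.
def macro_blocks(fmap, block=4, halo=1):
    """Yield ((y, x), tile) pairs; halo rows/columns overlap neighbours."""
    H, W = fmap.shape
    for y in range(0, H, block):
        for x in range(0, W, block):
            y0, x0 = max(y - halo, 0), max(x - halo, 0)
            y1, x1 = min(y + block + halo, H), min(x + block + halo, W)
            yield (y, x), fmap[y0:y1, x0:x1]

fmap = np.arange(64).reshape(8, 8)
tiles = list(macro_blocks(fmap))
print(len(tiles), tiles[0][1].shape)  # 4 tiles; the top-left tile is 5x5
```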

Description

Technical Field

[0001] The invention belongs to the fields of computer hardware, hardware acceleration for the deployment of artificial neural network algorithms, and digital integrated circuit design, and specifically relates to a method for designing the architecture of the input system of a deep convolutional neural network hardware acceleration chip, and a device thereof.

Background Technique

[0002] A deep convolutional neural network algorithm is composed of multiple layers of specific neuron algorithm layers and hidden layers, mainly convolutional layers, whose main operator is the convolution of matrices or vectors. The main characteristics of this computing task are that the amount of input data is large, the input data is coupled with spatial feature information, the data used by each convolution often overlaps data that has already been used, and the input data is usually in tensor format. The required calculation data is extracted ...
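The background notes that the data used by each convolution often overlaps data that has already been used. A back-of-the-envelope sketch of that reuse, assuming a K x K window sliding horizontally with stride S:

```python
# Two horizontally adjacent K x K convolution windows with stride S
# share K * (K - S) input elements, which is the reuse a data loading
# device can exploit instead of refetching from main memory.
def shared_elements(K, S):
    return K * max(K - S, 0)

for K, S in [(3, 1), (5, 1), (3, 2)]:
    print(f"K={K}, S={S}: {shared_elements(K, S)}/{K * K} elements reused")
# K=3, S=1: 6/9 elements reused
# K=5, S=1: 20/25 elements reused
# K=3, S=2: 3/9 elements reused
```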

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/04; G06N3/063
CPC: G06N3/063; G06N3/045
Inventors: 林森, 何一波, 李珏
Owner: 北京芯启科技有限公司