Sparse accelerator applied to on-chip training

A sparse accelerator applied to on-chip training, relating to biological neural network models, neural architectures, neural learning methods, and the like. It addresses the problems that invalid operations in on-chip training cannot be eliminated and that efficient on-chip training cannot be guaranteed, thereby improving hardware utilization and enabling accurate elimination of invalid operations.

Pending Publication Date: 2022-05-13
NANJING UNIV


Problems solved by technology

[0006] This application provides a sparse accelerator applied to on-chip training, which addresses the technical problem that existing accelerators cannot eliminate all invalid operations in on-chip training and therefore cannot guarantee the most efficient implementation of on-chip training on terminal devices.
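To make "invalid operations" concrete: in sparse on-chip training, a multiply-accumulate whose activation or weight operand is zero contributes nothing to the result and can be skipped. A minimal sketch (illustrative only; the function name and shapes are assumptions, not from the patent):

```python
# Hypothetical sketch of zero-skipping, the principle behind eliminating
# "invalid operations": multiplications with a zero operand are never issued.
def sparse_dot(activations, weights):
    """Dot product that skips zero-operand multiplications.

    Returns the result and the number of MACs actually performed.
    """
    total, macs = 0.0, 0
    for a, w in zip(activations, weights):
        if a == 0.0 or w == 0.0:
            continue  # invalid operation: contributes nothing to the sum
        total += a * w
        macs += 1
    return total, macs

# With ReLU activations and pruned weights, many operands are zero:
acts = [0.0, 1.5, 0.0, 2.0, 0.0, 3.0]
wts  = [0.5, 0.0, 1.0, 2.0, 0.0, 1.0]
result, macs = sparse_dot(acts, wts)
# result == 7.0, using only 2 of the 6 possible MACs
```

A hardware accelerator realizes the same skip decision with masks and gating rather than a branch, but the arithmetic saved is the same.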




Embodiment Construction

[0057] In order to make the purpose, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below in conjunction with the accompanying drawings.

[0058] The terms used in the following embodiments are for the purpose of describing particular embodiments only and are not intended to limit the application. As used in the specification and appended claims of this application, the singular forms "a", "an", "said", "above", "the", and "this" are intended to also include expressions such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the following embodiments of the present application, "at least one" and "one or more" mean one, two, or more, and "multiple" means two or more. The term "and/or" describes the relationship between associated objects and indicates that three relationships may exist; for ex...



Abstract

According to the sparse accelerator applied to on-chip training provided by the embodiments of the invention, the input values in the input-value buffer module, the reference values in the reference-value buffer module, and the mask in the mask buffer module are dynamically adjusted across the different acceleration stages. A coarse-grained unit performs a first, coarse-grained screening of invalid operations, and a fine-grained unit included in each processing unit of the processing module performs a further, fine-grained screening, thereby eliminating all invalid operations in the three training stages of on-chip training. In addition, the multiple computing cores can perform acceleration processing in parallel, and the multiple processing units included in the processing modules of those cores can also operate in parallel, which further improves the hardware utilization of the sparse accelerator. The sparse accelerator applied to on-chip training can therefore efficiently and accurately eliminate all invalid operations in the three stages of on-chip training.
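The coarse-then-fine screening described above can be sketched in software as a two-level mask: a coarse-grained pass rejects entire all-zero blocks with one mask bit per block, and a fine-grained pass skips the remaining zero elements inside surviving blocks. This is a minimal illustrative sketch; the function, block size, and mask representation are assumptions, not the patent's hardware design:

```python
# Hypothetical two-level screening sketch: coarse-grained block rejection
# followed by fine-grained per-element rejection, returning the indices
# of elements that actually need to be computed on.
def two_level_screen(values, block_size=4):
    """Return indices of nonzero elements, screened coarse-then-fine."""
    survivors = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # Coarse-grained unit: one mask bit per block; skip all-zero blocks.
        if not any(block):
            continue
        # Fine-grained unit: per-element mask inside each surviving block.
        for offset, v in enumerate(block):
            if v != 0:
                survivors.append(start + offset)
    return survivors

vals = [0, 0, 0, 0,   5, 0, 7, 0,   0, 0, 0, 0,   1, 2, 0, 0]
# Blocks 0 and 2 are rejected wholesale; elements 4, 6, 12, 13 survive.
```

The benefit of the coarse pass is that an all-zero block is dismissed with a single mask check instead of `block_size` element checks, which in hardware saves both buffer reads and control cycles.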

Description

Technical field

[0001] This application relates to the field of computer and electronic information technology, and in particular to a sparse accelerator applied to on-chip training.

Background technique

[0002] In recent years, convolutional neural networks (CNNs) have performed well in computer vision, speech recognition, and natural language processing. To improve recognition accuracy, it is necessary to train the CNN model on-chip: the CNN model is fine-tuned with the user's data on the device, which then improves the model's accuracy in use.

[0003] The on-chip training process includes three stages, namely the forward propagation (FP) stage, the backward propagation (BP) stage, and the weight gradient (WG) calculation stage. In the FP stage, the activation value of the previous layer is used as the input activation value of the current layer, combined with the convolution kernel weight cor...
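The three training stages in [0003] can be summarized for a single fully connected layer (an illustrative simplification; the patent targets convolutional layers, and these function names are assumptions):

```python
# Minimal sketch of the three on-chip training stages for one fully
# connected layer with weight matrix w of shape (inputs x outputs).
def fp(x, w):
    # Forward propagation: previous layer's activation x through weights w.
    return [sum(xi * wij for xi, wij in zip(x, col)) for col in zip(*w)]

def bp(grad_out, w):
    # Backward propagation: push the output gradient back through w
    # to obtain the gradient with respect to the layer's input.
    return [sum(g * wij for g, wij in zip(grad_out, row)) for row in w]

def wg(x, grad_out):
    # Weight gradient stage: outer product of the input activation
    # and the output gradient gives the per-weight gradient.
    return [[xi * g for g in grad_out] for xi in x]
```

All three stages are dominated by multiply-accumulates over the same operands (activations, weights, gradients), which is why zero-skipping applies to each of them.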

Claims


Application Information

IPC(8): G06N3/04; G06N3/063; G06N3/08
CPC: G06N3/063; G06N3/082; G06N3/084; G06N3/045
Inventors: 王中风, 黄健, 鲁金铭
Owner: NANJING UNIV