
Method and device for quantizing neural network on processing unit

A neural-network and processing-unit technology, applied in the computer field, that addresses the slow processing of convolutional neural networks on a CPU, the resulting increase in quantization time, and the inability to quantize spliced operators.

Pending Publication Date: 2022-06-24
ANHUI CAMBRICON INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

However, the above-mentioned quantization process has the following four disadvantages: 1. Because the CPU processes convolutional neural networks slowly, the time spent on the entire quantization increases significantly when the sample set used for quantization is large. 2. Large operators spliced together from several cnml operators, and operators implemented in the Bang C language (generally user-defined and not supported by the official framework), cannot run on the CPU, causing the entire quantization to fail. 3. For a large operator spliced from multiple cnml operators, the operator to be quantized may be only one of those cnml operators, and its parameters cannot be obtained while the network runs, so the spliced operator cannot be quantized. 4. The framework structure is modified, so subsequent changes to the framework also require changes to the quantization tool, which increases maintenance costs.
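The calibration step this passage presupposes — running the network once on the CPU with each to-be-quantized operator wrapped so that it records the value range of its inputs — can be sketched as follows. This is a minimal illustration in plain Python; the `RangeObserver` class and the toy operator are hypothetical, not taken from the patent or the cnml framework:

```python
# Minimal sketch of CPU-side calibration: each operator to be quantized
# is wrapped so that, during one forward pass over sample data, it
# records the minimum and maximum values its inputs actually take.
class RangeObserver:
    """Wraps an operator and tracks the min/max of its inputs."""
    def __init__(self, op):
        self.op = op
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def __call__(self, x):
        # Update the observed range, then run the original operator.
        self.min_val = min(self.min_val, min(x))
        self.max_val = max(self.max_val, max(x))
        return self.op(x)

# Example: wrap a toy "operator" and feed calibration samples through it.
relu = lambda x: [max(0.0, v) for v in x]
observed = RangeObserver(relu)
for sample in ([-1.5, 0.2, 3.0], [2.5, -0.7, 1.1]):
    observed(sample)
print(observed.min_val, observed.max_val)  # range later used to pick the int8 scale
```

This is why the calibration run must actually execute on the CPU: if an operator (for instance one written in Bang C) cannot run there, its observer never fires and no range is collected, which is disadvantage 2 above.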

Method used




Embodiment Construction

[0020] The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure. Obviously, the described embodiments are part of the embodiments of the present disclosure, but not all of the embodiments. Based on the embodiments in the present disclosure, all other embodiments obtained by those skilled in the art without creative work fall within the protection scope of the present disclosure.

[0021] It should be understood that the terms "first", "second", "third" and "fourth" in the claims, description and drawings of the present disclosure are used to distinguish different objects, rather than to describe a specific order. The terms "including" and "comprising" as used in the specification and claims of this disclosure indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclu...



Abstract

The present disclosure describes a method, an electronic device, and a computing device for quantizing a neural network on a processing unit. The computing device may be included in a combined processing device, which may also include a universal interconnection interface and other processing devices. The computing device interacts with the other processing devices to jointly complete a computing operation specified by the user. The combined processing device may further comprise a storage device, connected with the computing device and the other processing devices, for storing data of the computing device and the other processing devices.

Description

Technical field
[0001] The present disclosure relates to the field of computers, and more particularly, to a method of data quantization in neural networks.
Background technique
[0002] Int8 quantization can significantly reduce both the space occupied by a neural network model and the bandwidth it consumes at run time. At present, the quantization process of a convolutional neural network is carried out on the CPU: the network to be quantized is run on the CPU, the required parameters are obtained by modifying the functions of the operators to be quantized during the network run, and the quantization is executed while the network is running. Finally, the quantized convolutional neural network can be run on the MLU platform for application. However, the above quantization process has the following four disadvantages: 1. Since the CPU is very slow in processing the convolutional neural network, when the samples used in quantization are to...
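As an illustration of the kind of quantization [0002] refers to, here is a minimal sketch of per-tensor Int8 quantization in plain Python. The symmetric scheme and the [-127, 127] clamp are common-practice assumptions, not details given by the patent:

```python
# Sketch of per-tensor symmetric Int8 quantization (assumed scheme).
# The scale maps the observed absolute maximum onto the int8 range.
def quantize_int8(values, abs_max):
    scale = abs_max / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q_values, scale):
    # Recover approximate real values from the int8 codes.
    return [q * scale for q in q_values]

vals = [0.5, -1.0, 2.54]
q, scale = quantize_int8(vals, abs_max=2.54)
print(q)  # -> [25, -50, 127]
approx = dequantize_int8(q, scale)
```

The `abs_max` parameter is exactly the kind of per-operator statistic the CPU calibration run exists to collect, which is why storing the model in int8 cuts its size and run-time bandwidth roughly fourfold compared with float32.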

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06N3/02
CPC: G06N3/02
Inventor: not publicized
Owner ANHUI CAMBRICON INFORMATION TECH CO LTD