4-bit quantization method and system for a neural network
A neural network and bit quantization technology, applied to the field of 4-bit quantization methods and systems for neural networks. It addresses the problem of low quantization efficiency and achieves the effects of improving calculation speed, improving practicability, and saving training time.
Examples
Embodiment 1
[0062] Referring to Figure 1, Figure 1 is a schematic flowchart of a 4-bit quantization method for a neural network provided in an embodiment of the present application. As can be seen from Figure 1, the 4-bit quantization method of this embodiment mainly includes the following steps:
[0063] S1: Load the pre-trained model of the neural network.
[0064] S2: In the pre-trained model, determine by statistics the initial value of each saturated activation layer satRelu.
[0065] Specifically, step S2 includes:
[0066] S21: Replace all activation layers relu in the neural network with saturated activation layers satRelu.
[0067] S22: Obtain the activation values of each saturated activation layer satRelu according to the received command.
[0068] S23: Using the activation values, build a histogram to capture the statistical distribution of the data.
[0069] S24: Select the activation value located at the 99.999% point of the histogram as the initial value of the p...
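Steps S21 to S24 can be illustrated with a minimal sketch. It is not the patent's implementation: the function names sat_relu and histogram_percentile, the bin count of 2048, and the choice to return the upper edge of the selected bin are all assumptions made for illustration. The sketch also assumes satRelu clamps its input to the range [0, p], where p is the saturation threshold being initialized.

```python
def sat_relu(x, p):
    """Saturated ReLU (assumed form): clamp x to the range [0, p]."""
    return min(max(x, 0.0), p)


def histogram_percentile(values, percentile=99.999, num_bins=2048):
    """Estimate a percentile of activation values from a fixed-width histogram.

    Negative inputs are treated as 0, as they would be after a ReLU.
    num_bins is a hypothetical choice; the source does not state one.
    """
    clipped = [max(v, 0.0) for v in values]
    max_val = max(clipped)
    if max_val <= 0.0:
        return 0.0

    # Build the histogram over [0, max_val].
    bin_width = max_val / num_bins
    counts = [0] * num_bins
    for v in clipped:
        idx = min(int(v / bin_width), num_bins - 1)
        counts[idx] += 1

    # Walk the cumulative distribution until the target fraction is covered.
    target = len(clipped) * percentile / 100.0
    cumulative = 0
    for i, count in enumerate(counts):
        cumulative += count
        if cumulative >= target:
            return (i + 1) * bin_width  # upper edge of the selected bin
    return max_val
```

For example, calling histogram_percentile on the activations recorded for one layer yields a candidate initial threshold for that layer, and sat_relu(x, threshold) then saturates outliers above it. Picking a high percentile rather than the maximum keeps a few extreme activations from stretching the range.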
Embodiment 2
[0110] On the basis of the embodiment shown in Figure 1, refer to Figure 2, which is a schematic structural diagram of a 4-bit quantization system for a neural network provided by an embodiment of the present application. As can be seen from Figure 2, the 4-bit quantization system of this embodiment mainly includes: a loading module, a statistics module, a retraining module, a judgment module and a conversion module.
[0111] The loading module is used to load the pre-trained model of the neural network; the statistics module is used to determine by statistics the initial value of each saturated activation layer satRelu in the pre-trained model; the retraining module is used to add pseudo-quantization nodes to the neural network and to retrain the neural network using the initial value of satRelu, obtaining a pseudo-quantized model; the judgment module is used to judge whether the accuracy of the pseudo-quantized model converges to...
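The excerpt does not describe how a pseudo-quantization node works internally. As a rough illustration only, the sketch below assumes a uniform, unsigned 4-bit scheme over the saturation range [0, p]: the value is clamped, rounded to one of 16 levels, and mapped back to floating point, so the forward pass sees the quantization error. The function name fake_quantize_4bit and the round-half-up rule are hypothetical choices, not taken from the source.

```python
def fake_quantize_4bit(x, p):
    """Simulate unsigned 4-bit quantization of x over [0, p] (assumed scheme).

    Returns a float restricted to 16 evenly spaced values in [0, p].
    """
    if p <= 0.0:
        raise ValueError("saturation threshold p must be positive")

    levels = 2 ** 4 - 1                  # 15 steps between 16 representable values
    scale = p / levels                   # width of one quantization step
    clamped = min(max(x, 0.0), p)        # same clamping as the assumed satRelu
    code = int(clamped / scale + 0.5)    # round half up to an integer code
    code = min(code, levels)             # guard against floating-point overshoot
    return code * scale                  # dequantize back to floating point
```

Under these assumptions the rounding error of any in-range value is at most half a step, p / 30, which is why the choice of p matters: a smaller p gives finer steps but saturates more activations.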