Asynchronous processing method and device based on neural network accelerator
This asynchronous processing technology for neural network accelerators, applied in the field of deep learning, addresses problems such as complex deep learning network design, bloated callback code, and poor flexibility. It aims to speed up asynchronous operation, reduce CPU bandwidth usage, and improve utilization.
Examples
Embodiment 1
[0057] The asynchronous method based on a neural network accelerator provided in this embodiment of the present application is applied to a face capture machine in the security field. The method includes the following steps:
[0058] Test and evaluate the performance of the hard core and the soft core, then select a solution, taking advantage of the flexibility of the present invention:
[0059] The first solution applies when the execution performance of the face feature network on the neural network accelerator hard core is close to the performance of post-processing and feature matching on the soft-core CPU. In this case, the main thread can asynchronously request post-processing and feature matching, and then query the hard core for completion;
[0060] The second solution applies when, comparing the execution of the face feature network on the neural network accelerator hard core with post-processing and feature matching on the soft-core CPU, the CPU execution time is much longer...
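The first solution in [0059] can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names hard_core_infer, match_features, and run_pipeline are hypothetical, the "hard core" is simulated by a worker thread, and the feature and matching logic are placeholders. The sketch only shows the ordering described above: start the hard core on the current frame, asynchronously request feature matching for the previous frame, then query the hard core for completion.

```python
from concurrent.futures import ThreadPoolExecutor


def hard_core_infer(frame):
    # Hypothetical stand-in for the accelerator running the face feature network.
    return [float(frame), float(frame) + 1.0]


def match_features(features, gallery):
    # Hypothetical CPU post-processing: pick the nearest gallery entry
    # by squared Euclidean distance.
    def dist(name):
        return sum((a - b) ** 2 for a, b in zip(features, gallery[name]))
    return min(gallery, key=dist)


def run_pipeline(frames, gallery):
    results = []
    # One worker thread simulates the hard core, another the soft-core
    # post-processing requested asynchronously by the main thread.
    with ThreadPoolExecutor(1) as hard_core, ThreadPoolExecutor(1) as soft_core:
        prev_features = None
        for frame in frames:
            # Start the hard core on the current frame (returns immediately).
            inference = hard_core.submit(hard_core_infer, frame)
            # Asynchronously request matching for the previous frame.
            match = None
            if prev_features is not None:
                match = soft_core.submit(match_features, prev_features, gallery)
            # Query the hard core for completion.
            prev_features = inference.result()
            if match is not None:
                results.append(match.result())
        # Match the last frame, which has no later inference to overlap with.
        if prev_features is not None:
            results.append(match_features(prev_features, gallery))
    return results


if __name__ == "__main__":
    gallery = {"alice": [1.0, 2.0], "bob": [5.0, 6.0]}
    print(run_pipeline([1, 5, 1], gallery))
```

Because the two stages take a similar amount of time in this scenario, overlapping them in this way keeps both units busy; when one stage is much slower, a different arrangement would be needed.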
Embodiment 2
[0069] The asynchronous method based on a neural network accelerator provided in this embodiment of the present application must be implemented together with the hardware architecture shown in Figure 5. In this embodiment, the hard core refers to the NPU (network processing unit) in Figure 5, and the soft core refers to the CPU (central processing unit);
[0070] The shared queue resides in the DDR controller, so that the soft core and the hard core can both access it over the bus;
[0071] The main thread runs on the CPU soft core. It sends an asynchronous request that starts the hard core, and while the hard core executes, the soft core processes other services at the same time, so that the two run in parallel.
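The interaction in [0070] and [0071] can be sketched in Python under stated assumptions. Here an in-process queue.Queue stands in for the shared queue reached over the bus, a worker thread stands in for the NPU hard core, and the names hard_core_worker and main_thread, as well as the summation used as "inference", are hypothetical placeholders rather than anything defined in the patent.

```python
import queue
import threading


def hard_core_worker(requests, completions):
    # Simulated hard core: take requests from the shared queue until a
    # None sentinel arrives, and post one completion per request.
    while True:
        item = requests.get()
        if item is None:
            break
        request_id, data = item
        completions.put((request_id, sum(data)))  # placeholder for inference


def main_thread(batches):
    requests = queue.Queue()     # stand-in for the shared request queue
    completions = queue.Queue()  # stand-in for completion notifications
    worker = threading.Thread(target=hard_core_worker,
                              args=(requests, completions))
    worker.start()

    # Asynchronous requests: put() returns without waiting for the result.
    for request_id, data in enumerate(batches):
        requests.put((request_id, data))

    # While the hard core runs, the soft core handles other services.
    other_work = [len(batch) for batch in batches]

    # Collect completions once the other work is done.
    results = {}
    while len(results) < len(batches):
        request_id, value = completions.get()
        results[request_id] = value

    requests.put(None)  # tell the simulated hard core to stop
    worker.join()
    return other_work, results


if __name__ == "__main__":
    print(main_thread([[1, 2, 3], [4, 5]]))
```

Keying completions by request ID lets the main thread match results to requests without relying on the order in which they finish.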
[0072] In particular, according to the disclosed embodiments of the present invention, the processes described above with reference to the flowcharts can be implemented as computer sof...