Asynchronous processing method and device based on neural network accelerator

An asynchronous processing technology for neural network accelerators, applied in the field of deep learning. It addresses problems such as increasingly complex deep learning network designs, bloated callback code, and poor flexibility, with the aim of speeding up asynchronous operation, reducing the bandwidth occupied by the CPU, and improving utilization.

Active Publication Date: 2022-03-29
浙江芯劢微电子股份有限公司

AI Technical Summary

Problems solved by technology

[0002] With the development of artificial intelligence, deep learning network designs are becoming increasingly complex. Without neural network accelerator hardware, they cannot meet the performance requirements of embedded applications.
In the security field, the main detection targets are people, vehicles, and objects, and these targets must be tracked as well as detected. Detecting and tracking people therefore requires integrating multiple deep learning algorithms. When a single image frame contains many people, each must be tracked individually, and a feature matching strategy is then used to ensure that the same person continues to be tracked. The existing invention patent with publication number CN111679815A discloses an asynchronous operation method and device, storage medium, and related equipment; it solves the problem of multi-layer nesting, separates the main logic from the processing logic, and addresses poor flexibility and a narrow scope of application. Another existing disclosure, an asynchronous processing method, device, computer equipment, and storage medium for the Lua language, adopts a callback and Promise scheme to solve the problem of bloated callback code in the traditional asynchronous callback approach.

Examples

Embodiment 1

[0057] An asynchronous method based on a neural network accelerator, provided in an embodiment of the present application, is applied to a face capture camera in the security field. The method includes the following steps:

[0058] Test and evaluate the performance of the hard core and the soft core, and select one of the following solutions accordingly, taking advantage of the flexibility of the present invention:

[0059] The first solution applies when the time the neural network accelerator hard core takes to execute the face feature network is close to the time the soft-core CPU takes for post-processing and feature matching: the main thread can issue an asynchronous request, perform one round of post-processing and feature matching, and then query whether the hard core has completed;

[0060] The second solution applies when, comparing the hard core's execution of the face feature network on the neural network accelerator with the soft-core CPU's post-processing and feature matching, the CPU execution time is much longer...

Embodiment 2

[0069] The asynchronous method based on a neural network accelerator provided in this embodiment of the present application is implemented in cooperation with the hardware architecture shown in Figure 5. In the embodiments of the present invention, the hard core refers to the NPU (network processing unit) in Figure 5, and the soft core refers to the CPU (central processing unit);

[0070] The shared queue resides in the memory behind the DDR controller, so that both the soft core and the hard core can access it over the bus;
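As a rough illustration of a queue that two parties can reach through one shared memory region, the sketch below stores 32-bit command words in a single flat buffer. The class name `SharedRing`, the header layout, and the word width are assumptions made for this example; the source does not describe the queue's memory layout, and real hardware would additionally need synchronization and cache-coherence handling that is omitted here.

```python
import struct


class SharedRing:
    """Fixed-capacity FIFO of 32-bit words kept in one flat buffer (hypothetical layout)."""

    HEADER = struct.Struct("<II")  # head index, number of stored words
    WORD = struct.Struct("<I")

    def __init__(self, capacity):
        self.capacity = capacity
        # The bytearray stands in for a region of shared DDR memory.
        self.mem = bytearray(self.HEADER.size + capacity * self.WORD.size)

    def _offset(self, slot):
        return self.HEADER.size + slot * self.WORD.size

    def push(self, word):
        """Append a word; return False if the queue is full."""
        head, n = self.HEADER.unpack_from(self.mem, 0)
        if n == self.capacity:
            return False
        slot = (head + n) % self.capacity
        self.WORD.pack_into(self.mem, self._offset(slot), word)
        self.HEADER.pack_into(self.mem, 0, head, n + 1)
        return True

    def pop(self):
        """Remove and return the oldest word, or None if the queue is empty."""
        head, n = self.HEADER.unpack_from(self.mem, 0)
        if n == 0:
            return None
        (word,) = self.WORD.unpack_from(self.mem, self._offset(head))
        self.HEADER.pack_into(self.mem, 0, (head + 1) % self.capacity, n - 1)
        return word
```

Because all state lives inside `mem`, any party that can read and write that buffer sees the same queue, which is the property the paragraph above relies on.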

[0071] The main thread runs on the CPU soft core. It sends an asynchronous request that starts the hard core, and while the hard core executes, the soft core processes other services at the same time, so the two run in parallel.
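The overlap described in [0071] can be sketched in software, assuming a worker thread stands in for the hard core. The function names `hard_core_infer`, `soft_core_service`, and `process` are hypothetical, and the sleep is only a placeholder for accelerator latency; this is not the patent's implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def hard_core_infer(frame_id):
    """Stand-in for the NPU running a network (hypothetical workload)."""
    time.sleep(0.05)
    return f"features-{frame_id}"


def soft_core_service(frame_id):
    """Stand-in for other CPU-side work, such as post-processing."""
    return f"postprocessed-{frame_id}"


def process(frame_ids):
    results = []
    # One worker models a single hard core that runs one request at a time.
    with ThreadPoolExecutor(max_workers=1) as hard_core:
        for frame_id in frame_ids:
            pending = hard_core.submit(hard_core_infer, frame_id)  # asynchronous request
            other = soft_core_service(frame_id)  # runs while the request is in flight
            results.append((pending.result(), other))  # wait for the hard core to finish
    return results
```

Calling `process([1, 2])` returns one pair per frame; the main thread only blocks at `pending.result()`, after it has finished its own work.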

[0072] In particular, according to the disclosed embodiments of the present invention, the processes described above with reference to the flowcharts can be implemented as computer sof...

Abstract

The invention discloses an asynchronous processing method and device based on a neural network accelerator. In the method, the main thread generates an asynchronous request according to the needs of an algorithm service, together with a corresponding handle number. A group of command words is generated from the service request and added to the command word queue. The queue is then queried: if this is the only group of command words currently in the queue, the group is set to the execution state and the neural network accelerator hard core is started; otherwise, the group is set to the initial state and the call returns. When the hard core finishes executing, it generates a hardware interrupt, which triggers a soft interrupt. The soft interrupt queries the command word queue and, if command words exist, takes a group from the queue and checks whether it is in the initial state: if so, the group is set to the execution state and the hard core is started; otherwise, the group is deleted from the queue. The main thread uses the handle number to query whether the hard core has completed the operation. In this way, multiple network algorithms run asynchronously, with the asynchronous operations handled through hard-core interrupts, which speeds up network performance and reduces the bandwidth occupied by the processor.
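The queue handling in the abstract can be modeled as a small state machine. The sketch below is one reading of that text, not the patented implementation: the names `AcceleratorSim`, `CommandGroup`, `request`, `on_interrupt`, and `is_done` are invented for illustration, no real hardware is involved, and it assumes that a group found in the execution state at interrupt time is the one that has just finished.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from itertools import count


class State(Enum):
    INITIAL = "initial"
    EXECUTING = "executing"


@dataclass
class CommandGroup:
    handle: int
    words: list
    state: State = State.INITIAL


class AcceleratorSim:
    """Toy model of the command word queue (hypothetical names throughout)."""

    def __init__(self):
        self.queue = deque()
        self.done = set()
        self.started = []  # handles in the order the simulated hard core ran them
        self._handles = count(1)

    def _start_hard_core(self, group):
        group.state = State.EXECUTING
        self.started.append(group.handle)

    def request(self, words):
        """Main thread: enqueue a command group and return its handle number."""
        group = CommandGroup(next(self._handles), list(words))
        self.queue.append(group)
        if len(self.queue) == 1:  # only group in the queue: run it now
            self._start_hard_core(group)
        return group.handle  # otherwise it stays in the initial state

    def on_interrupt(self):
        """Soft-interrupt handler, called after the hard core finishes a group."""
        while self.queue:
            group = self.queue[0]
            if group.state is State.INITIAL:
                self._start_hard_core(group)
                return
            # Assumed: an executing group at the head has just completed.
            self.queue.popleft()
            self.done.add(group.handle)

    def is_done(self, handle):
        """Main thread: query completion by handle number."""
        return handle in self.done
```

With two requests queued back to back, the first starts immediately and the second waits in the initial state until the first interrupt arrives, at which point the finished group is deleted and the waiting group is started.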

Description

Technical field

[0001] The invention relates to the technical field of deep learning, in particular to a processing method for a neural network accelerator, and more specifically to an asynchronous processing method and device based on a neural network accelerator.

Background technique

[0002] With the development of artificial intelligence, deep learning network designs are becoming increasingly complex. Without neural network accelerator hardware, they cannot meet the performance requirements of embedded applications. In the security field, the main detection targets are people, vehicles, and objects, and these targets must be tracked as well as detected. Detecting and tracking people therefore requires integrating multiple deep learning algorithms. When a single image frame contains many people, each must be tracked individually, and a feature matching strategy is then used to ensure that the same person continues to be tracked; the e...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/48, G06N3/063
CPC: G06F9/4812, G06F9/4881, G06N3/063
Inventor: 吴春选, 朱旭东
Owner: 浙江芯劢微电子股份有限公司