An asynchronous processing method and device based on a neural network accelerator

A neural-network asynchronous processing technology, applied in the field of deep learning, which can solve problems such as increasingly complex deep learning network design, bloated callback code, and poor flexibility, and achieves the effects of speeding up asynchronous operation, reducing CPU bandwidth occupation, and improving utilization.

Active Publication Date: 2022-06-24
浙江芯劢微电子股份有限公司

AI Technical Summary

Problems solved by technology

[0002] With the development of artificial intelligence, deep learning network designs are becoming increasingly complex, and without neural network accelerator hardware they cannot meet the performance requirements of embedded systems. In the security field, the main detection targets are people, vehicles, and objects, which must also be tracked while they are being detected, so multiple deep learning algorithms must be combined to solve detection and tracking. When many people appear in a single image frame, each must be tracked individually, and a feature matching strategy is then used to ensure that the same person is being followed. The existing invention patent with publication number CN111679815A discloses an asynchronous operation method and device, a storage medium, and related equipment; that patent addresses multi-layer nesting by separating the main logic from the processing logic, solving the problems of poor flexibility and narrow scope of application. Another existing patent, an asynchronous processing method, device, computer equipment, and storage medium for the Lua language, adopts callback and Promise schemes and, compared with the traditional asynchronous callback approach, solves the problem of bloated callback code.



Examples


Embodiment 1

[0057] An asynchronous processing method based on a neural network accelerator provided by an embodiment of the present application is applied to a face capture machine in the security field; the method includes the following steps:

[0058] The performance of the hard core and the soft core is first evaluated by testing, and a solution is then chosen according to the flexibility of the present invention:

[0059] The first solution: when, for the face feature network, the execution performance of the neural network accelerator hard core is close to the performance of the soft-core CPU performing post-processing and feature matching, the main thread can asynchronously request post-processing and feature matching and then query the hard core for completion (sketched in the code example below);

[0060] The second solution: when, for the face feature network, the execution performance of the neural network accelerator hard core is compared with the performance of the soft-core CPU performing post-processing and feature matching, the CPU execution time is...
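
The following is a minimal C sketch of the request pattern in the first solution: the main thread submits post-processing and feature matching to the hard core asynchronously, obtains a handle number, and later polls that handle for completion. The function names npu_submit_async and npu_poll are hypothetical stand-ins for an accelerator driver interface and are not defined in the patent; they are stubbed here so the sketch compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t npu_handle_t;

/* Hypothetical driver calls, stubbed so the sketch builds standalone. */
static npu_handle_t npu_submit_async(const void *features, size_t len)
{
    (void)features; (void)len;
    return 42u;                      /* handle number for this request */
}

static bool npu_poll(npu_handle_t handle)
{
    (void)handle;
    return true;                     /* pretend the hard core finished */
}

int main(void)
{
    uint8_t face_features[512] = {0};

    /* 1. Asynchronously request post-processing / feature matching. */
    npu_handle_t h = npu_submit_async(face_features, sizeof face_features);

    /* 2. The soft core is free to run other business logic here. */

    /* 3. Query the hard core for completion by handle number. */
    while (!npu_poll(h)) {
        /* keep working or yield until the hard core reports done */
    }
    printf("request %u completed on the hard core\n", (unsigned)h);
    return 0;
}
```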

Embodiment 2

[0069] An asynchronous method based on a neural network accelerator provided by this embodiment of the application is Embodiment 2 of the present invention and should be read together with Figure 5, the hardware architecture in which the invention is implemented; the hard core refers to the NPU (network processing unit) in Figure 5, and the soft core refers to the CPU (central processing unit);

[0070] The shared queue resides in DDR, managed by the DDR controller, so that both the soft core and the hard core can access it over the bus;

[0071] The main thread runs on the CPU soft core: it sends an asynchronous request and starts the hard core, and while the hard core executes, the soft core concurrently processes other business, achieving a parallel effect.
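
As an illustration of [0070]-[0071], here is a minimal C sketch of a command-word queue placed in DDR so that both the CPU soft core and the NPU hard core can reach it over the bus. The field names, the queue depth, and the ddr_queue object are assumptions made for the example; in a real system the structure would live at a fixed physical DDR address agreed upon with the hard core.

```c
#include <stdint.h>

enum cmd_state { CMD_INITIAL = 0, CMD_EXECUTING = 1, CMD_DONE = 2 };

/* One set of command words describing a single hard-core job. */
struct cmd_word_set {
    uint32_t handle;           /* handle number returned to the main thread */
    uint32_t state;            /* enum cmd_state, written by both cores     */
    uint64_t input_addr;       /* DDR address of the input tensor           */
    uint64_t output_addr;      /* DDR address where the result is written   */
};

#define QUEUE_DEPTH 16

/* Both cores see the same physical DDR region; 'volatile' reminds the
 * compiler that the hard core may change these fields at any time.        */
struct cmd_queue {
    volatile uint32_t head;
    volatile uint32_t tail;
    struct cmd_word_set slots[QUEUE_DEPTH];
};

/* Stand-in for the region behind the DDR controller. */
static struct cmd_queue ddr_queue;

/* Soft-core side: append one set of command words, return -1 if full. */
static int cmd_queue_push(struct cmd_queue *q, const struct cmd_word_set *c)
{
    uint32_t next = (q->tail + 1u) % QUEUE_DEPTH;
    if (next == q->head)
        return -1;
    q->slots[q->tail] = *c;
    q->tail = next;
    return 0;
}
```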

[0072] In particular, the processes described above with reference to the flowcharts may be implemented as computer software programs in accordance with the disclosed embodiments of the present invention. For example, embodiments ...


Abstract

The invention discloses an asynchronous processing method and device based on a neural network accelerator. The method includes: the main thread generates an asynchronous request according to the requirements of the algorithmic business, generates a corresponding handle number, builds a set of command words for the request, and adds it to the command word queue; the queue is then queried, and if it currently holds only this one set of command words, the set is put into the execution state and the hard core of the neural network accelerator is started, otherwise the set is left in the initial state and the call returns. After the hard core finishes executing, it raises a hardware interrupt, which starts a soft interrupt; the soft interrupt queries the command word queue, and if a set of command words is present, it is taken from the queue and checked to see whether it is in the initial state; if so, it is set to the execution state and the hard core is started with those command words. The main thread queries, by handle number, whether the hard core has completed the operation. The above steps realize an asynchronous operation flow for multiple network algorithms and handle asynchrony through hard-core interrupts, which accelerates network performance and reduces processor bandwidth occupation.
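
To make the flow in the abstract concrete, here is a small self-contained C sketch under the assumption of a simple in-memory ring of command-word sets: the main thread enqueues a set and starts the hard core only when that set is the sole pending one, and the soft interrupt raised after the hard core's hardware interrupt retires the finished set and starts the next set that is still in the initial state. The names npu_start, submit_async, and npu_softirq and the queue bookkeeping are illustrative assumptions, not taken from the patent.

```c
#include <stdint.h>
#include <stdio.h>

enum cmd_state { CMD_INITIAL = 0, CMD_EXECUTING = 1, CMD_DONE = 2 };

struct cmd_word_set { uint32_t handle; uint32_t state; };

/* Tiny in-memory stand-in for the shared command-word queue. */
#define DEPTH 16
static struct cmd_word_set queue[DEPTH];
static int q_head, q_tail, q_count;

/* Stand-in for kicking the neural network accelerator hard core. */
static void npu_start(const struct cmd_word_set *c)
{
    printf("hard core started for handle %u\n", (unsigned)c->handle);
}

/* Main thread: build the command words and join the queue; if this is the
 * only pending set, mark it EXECUTING and start the hard core, otherwise
 * leave it in the INITIAL state and return.                               */
static uint32_t submit_async(uint32_t handle)
{
    struct cmd_word_set *c = &queue[q_tail];
    q_tail = (q_tail + 1) % DEPTH;
    q_count++;

    c->handle = handle;
    c->state  = CMD_INITIAL;

    if (q_count == 1) {
        c->state = CMD_EXECUTING;
        npu_start(c);
    }
    return handle;               /* the main thread later polls this handle */
}

/* Soft interrupt raised after the hard core's hardware interrupt: retire
 * the finished set, then start the next set if it is still INITIAL.       */
static void npu_softirq(void)
{
    queue[q_head].state = CMD_DONE;
    q_head = (q_head + 1) % DEPTH;
    q_count--;

    if (q_count > 0 && queue[q_head].state == CMD_INITIAL) {
        queue[q_head].state = CMD_EXECUTING;
        npu_start(&queue[q_head]);
    }
}

int main(void)
{
    submit_async(1);    /* queue has one set: hard core starts immediately */
    submit_async(2);    /* second set waits in the INITIAL state           */
    npu_softirq();      /* hard core finished #1, dispatcher starts #2     */
    npu_softirq();      /* hard core finished #2                           */
    return 0;
}
```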

Description

Technical Field

[0001] The invention relates to the technical field of deep learning, and in particular to a processing method for a neural network accelerator, namely an asynchronous processing method and device based on a neural network accelerator.

Background Art

[0002] With the development of artificial intelligence, deep learning network designs are becoming increasingly complex, and without neural network accelerator hardware they cannot meet the performance requirements of embedded systems. In the security field, the main detection targets are people, cars, and objects, which must also be tracked while they are being detected, so multiple deep learning algorithms must be combined to solve the problem of detecting and tracking people. When there are multiple people, each must be tracked individually, and a feature matching strategy is then used to ensure that the same person is tracked; the existing invention patent with publicat...

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC (8): G06F 9/48; G06N 3/063
CPC: G06F 9/4812; G06F 9/4881; G06N 3/063
Inventors: 吴春选, 朱旭东
Owner: 浙江芯劢微电子股份有限公司