
Heterogeneous GPU distribution system and method for multiple deep learning tasks in distributed environment

A heterogeneous GPU allocation technology for multiple deep learning tasks in a distributed environment. It solves the problems that traditional allocation does not consider task characteristics and requirements, that GPU utilization is low, and that the execution efficiency of multiple deep learning tasks is low; the effects are saving the time spent waiting for results and improving execution efficiency.

Pending Publication Date: 2022-07-29
ZHEJIANG LAB

AI Technical Summary

Problems solved by technology

[0004] At present, the traditional GPU allocation method of deep learning training frameworks is to statically specify GPU parameters when starting multiple tasks in a distributed environment: the GPU selection parameters provided by the framework are used to schedule tasks with different requirements onto the corresponding GPUs for deep learning training. The frameworks also provide an allocation mode that uses all available GPUs, in which the batch data of every task is split across all GPUs for training. Because a GPU with strong computing power finishes its small share of the split batch quickly, this allocation scheme leaves the powerful GPUs idle for long periods, and the consequence is that the utilization of the GPUs with strong computing power is low.
[0005] Because the traditional GPU allocation scheme for multiple deep learning tasks in a distributed environment considers neither the characteristics nor the requirements of the tasks, and does not make full use of heterogeneous GPU performance to run different deep learning training tasks concurrently, the overall execution efficiency of multiple deep learning tasks is low.
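The idle-GPU effect described in [0004] can be made concrete with a small back-of-the-envelope calculation. The throughput numbers below are illustrative assumptions, not figures from the patent: they show that an even split of each batch across heterogeneous GPUs is bottlenecked by the slowest device, while a speed-aware split is not.

```python
# Illustrative sketch (not from the patent): splitting every batch evenly
# across heterogeneous GPUs leaves the fast GPU idle.
# Assumed figures: GPU A processes 200 samples/s, GPU B processes 100 samples/s.

def step_time_even_split(batch, speeds):
    """Each GPU gets an equal shard; the step ends when the slowest finishes."""
    shard = batch / len(speeds)
    return max(shard / s for s in speeds)

def step_time_proportional(batch, speeds):
    """Shards sized proportionally to GPU speed: no GPU waits on another."""
    return batch / sum(speeds)

batch = 400
speeds = [200, 100]  # samples/s for GPU A and GPU B

even = step_time_even_split(batch, speeds)    # 2.0 s: B is the bottleneck
prop = step_time_proportional(batch, speeds)  # ~1.33 s

# GPU A finishes its even shard in 1.0 s, then idles 1.0 s -> 50% utilization.
util_a_even = (batch / 2 / speeds[0]) / even
print(f"even split step: {even:.2f}s, GPU A utilization: {util_a_even:.0%}")
print(f"speed-aware split step: {prop:.2f}s")
```

Under these assumed speeds the even split wastes half of the fast GPU's capacity, which is exactly the low-utilization problem paragraph [0004] attributes to the all-available-GPU allocation mode.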



Examples


Embodiment

[0062] Taking into account the current heterogeneous GPU configuration, whether each GPU's memory capacity can hold the batch data of multiple tasks at the same time, and the deep learning training efficiency of each task, a heterogeneous GPU allocation method for multiple deep learning tasks in a distributed environment includes the following steps:
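The allocation idea in [0062] can be sketched as a simple greedy matcher. This is a minimal illustration under assumed data, not the patented algorithm itself: all task and GPU fields, names, and numbers are hypothetical. Demanding tasks are placed first, each on an idle, high-compute GPU whose free memory can hold the task's batch data.

```python
# Hedged sketch of the allocation idea in [0062]: match the most demanding
# tasks to the strongest GPUs whose free memory fits their batch data.
# All fields and numbers are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class GPU:
    name: str
    compute: float        # relative compute capability (profiled)
    free_mem_gb: float    # free device memory
    tasks: list = field(default_factory=list)

@dataclass
class Task:
    name: str
    batch_mem_gb: float   # memory footprint of one batch
    complexity: float     # model complexity / expected training cost

def allocate(tasks, gpus):
    """Greedy: heaviest tasks first, each to an unloaded GPU that fits,
    preferring fewer already-assigned tasks, then higher compute."""
    for task in sorted(tasks, key=lambda t: t.complexity, reverse=True):
        candidates = [g for g in gpus if g.free_mem_gb >= task.batch_mem_gb]
        if not candidates:
            raise RuntimeError(f"no GPU can hold the batch of {task.name}")
        best = max(candidates, key=lambda g: (-len(g.tasks), g.compute))
        best.tasks.append(task.name)
        best.free_mem_gb -= task.batch_mem_gb
    return {g.name: g.tasks for g in gpus}

gpus = [GPU("A100", compute=3.0, free_mem_gb=40),
        GPU("T4",   compute=1.0, free_mem_gb=16)]
tasks = [Task("resnet152", batch_mem_gb=24, complexity=9.0),
         Task("mnist_mlp", batch_mem_gb=2,  complexity=1.0)]
print(allocate(tasks, gpus))
# → {'A100': ['resnet152'], 'T4': ['mnist_mlp']}
```

The heavy model lands on the strongest GPU with sufficient memory, and the light task is routed to the otherwise idle GPU, mirroring the matching described in the abstract.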

[0063] S1: initialization of multiple deep learning training tasks. Figure 2 shows the data flow of multi-task deep learning training in a heterogeneous environment; the heterogeneity addressed by the present invention lies only in the GPUs. Because the time to transfer the raw data from underlying storage to the DRAM cache of the heterogeneous environment is roughly the same for all nodes, the present invention focuses on how the batch data in the memory cache is best distributed to the corresponding GPUs. When multi-task deep learning is initialized, the GPU Profile module collects ...
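The patent does not specify how the GPU Profile module gathers its data, but one plausible sketch is parsing per-GPU memory figures in the CSV format produced by `nvidia-smi --query-gpu=index,name,memory.total,memory.free --format=csv,noheader,nounits`. The sample text below is hard-coded so the parser can be shown without real hardware; the field names are assumptions.

```python
# Hedged sketch of what a "GPU Profile" step might collect. SAMPLE mimics the
# output format of:
#   nvidia-smi --query-gpu=index,name,memory.total,memory.free \
#              --format=csv,noheader,nounits
# so the parsing can be demonstrated without a GPU present.

import csv
import io

SAMPLE = """\
0, NVIDIA A100-SXM4-40GB, 40960, 40448
1, Tesla T4, 15360, 15104
"""

def parse_gpu_profile(text):
    """Turn the CSV rows into per-GPU profile dictionaries."""
    profiles = []
    for row in csv.reader(io.StringIO(text)):
        idx, name, total, free = (c.strip() for c in row)
        profiles.append({"index": int(idx), "name": name,
                         "total_mib": int(total), "free_mib": int(free)})
    return profiles

profiles = parse_gpu_profile(SAMPLE)
# A later GPU-selection step could rank GPUs, e.g. by free memory:
ranked = sorted(profiles, key=lambda p: p["free_mib"], reverse=True)
print([p["name"] for p in ranked])
# → ['NVIDIA A100-SXM4-40GB', 'Tesla T4']
```

A profile table like this is exactly the input the allocation step in [0062] needs: per-GPU memory headroom against which each task's batch footprint can be checked.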



Abstract

The invention belongs to the field of deep learning under artificial intelligence and discloses a heterogeneous GPU allocation system and method for multiple deep learning tasks in a distributed environment. The system comprises a GPU Profile module, a task information collection module, a GPU selection module, and a deep learning training module. The heterogeneous GPU allocation method assigns GPUs with different computing capabilities to tasks with corresponding requirements: tasks with complex model structures and large batch data volumes are matched to the best-performing GPUs on nodes whose video memory is sufficient to hold them, so tasks that need longer deep learning training are accelerated and the multi-task execution efficiency in a heterogeneous environment is markedly improved. When multiple deep learning tasks execute concurrently, they complete faster as a whole, saving the time a programmer or user spends waiting for results.

Description

Technical field

[0001] The invention belongs to the field of deep learning under artificial intelligence, and in particular relates to a heterogeneous GPU allocation system and method for multiple deep learning tasks in a distributed environment.

Background technique

[0002] Today, deep neural networks are trained on large-scale data to obtain highly accurate models, which drives their continued application in fields such as image classification, speech recognition, and autonomous driving. These trends have produced increasingly complex deep neural network models and prompted the emergence of devices that accelerate deep neural network training, such as GPUs, FPGAs, and TPUs. How to use heterogeneous acceleration devices in a distributed environment more efficiently has therefore gradually become an important and active problem.

[0003] It has gradually become a common phenomenon for multiple tasks to perform deep learning training concurrently in a distr...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06T1/20; G06N3/04; G06N3/08
CPC: G06T1/20; G06N3/08; G06N3/045
Inventors: 周方, 何水兵, 秦亦, 朱春节, 方启明, 曾令仿
Owner: ZHEJIANG LAB